How Smart Baby Monitors Work & 10 Things they Detect

Sandra W Bullock

Are you curious about how smart baby monitors work to make your baby safer with superior tracking technologies? In this article, I delve into the fascinating world of smart baby monitors. Discover how these innovative devices utilize cutting-edge technologies like Artificial Intelligence, Machine Learning, and Deep Learning.

By harnessing the power of these algorithms, these monitors enhance your baby’s safety while minimizing the need for intrusive wearable tech. Join me as we explore the intersection of technology and childcare in this enlightening piece.

10 Things Smart Baby Monitors Can Detect:

These recent innovations use AI and image-processing techniques to automatically identify unsafe situations a baby may be in. A smart monitoring device can detect various conditions, including:

  1. Face coverings that pose a suffocation hazard
  2. Unsafe sleep positions, such as sleeping on the stomach
  3. Clothing that could be hazardous to the baby
  4. Sleep patterns, measured as sleep times and awake periods
  5. Dangerous volatile organic compounds (VOCs)
  6. Room temperature and humidity outside safe ranges
  7. Breathing, with an alert when no breathing is detected
  8. Oxygen saturation levels
  9. Heart rate
  10. Cry detection

How Smart Baby Monitors Work:

Some of the latest monitoring cameras use AI-powered technologies, such as the computer vision in the Nanit Pro and the sensor fusion in the Miku. This automatic interpretation of the data captured by baby monitor cameras is quite recent; most traditional monitors only relayed audio and video, with limited sensing capabilities.

A 2021 study by Tariq Khan highlights how intelligent baby monitors using deep learning models can determine sleep poses and perform face and landmark detection, among other capabilities.

Deep learning models are cutting-edge technologies that have evolved since the 2010s, building upon the foundations of machine learning algorithms dating back to the 1980s. It is worth noting that Artificial Intelligence, the overarching field that gave rise to machine learning and subsequently deep learning, has its roots as far back as the 1950s.

The chart below by Nvidia shows the progress from early AI to current deep-learning models.

Chart: deep learning and AI for baby safety with baby monitors

Deep learning breakthroughs have enabled the use of Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) models in smart baby monitors. These advanced neural network algorithms can efficiently process vast amounts of visual data captured by the monitor cameras, resulting in accurate identification of potential hazards to babies.

As an illustration, the baby’s nose can be detected from an image to determine if the face is covered, possibly due to sleeping on the stomach or other factors. Similarly, to detect the removal of a blanket, a smart monitor relies on the visibility of lower body parts such as the knee, ankle, and hip.

Below is sample pseudocode I found in Tariq’s study for a deep learning model that detects face coverings and identifies when a baby’s blanket has been removed.

Pseudocode for deep learning models in smart baby monitors
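
Since that pseudocode is only available as an image, here is a minimal Python sketch of the same idea, assuming a pose model that returns a confidence score for each body keypoint. The keypoint names, thresholds, and function names are illustrative, not taken from any vendor’s actual code.

```python
# Illustrative rule checks over keypoint confidences from a pose model.
# Keypoint names follow the COCO convention; thresholds are made up for this sketch.

VISIBILITY_THRESHOLD = 0.3  # below this, treat the keypoint as not visible

FACE_PARTS = ["nose", "left_eye", "right_eye"]
LOWER_BODY_PARTS = ["left_hip", "right_hip", "left_knee", "right_knee",
                    "left_ankle", "right_ankle"]


def face_possibly_covered(keypoint_scores: dict) -> bool:
    """Flag a possible face covering when no facial keypoint is confidently visible."""
    return all(keypoint_scores.get(part, 0.0) < VISIBILITY_THRESHOLD
               for part in FACE_PARTS)


def blanket_possibly_removed(keypoint_scores: dict) -> bool:
    """Flag a removed blanket when most lower-body keypoints are visible."""
    visible = sum(keypoint_scores.get(part, 0.0) >= VISIBILITY_THRESHOLD
                  for part in LOWER_BODY_PARTS)
    return visible >= 4  # e.g. both hips plus at least one knee and one ankle


# Example usage with hypothetical scores from a single video frame:
scores = {"nose": 0.05, "left_eye": 0.10, "right_eye": 0.08,
          "left_hip": 0.90, "right_hip": 0.85, "left_knee": 0.70,
          "right_knee": 0.75, "left_ankle": 0.60, "right_ankle": 0.20}
if face_possibly_covered(scores):
    print("ALERT: face may be covered")
if blanket_possibly_removed(scores):
    print("NOTICE: blanket appears to have been removed")
```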

To detect a baby’s sleeping position, smart monitors employ a pose detection system. The deep-learning models behind these cameras are trained on a library of over 250,000 people, with over 56,000 of them having body parts such as the nose, ears, and mouth labeled.

By processing thousands of micro-images, the cameras can provide accurate information about whether the baby is sleeping face down. After cross-referencing with the COCO dataset, the cameras can confidently determine whether they are seeing the back of the baby’s head rather than the face, which makes the identification of the baby’s sleeping position reliable. COCO is an open-source dataset; you can learn more on COCO’s website here.
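
For context, the sketch below shows how COCO-style keypoints can be obtained with an off-the-shelf pretrained model, here torchvision’s Keypoint R-CNN, and how a back-of-head check might be built on top of them. The actual models inside Nanit, Miku, or Cubo Ai are proprietary, and the file name and score threshold are assumptions.

```python
# A sketch of COCO-style keypoint detection with an off-the-shelf torchvision model,
# standing in for the proprietary model a smart monitor actually ships with.
import torch
from torchvision.io import read_image
from torchvision.models.detection import keypointrcnn_resnet50_fpn

# COCO keypoint order used by this model
COCO_KEYPOINTS = ["nose", "left_eye", "right_eye", "left_ear", "right_ear",
                  "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
                  "left_wrist", "right_wrist", "left_hip", "right_hip",
                  "left_knee", "right_knee", "left_ankle", "right_ankle"]

model = keypointrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = read_image("crib_frame.jpg").float() / 255.0  # hypothetical frame from the camera
with torch.no_grad():
    output = model([image])[0]

if len(output["scores"]) > 0 and output["scores"][0] > 0.5:
    kp_scores = output["keypoints_scores"][0]           # one score per keypoint (higher = more confident)
    nose_score = kp_scores[COCO_KEYPOINTS.index("nose")]
    ear_scores = kp_scores[3:5]                          # left_ear, right_ear
    # If the ears score well but the nose does not, we are likely seeing the
    # back or side of the head rather than the face.
    if ear_scores.max() > nose_score:
        print("Likely seeing the back/side of the head, not the face")
```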

To detect when a baby is crying, smart baby monitors rely on advanced audio processing algorithms. Some models also use image processing to detect facial expressions that may indicate the baby is crying or in distress.

These algorithms use machine learning and artificial intelligence techniques to analyze audio signals, identifying patterns indicative of a baby’s cry. The models are trained on thousands of recordings, making them highly accurate at detecting when the baby is crying.
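
As a rough illustration of the audio side, the sketch below summarizes a short clip with standard log-mel features and hands them to a small pre-trained classifier. The feature settings, file names, and classifier are assumptions for the example, not any vendor’s actual pipeline.

```python
# Illustrative cry detection: log-mel features + a tiny classifier.
# Assumes a classifier trained offline and saved as "cry_classifier.joblib";
# the file names and feature settings are made up for this sketch.
import numpy as np
import librosa
import joblib


def extract_features(path: str, sr: int = 16000) -> np.ndarray:
    """Summarize a clip as the mean and std of its log-mel spectrogram bands."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40)
    log_mel = librosa.power_to_db(mel)
    return np.concatenate([log_mel.mean(axis=1), log_mel.std(axis=1)])


clf = joblib.load("cry_classifier.joblib")    # trained offline on labeled recordings
features = extract_features("nursery_clip.wav").reshape(1, -1)
if clf.predict(features)[0] == 1:             # 1 = "crying" in this hypothetical labeling
    print("Cry detected - notify the parent's phone")
```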

Here is a short video demo of a cry-detecting camera.

Another way the deep learning algorithms used by smart monitors detect body parts is with body heat maps. There are 18 heat maps, one for each body part. Part Affinity Fields (PAFs) are then used to associate body parts and infer their connectivity. For example, the nose is connected to the eyes, while the mouth is connected to the ears. Combined with a 3D pose estimation model, these heat maps enable the camera to accurately identify and track body parts in real time.

Some more advanced models even utilize known temperature sensitivity data for different parts of the body. For example, the baby’s head is known to emit more heat compared to other body parts. By analyzing this data, the smart monitor can accurately identify the baby’s head and track its movements.

Part Affinity Fields data used to detect baby body parts with deep learning

See full research here with thermal sensitivity data.
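
To make the heat-map-plus-PAF idea concrete, here is a simplified Python sketch. Random arrays stand in for real network outputs, and the scoring is a bare-bones version of what pose estimation systems such as OpenPose do.

```python
# Simplified heat map + PAF logic: take the peak of each part's heat map as its
# keypoint location, then score a candidate limb by averaging the PAF vectors
# sampled along the line between the two keypoints. Shapes and data are illustrative.
import numpy as np


def heatmap_peak(heatmap: np.ndarray) -> tuple:
    """Return (row, col) of the strongest response in one part's heat map."""
    return np.unravel_index(np.argmax(heatmap), heatmap.shape)


def paf_connection_score(paf_x: np.ndarray, paf_y: np.ndarray,
                         p1: tuple, p2: tuple, num_samples: int = 10) -> float:
    """Average alignment of the PAF with the unit vector from p1 to p2."""
    v = np.array([p2[0] - p1[0], p2[1] - p1[1]], dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:
        return 0.0
    v /= norm
    score = 0.0
    for t in np.linspace(0.0, 1.0, num_samples):
        r = int(round(p1[0] + t * (p2[0] - p1[0])))
        c = int(round(p1[1] + t * (p2[1] - p1[1])))
        score += paf_y[r, c] * v[0] + paf_x[r, c] * v[1]  # dot product with the PAF vector
    return score / num_samples


# Example with random data standing in for real network outputs:
H, W = 64, 64
nose_map, eye_map = np.random.rand(H, W), np.random.rand(H, W)
paf_x, paf_y = np.random.rand(H, W) - 0.5, np.random.rand(H, W) - 0.5
nose, eye = heatmap_peak(nose_map), heatmap_peak(eye_map)
print("nose-eye connection score:", paf_connection_score(paf_x, paf_y, nose, eye))
```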

To detect temperature, humidity and movement, baby monitors use sensors. These devices come in different shapes and sizes but the basic working principle is the same.

Baby monitors commonly use RTD (Resistance Temperature Detector) temperature sensors. RTDs are resistors whose electrical resistance changes predictably with temperature, and their response is reliable, precise, and nearly linear. For example, a standard PT100 RTD has a resistance of 100 Ω at 0°C and about 138.5 Ω at 100°C.
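
Converting such a reading into a temperature is straightforward. Below is a minimal Python sketch using the linear PT100 approximation; the resistance reading and the safe-range check are made up for the example.

```python
# Minimal sketch: convert an RTD resistance reading to temperature using the
# linear approximation R(T) = R0 * (1 + alpha * T). ALPHA is the standard PT100
# coefficient; the reading and range limits are illustrative.

R0 = 100.0        # resistance in ohms at 0 deg C (PT100)
ALPHA = 0.00385   # ohms per ohm per deg C for a standard PT100


def rtd_temperature_c(resistance_ohms: float) -> float:
    """Approximate temperature in deg C from a measured RTD resistance."""
    return (resistance_ohms / R0 - 1.0) / ALPHA


reading = 107.8                       # hypothetical measurement from the nursery sensor
temp_c = rtd_temperature_c(reading)   # about 20.3 deg C
if not 18.0 <= temp_c <= 22.0:        # example configured range, not a medical guideline
    print(f"Room temperature {temp_c:.1f} C is outside the configured range")
```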

In addition to temperature sensors, humidity sensors are also commonly used in baby monitors. These sensors measure the amount of water vapor present in the air and are essential for monitoring the humidity levels in your baby’s room. Humidity sensors can be capacitive, resistive or thermal-based and are highly sensitive to changes in humidity.

To detect volatile organic compounds (VOCs) and other harmful gases, some baby monitors also use gas sensors. These sensors detect the presence of chemicals in the air and alert parents if dangerous levels are present in the room. A common gas sensor in baby monitors is the electrochemical sensor, which uses a chemical reaction to produce an electric current that can be measured and converted into a gas concentration reading.
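
The current-to-concentration step can be sketched as below; the sensitivity and alert threshold are placeholders, not values from any particular sensor’s datasheet.

```python
# Illustrative conversion of an electrochemical gas sensor's output current to a
# concentration, assuming a sensitivity expressed in nA per ppm.

SENSITIVITY_NA_PER_PPM = 50.0   # hypothetical sensor sensitivity
ALERT_THRESHOLD_PPM = 0.5       # hypothetical VOC alert level


def gas_concentration_ppm(output_current_na: float) -> float:
    """Convert a measured output current (nA) to a gas concentration (ppm)."""
    return output_current_na / SENSITIVITY_NA_PER_PPM


current_na = 35.0                         # hypothetical reading
ppm = gas_concentration_ppm(current_na)   # 0.7 ppm
if ppm > ALERT_THRESHOLD_PPM:
    print(f"VOC level {ppm:.2f} ppm exceeds the alert threshold")
```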

To detect breathing, smart monitors such as the Nanit use image processing and machine learning algorithms to track movement in the video footage. Here is how it works: the camera captures footage of the baby’s chest, analyzes it to identify changes in the movement pattern, and, if no movement is detected for a certain period, sends an alert to the parent’s device.
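
A rough version of that logic can be sketched with simple frame differencing over a chest region of interest. The region, thresholds, and video source below are illustrative, not how any specific monitor implements it.

```python
# Motion-based breathing watch: measure frame-to-frame change inside a chest
# region of interest (ROI) and alert when it stays below a threshold for too long.
import cv2
import time

CHEST_ROI = (200, 150, 120, 80)      # x, y, width, height - assumed fixed for the sketch
MOTION_THRESHOLD = 2.0               # mean absolute pixel difference counted as "movement"
NO_MOTION_ALERT_SECONDS = 15

cap = cv2.VideoCapture("nursery_stream.mp4")   # hypothetical camera stream
prev_gray = None
last_motion_time = time.time()

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    x, y, w, h = CHEST_ROI
    gray = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    if prev_gray is not None:
        motion = cv2.absdiff(gray, prev_gray).mean()
        if motion > MOTION_THRESHOLD:
            last_motion_time = time.time()
        elif time.time() - last_motion_time > NO_MOTION_ALERT_SECONDS:
            print("ALERT: no chest movement detected")
    prev_gray = gray

cap.release()
```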

Smart monitors utilize eye landmark images to detect a baby’s awake and sleeping times. Thousands of these images are used to train a Convolutional Neural Network (CNN) that operates in the background of a camera device. By analyzing eye blinking, facial postures, and other cues, the algorithm accurately determines whether the baby is asleep or awake.

Below is a typical flow chart for awake detection in sleep monitoring cameras;

awake detection flow chart in smart monitors

The Eye Aspect Ratio (EAR) is calculated for every captured image to assess the baby’s wakefulness. A larger EAR indicates that the eye is open, while a smaller EAR indicates that it is closed. The average EAR of the left and right eyes gives the final value. In practice, the algorithm measures the distances between specific eyelid landmarks and compares the resulting ratio against a threshold that separates the ‘sleep’ and ‘awake’ states; as Tariq highlighted in his study, that threshold is usually 0.25.

Below is an image of an eye with the various labeled parts that the deep learning algorithm employed by Miku, Nanit, and Cubo Ai use to determine the wakefulness of the baby.

Parts of eyelids used to calculate wakefulness by smart monitors
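
Using the landmarks labeled above, the EAR can be computed as in the sketch below. The six-landmark ordering and the example coordinates are assumptions; the 0.25 threshold is the one cited in Tariq’s study.

```python
# Eye Aspect Ratio (EAR) calculation, assuming six landmarks per eye ordered as in
# the common facial-landmark convention: p1 and p4 are the horizontal corners,
# p2/p3 the upper lid, p6/p5 the lower lid.
import numpy as np

EAR_THRESHOLD = 0.25   # closed-eye threshold cited in Tariq's study


def eye_aspect_ratio(eye: np.ndarray) -> float:
    """eye is a (6, 2) array of landmark (x, y) coordinates ordered p1..p6."""
    vertical_1 = np.linalg.norm(eye[1] - eye[5])   # |p2 - p6|
    vertical_2 = np.linalg.norm(eye[2] - eye[4])   # |p3 - p5|
    horizontal = np.linalg.norm(eye[0] - eye[3])   # |p1 - p4|
    return (vertical_1 + vertical_2) / (2.0 * horizontal)


# Hypothetical landmark coordinates for the left and right eyes of one frame:
left_eye = np.array([[30, 50], [36, 44], [44, 44], [50, 50], [44, 55], [36, 55]])
right_eye = np.array([[70, 50], [76, 45], [84, 45], [90, 50], [84, 56], [76, 56]])

ear = (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye)) / 2.0
state = "awake" if ear > EAR_THRESHOLD else "asleep"
print(f"EAR = {ear:.2f} -> baby appears {state}")
```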

To determine heart rate or oxygen saturation levels, monitors such as the Owlet use pulse oximetry technology. This technology measures a baby’s oxygen levels and heart rate by shining light through the skin to detect changes in blood flow. A sensor is placed on the baby’s foot, or a clip-on device is attached to their clothing, while they sleep.

Pulse oximetry technology utilizes the interaction between hemoglobin and various wavelengths of light. Hemoglobin, the primary protein in red blood cells responsible for transporting oxygen throughout the body, exists in two forms: oxyhemoglobin and deoxyhemoglobin. By leveraging this understanding, pulse oximetry technology provides valuable insights into oxygen saturation levels in the body.

The sensor emits red and infrared light, which passes through soft tissues like skin but is absorbed by hard substances such as bone or teeth. Blood absorbs the two wavelengths differently depending on its saturation level: oxygen-rich blood absorbs more infrared light, while oxygen-poor blood absorbs more red light. By analyzing the ratio between the absorbed red and infrared light, monitors such as the Owlet can determine heart rate and oxygen saturation levels.

Heart rate is related to oxygen saturation as well. When the oxygen levels in the blood decrease, the heart rate increases to pump more oxygen-rich blood throughout the body. Conversely, when oxygen levels increase, the heart rate decreases as it does not need to work as hard to circulate enough oxygen.

Owlet measures beats per minute using a pulse oximeter. A pulse oximeter detects heart rate by analyzing the pulse, the change in blood volume with each heartbeat. It does so with a sensor that emits light and measures how much of it passes through the skin (in a baby’s case, usually at the foot) at different wavelengths.
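
As an illustration of the math behind this, the sketch below estimates heart rate from peaks in a photoplethysmography (PPG) signal and oxygen saturation from the red-to-infrared absorption ratio. The SpO2 formula is a commonly cited textbook approximation, not Owlet’s calibration, and the signals here are synthetic stand-ins for real sensor data.

```python
# Heart rate and SpO2 estimation from raw PPG channels (illustrative only).
import numpy as np
from scipy.signal import find_peaks

SAMPLE_RATE_HZ = 100  # assumed sampling rate of the sensor


def heart_rate_bpm(ppg: np.ndarray) -> float:
    """Estimate beats per minute from peaks in one PPG channel."""
    peaks, _ = find_peaks(ppg, distance=SAMPLE_RATE_HZ * 0.4)  # caps the estimate near 150 bpm
    duration_minutes = len(ppg) / SAMPLE_RATE_HZ / 60.0
    return len(peaks) / duration_minutes


def spo2_percent(red: np.ndarray, infrared: np.ndarray) -> float:
    """Estimate oxygen saturation from the ratio of pulsatile (AC) to baseline (DC) absorption."""
    r = (red.std() / red.mean()) / (infrared.std() / infrared.mean())
    return 110.0 - 25.0 * r   # rough empirical approximation, not a device calibration


# Synthetic 10-second signals standing in for the red and infrared channels:
t = np.arange(0, 10, 1 / SAMPLE_RATE_HZ)
red = 1.0 + 0.013 * np.sin(2 * np.pi * 2.0 * t)       # 2 Hz pulse = 120 bpm
infrared = 1.0 + 0.025 * np.sin(2 * np.pi * 2.0 * t)

print(f"Heart rate: {heart_rate_bpm(red):.0f} bpm, SpO2: {spo2_percent(red, infrared):.0f}%")
```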

On the other hand, advanced monitors like Nanit use image processing to monitor a baby’s breathing. These monitors employ RGB-D Structured Light (SL) cameras, which are sensing systems that capture RGB images and depth information for each pixel. Specifically, Nanit, Miku and Cubo Ai cameras project a known pattern onto the baby, and the distortion in the projected pattern encodes depth information.

The working principle of these cameras is triangulation: depth is calculated from the geometry between the pattern emitter and the sensor. The cameras then analyze the captured depth images to detect movement in the baby’s breathing pattern. This 2014 study has details on how RGB-D cameras can detect respiratory changes.
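
To close, here is a simplified sketch of both ideas: depth from structured-light triangulation and a respiratory rate estimated from the average chest depth over time. The baseline, focal length, and frame rate are assumed values, and the depth signal is synthetic.

```python
# (1) Depth from structured-light triangulation: depth ~ baseline * focal_length / disparity.
# (2) Respiratory rate from a chest-depth time series. All numbers are illustrative.
import numpy as np
from scipy.signal import find_peaks

BASELINE_M = 0.075        # assumed distance between projector and sensor
FOCAL_LENGTH_PX = 580.0   # assumed focal length in pixels


def depth_from_disparity(disparity_px: float) -> float:
    """Depth in meters for a pixel whose projected pattern shifted by `disparity_px`."""
    return BASELINE_M * FOCAL_LENGTH_PX / disparity_px


print(f"Depth at 30 px disparity: {depth_from_disparity(30.0):.2f} m")

# Respiratory rate from a synthetic chest-depth signal sampled at 30 fps
FPS = 30
t = np.arange(0, 60, 1 / FPS)
chest_depth = 1.20 + 0.004 * np.sin(2 * np.pi * (30 / 60) * t)  # 30 breaths per minute

peaks, _ = find_peaks(chest_depth, distance=FPS)  # breaths are well over a second apart
breaths_per_minute = len(peaks) / (len(t) / FPS / 60)
print(f"Estimated respiratory rate: {breaths_per_minute:.0f} breaths/min")
```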