Human Detection System Using RaspberryPi, Thermal Camera and Machine Learning

Triggering reliable events based on the presence of people has been the dream of many geeks and DIY automators for a while. Having your house turn the lights on or off when you enter or exit your living room is an interesting application, for instance. Most of the solutions out there for these kinds of problems, even high-end ones like the Philips Hue sensors, detect motion, not the actual presence of people, which means that the lights will switch off once you lie on your couch like a sloth.

The ability to turn off the music and/or the TV when you leave the room and head to your bedroom, without the hassle of switching everything off by hand, is also an interesting corollary. Detecting the presence of people in your room while you’re not at home is another interesting application.

Thermal cameras coupled with deep neural networks are a much more robust strategy to actually detect the presence of people. Unlike motion sensors, they will detect people even when they aren’t moving. And, unlike optical cameras, they detect bodies by measuring the heat that they emit in the form of infrared radiation, so their sensitivity doesn’t depend on lighting conditions, on the position of the target, or on its colour.

Before exploring the thermal camera solution, I tried for a while to build a model that instead relied on optical images from a traditional webcam. The differences are staggering: I trained the optical model on more than ten thousand 640x480 images taken over a week in different lighting conditions, while I trained the thermal camera model on a dataset of 900 24x32 images taken during a single day.

Even with more complex network architectures, the optical model wouldn’t score above 91% accuracy in detecting the presence of people, while the thermal model would achieve around 99% accuracy within a single training phase of a simpler neural network.

Despite the high potential, there’s not much out there on the market: there’s been some research work on the topic (if you google “people detection thermal camera” you’ll mostly find research papers) and a few high-end and expensive products for professional surveillance. Lacking ready-to-go solutions for my house, I decided to build my own, making sure that it can easily be replicated by anyone.

Hardware

A RaspberryPi (cost: around $35). In theory any model should work, but it’s probably not a good idea to use a single-core RaspberryPi Zero for machine learning tasks. The task itself is not very expensive (we’ll only use the Raspberry to run predictions on a trained model, not to train it), but it may still suffer some latency on a Zero. Any more powerful model should do the job well.
A thermal camera. For this project, I’ve used the MLX90640 Pimoroni breakout camera (cost: $55), as it’s relatively cheap, easy to install, and it provides good results. This camera comes in standard (55°) and wide-angle (110°) versions. I’ve used the wide-angle model as the camera monitors a large living room, but take into account that both have the same resolution (32x24 pixels), so the wider angle comes at the cost of a lower spatial resolution. If you want to use a different thermal camera there’s not much you’ll need to change, as long as it comes with a software interface for RaspberryPi and it’s compatible with platypush.
If you use a breakout camera, I’d advise installing it on something like the Breakout Garden (cost: $10-14), as it lets you mount the camera right on top of your RaspberryPi with no need for soldering.
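
To put a number on the trade-off between the two lenses, here is a small sketch that estimates how much of the scene each pixel covers. It assumes that the quoted angles (55° and 110°) are horizontal fields of view spread over the 32-pixel axis of the sensor; the function names and the 3 m distance are only illustrative.

```python
import math

# Assumption: the quoted field of view is horizontal and spans the
# 32-pixel axis of the 32x24 sensor.
H_PIXELS = 32


def degrees_per_pixel(fov_deg: float, pixels: int = H_PIXELS) -> float:
    """Average angle covered by one pixel along the horizontal axis."""
    return fov_deg / pixels


def footprint_per_pixel(fov_deg: float, distance_m: float,
                        pixels: int = H_PIXELS) -> float:
    """Approximate width (in metres) seen by one pixel on a flat wall
    facing the camera at the given distance."""
    scene_width = 2 * distance_m * math.tan(math.radians(fov_deg) / 2)
    return scene_width / pixels


for fov in (55, 110):
    print(f"{fov} deg lens: {degrees_per_pixel(fov):.2f} deg/pixel, "
          f"{footprint_per_pixel(fov, 3.0) * 100:.0f} cm/pixel at 3 m")
```

Under these assumptions the wide-angle lens covers twice the angle per pixel, and more than twice the width on a flat wall, so a person far from the camera will occupy noticeably fewer pixels.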

Setting up the MLX90640 on your RaspberryPi is as easy as pie if you have a Breakout Garden. Fit the Breakout Garden on top of your RaspberryPi. Fit the camera breakout into an I2C slot. Boot the RaspberryPi. Done.

Software

I tested my code on Raspbian, but with a few minor modifications it should be easily adaptable to any distribution installed on the RaspberryPi.

The software support for the thermal camera requires a bit of work. The MLX90640 doesn’t come (yet) with a ready-to-use Python interface, but an open-source C++ driver is provided for it.

Instructions to install it:

# Install the dependencies
[sudo] apt-get install libi2c-dev

# Enable the I2C interface
echo dtparam=i2c_arm=on | sudo tee -a /boot/config.txt

# It's advised to configure the I2C bus baud rate to
# 400kHz to support the higher throughput of the sensor
echo dtparam=i2c1_baudrate=400000 | sudo tee -a /boot/config.txt

# A reboot is required here if you didn't have the
# options above enabled in your /boot/config.txt
[sudo] reboot

# Clone the driver's codebase
git clone https://github.com/pimoroni/mlx90640-library
cd mlx90640-library

# Compile the rawrgb example
make clean
make I2C_MODE=LINUX examples/rawrgb

If it all went well you should see an executable named rawrgb under the examples directory. If you run it you should see a bunch of binary data: that’s the raw binary representation of the frames captured by the camera.
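
If you’re curious about what that binary stream contains, the sketch below splits it into individual frames. It rests on assumptions that aren’t stated in this post: that each frame is 32x24 pixels with 3 bytes per pixel (R, G, B), written row by row with no header. The names are only illustrative, and platypush takes care of this for you, so treat it purely as an illustration.

```python
import io
from typing import BinaryIO, Iterator, List, Tuple

# Assumptions (not stated in the article): each frame is 32x24 pixels,
# 3 bytes per pixel (R, G, B), written row by row with no header.
WIDTH, HEIGHT, CHANNELS = 32, 24, 3
FRAME_SIZE = WIDTH * HEIGHT * CHANNELS  # 2304 bytes per frame

Pixel = Tuple[int, ...]


def read_frames(stream: BinaryIO) -> Iterator[List[List[Pixel]]]:
    """Yield frames as HEIGHT rows of WIDTH (r, g, b) tuples.

    Stops at the end of the stream; a trailing partial frame is discarded.
    """
    while True:
        raw = stream.read(FRAME_SIZE)
        if len(raw) < FRAME_SIZE:
            return
        pixels = [tuple(raw[i:i + CHANNELS])
                  for i in range(0, FRAME_SIZE, CHANNELS)]
        yield [pixels[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]


# Demo on synthetic data: two full frames plus a truncated third one
fake = io.BytesIO(bytes([10, 20, 30]) * (WIDTH * HEIGHT * 2) + b"\x00" * 5)
frames = list(read_frames(fake))
print(len(frames), len(frames[0]), len(frames[0][0]), frames[0][0][0])
```

With the real executable you would pass the standard output of the rawrgb process (for instance a pipe opened through subprocess) instead of the synthetic buffer.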

Remember where it is located or move it to a custom bin folder, as it’s the executable that platypush will use to interact with the camera module.

This post assumes that you have already installed and configured platypush on your system. If not, head to my post on getting started with platypush, the readthedocs page, the GitHub page or the wiki.

You’ll need the following Python dependencies on the RaspberryPi as well:

# For machine learning image predictions
pip install opencv-contrib-python

# For image manipulation in the MLX90640 plugin
pip install Pillow

In this example we’ll use the RaspberryPi for the capture and prediction phases and a more powerful machine for the training phase. You’ll need the following dependencies on the machine you’ll be using to train your model:

# For image manipulation
pip install opencv-python

# Install Jupyter notebook to run the training code
pip install jupyterlab

# Then follow the instructions at https://jupyter.org/install

# Tensorflow framework for machine learning and utilities
pip install tensorflow numpy matplotlib

# Clone my repository with the image and training utilities
# and the Jupyter notebooks that we'll use for training
git clone https://github.com/BlackLight/imgdetect-utils

Capturing phase

Now that you’ve got all the hardware and software in place, it’s time to start capturing frames with your camera and use them to train your model. First, configure the MLX90640 plugin in your platypush configuration file:

camera.ir.mlx90640:
    fps: 16      # Frames per second
    rotate: 270  # Can be 0, 90, 180, 270
    rawrgb_path: /path/to/your/rawrgb

Restart platypush. If you enabled the HTTP backend you can test if you are able to take pictures:

curl -XPOST -H 'Content-Type: application/json' \
     -d '{"type":"request", "action":"camera.ir.mlx90640.capture", "args": {"output_file":"~/snap.png", "scale_factor":20}}' \
      'http://localhost:8008/execute?token=...'

The thermal picture should have been stored under ~/snap.png.
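
If you’d rather trigger the capture from a script than from curl, the same request can be built with the Python standard library. This is a minimal sketch: build_capture_request and send_request are illustrative helper names, not part of platypush, and the token is left elided exactly as in the curl example.

```python
import json
import urllib.request


def build_capture_request(output_file: str, scale_factor: int = 20) -> dict:
    """Build the same JSON payload that the curl command above sends."""
    return {
        "type": "request",
        "action": "camera.ir.mlx90640.capture",
        "args": {"output_file": output_file, "scale_factor": scale_factor},
    }


def send_request(payload: dict, url: str) -> bytes:
    """POST the payload to the platypush HTTP backend and return the raw
    response body. The URL (including the token) is supplied by the caller."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()


payload = build_capture_request("~/snap.png")
print(json.dumps(payload, indent=2))

# Uncomment on a machine where platypush is running (token elided on purpose):
# send_request(payload, "http://localhost:8008/execute?token=...")
```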

In my case it looks like this while I’m in front of the sensor:

Notice the glow at the bottom-right corner: that’s actually the heat from my RaspberryPi 4 CPU. It’s there in all the images I take, and you will probably see similar results if you mounted your camera on top of the Raspberry itself, but it shouldn’t be an issue for your model training purposes.

If you open the webpanel (http://your-host:8008) you’ll also notice a new tab, represented by the sun icon, that you can use to monitor your camera from a web interface.

You can also monitor the camera directly outside of the webpanel by pointing your browser to

http://your-host:8008/camera/ir/mlx90640/stream?rotate=270&scale_factor=20

Now add a cronjob to your platypush configuration to take snapshots every minute:

cron.ThermalCameraSnapshotCron:
    cron_expression: '* * * * *'
    actions:
        -
            action: camera.ir.mlx90640.capture
            args:
                output_file: "${__import__('datetime').datetime.now().strftime('/img/folder/%Y-%m-%d_%H-%M-%S.jpg')}"
                grayscale: true

The images will be stored under /img/folder in the format YYYY-mm-dd_HH-MM-SS.jpg. No scale factor is applied: even though the images will be tiny, we only need them to train our model.
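
Since the capture time is encoded in the file name, it’s easy to recover it later, for example to check for gaps in the captures or to group samples by time of day. A minimal sketch, using the same strftime pattern as the cron configuration above (the helper names and the sample date are arbitrary):

```python
from datetime import datetime
from pathlib import Path

# Same strftime pattern as in the cron configuration above
FILENAME_FORMAT = "%Y-%m-%d_%H-%M-%S.jpg"


def snapshot_path(folder: str, when: datetime) -> Path:
    """Path that a snapshot taken at `when` would be stored under."""
    return Path(folder) / when.strftime(FILENAME_FORMAT)


def snapshot_time(path: str) -> datetime:
    """Recover the capture time from a snapshot file name."""
    return datetime.strptime(Path(path).name, FILENAME_FORMAT)


ts = datetime(2019, 11, 3, 18, 45, 0)  # arbitrary sample timestamp
path = snapshot_path("/img/folder", ts)
print(path.name)
print(snapshot_time(str(path)))
```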

Also, we’ll convert the images to grayscale — the neural network will be lighter and actually more accurate, as it will only have to rely on one variable per pixel without being tricked by RGB combinations.

Restart platypush and verify that every minute a new picture is created under your images directory. Let it run for a few hours or days until you’re happy with the number of samples. Try to balance the numbers of pictures with no people in the room and those with people in the room, trying to cover as many cases as possible — e.g. sitting, standing in different points of the room etc.

As I mentioned earlier, in my case I needed fewer than 1000 pictures with enough variety to achieve accuracy levels above 99%.

Labelling phase

Once you’re happy with the number of samples you’ve taken, copy the images over to the machine you’ll be using to train your model (they should be all small JPEG files weighing under 500 bytes each).

Copy them to the folder where you have cloned my imgdetect-utils repository:

BASEDIR=~/git_tree/imgdetect-utils

# This directory will contain your raw images
IMGDIR=$BASEDIR/datasets/ir/images

# This directory will contain the raw numpy training
# data parsed from the images
DATADIR=$BASEDIR/datasets/ir/data
mkdir -p $IMGDIR
mkdir -p $DATADIR

# Copy the images
scp 'pi@raspberry:/img/folder/*.jpg' "$IMGDIR"

# Create the labels for the images. Each label is a
# directory under $IMGDIR
mkdir $IMGDIR/negative
mkdir $IMGDIR/positive

Once the images have been copied and the directories for the labels created, run the label.py script provided in the repository to interactively label the images:

cd $BASEDIR
python utils/label.py -d $IMGDIR --scale-factor 10

Each image will open in a new window and you can label it by typing either 1 (negative) or 2 (positive):

At the end of the procedure the negative and positive directories under the images directory should have been populated.
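
Before moving on, it’s worth checking how balanced the two classes are. The sketch below is not part of the imgdetect-utils repository; it simply counts the .jpg files under each label directory, and the demo runs on a throwaway directory so that it works anywhere.

```python
import tempfile
from pathlib import Path
from typing import Dict


def count_labels(img_dir: Path) -> Dict[str, int]:
    """Count the .jpg files in each label directory under img_dir."""
    return {
        d.name: sum(1 for f in d.iterdir() if f.suffix.lower() == ".jpg")
        for d in sorted(img_dir.iterdir())
        if d.is_dir()
    }


def balance_ratio(counts: Dict[str, int]) -> float:
    """Smallest class size divided by the largest (1.0 = perfectly balanced)."""
    sizes = list(counts.values())
    return min(sizes) / max(sizes) if sizes and max(sizes) else 0.0


# Demo on a throwaway directory; point img_dir at your $IMGDIR for real data
with tempfile.TemporaryDirectory() as tmp:
    img_dir = Path(tmp)
    for label, n in (("negative", 6), ("positive", 4)):
        (img_dir / label).mkdir()
        for i in range(n):
            (img_dir / label / f"{i}.jpg").touch()
    counts = count_labels(img_dir)
    ratio = balance_ratio(counts)
    print(counts, round(ratio, 2))
```

A ratio far below 1 suggests capturing a few more samples of the under-represented class before training.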

Training phase

Once we’ve got all the labelled images it’s time to train our model. A train.ipynb Jupyter notebook is provided under notebooks/ir and it should be relatively self-explanatory.
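
The notebook itself isn’t reproduced in this text. To give a rough idea of what training a vanilla network on frames of this size can look like, here is a minimal Keras sketch. It is not the code from the notebook: the layer sizes, the 0..255 grayscale input range and the random stand-in data are all assumptions made for illustration, and it doesn’t cover exporting the ir.pb file.

```python
import numpy as np
import tensorflow as tf

# Assumptions: grayscale frames of 24x32 pixels with values in 0..255,
# and two classes (0 = negative, 1 = positive). Layer sizes are arbitrary.
INPUT_SHAPE = (24, 32)


def build_model(hidden_units: int = 64) -> tf.keras.Model:
    """A small fully connected ("vanilla") classifier."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=INPUT_SHAPE),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


# Random stand-in data, only to show the expected shapes; real training
# would load the labelled images instead
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(32, *INPUT_SHAPE)).astype("float32") / 255.0
y = rng.integers(0, 2, size=(32,))

model = build_model()
model.fit(x, y, epochs=1, batch_size=8, verbose=0)
probs = model.predict(x[:1], verbose=0)
print(probs.shape)
```

The output for each frame is a pair of probabilities (negative, positive); on random data they are of course meaningless.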

If you managed to execute the whole notebook correctly you’ll have a file named ir.pb under models/ir/tensorflow. That’s your Tensorflow model file; you can now copy it over to the RaspberryPi and use it to do predictions:

scp $BASEDIR/models/ir/tensorflow/ir.pb pi@raspberry:/home/pi/models

Detect people in the room

Replace the content of the ThermalCameraSnapshotCron we previously created with logic that takes pictures at scheduled intervals and uses the model we have just trained to predict whether there are people in the room, using the platypush ml.cv plugin.

You can implement whichever logic you like in procedure.people_detected and procedure.no_people_detected. These procedures will only be invoked when there is a status change from the previous observation.

For example, a simple rule to turn your lights on or off when someone enters or exits the room:

procedure.sync.people_detected:
    - action: light.hue.on

procedure.sync.no_people_detected:
    - action: light.hue.off
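
The “only on status change” behaviour mentioned above is worth spelling out, since it’s what prevents the lights from being switched on again at every prediction. The sketch below shows the idea in plain Python. It isn’t platypush code: PresenceTracker is a hypothetical name, and firing a callback on the very first observation is my own assumption.

```python
from typing import Callable, List, Optional


class PresenceTracker:
    """Invoke a callback only when the observed state changes."""

    def __init__(self, on_present: Callable[[], None],
                 on_absent: Callable[[], None]) -> None:
        self._on_present = on_present
        self._on_absent = on_absent
        self._last: Optional[bool] = None  # unknown before the first observation

    def update(self, present: bool) -> bool:
        """Record an observation; return True if a callback was fired."""
        if present == self._last:
            return False
        self._last = present
        (self._on_present if present else self._on_absent)()
        return True


events: List[str] = []
tracker = PresenceTracker(lambda: events.append("people_detected"),
                          lambda: events.append("no_people_detected"))

# Six consecutive predictions, but only three state changes
for observation in [False, False, True, True, True, False]:
    tracker.update(observation)
print(events)
```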

What’s next?

That’s your call! Feel free to experiment with more elaborate rules, for example to change the status of the music/video playing in the room when someone enters, using platypush media plugins. Or play a custom good morning message when you first enter the room in the morning. Or build your own surveillance system to track the presence of people when you’re not at home. Or enhance the model to also detect the number of people in the room, not only their presence.

Or you can combine it with an optical flow sensor, distance sensor, laser range sensor or optical camera (platypush provides plugins for some of them) to build an even more robust system that also detects and tracks movements or proximity to the sensor, and so on.

Also, we used a vanilla neural network in this example, given the small size (24x32) of our samples and the fact that in most of the cases detecting a source of heat in an infrared camera image is a relatively easy task.

You can however go full-on with a convolutional neural network (CNN) to detect more nuances, and that may definitely help more if you decide to use optical camera images.

Previously published at https://towardsdatascience.com/detecting-people-with-a-raspberrypi-a-thermal-camera-and-machine-learning-376d3bbcd45c