In the previous post, we installed OpenCV* for Python*, detected faces via the webcam, and saved data for later analysis. Face detection can be part of a spaceship's AI, but what if the passengers are not looking into the camera?
Assuming the spacecraft is equipped with artificial gravity, passengers will be positioned upright most of the time, allowing use of a convenient detector built into OpenCV. To count the maximum number of astronauts present in a scene at a given time, we will use Histograms of Oriented Gradients (HoG), developed by Dalal and Triggs. We will also use a machine learning algorithm known as a support vector machine (SVM) to classify the features extracted by HoG.
For a complete description of HoG parameters with examples, check out HOG detectMultiScale parameters explained.
Detecting people or astronauts with HoG. Gradient image from paper by Dalal and Triggs. Astronaut image courtesy of nasa.gov.
Create a Dataset
To develop our algorithm, we need a dataset. The Max-Planck Institute provides a small sample sequence of images of pedestrians which you can download here (26 MB). Other videos can easily be found online. Downloading videos is easy with Youtube-dl. Unix* users can install youtube-dl:
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
Windows* users can install it with pip install --upgrade youtube-dl (sudo is not needed on Windows). Then run youtube-dl [youtube-url] to download the video.
Extract images from the videos using OpenCV's VideoCapture method as shown in this gist. Or better yet, create a dataset from a webcam using VideoWriter to record videos. In your script, after imports, define the codec and create a VideoWriter object:
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))
After capturing each frame, save it to the video by adding out.write(frame). Finally, after the while loop completes, release the capture with cap.release() and the writer with out.release(). Read more about it in OpenCV's guide to videos.
Implement HoG with SVM
To reduce the number of false positives detected, we additionally apply non-maximum suppression (NMS). NMS merges multiple overlapping boxes into a single bounding box. See the comparison below of people detection without and with NMS:
Note that the number of false positives (extra boxes) is reduced by NMS; however, the number of false negatives (undetected people) increases. The detector can be tuned by adjusting the NMS overlap threshold parameter (between 0 and 1.0) until the correct number of people is detected.
We can track the positions of people present for a variety of space travel purposes.
Note: Algorithms like HoG can be optimized for many CPUs, GPUs, FPGAs, and VPUs (e.g., the Intel® Movidius™ Neural Compute Stick) on a variety of platforms by using the Intel® Distribution of OpenVINO™ toolkit.
Learn more: Check out Adrian Rosebrock's informative introduction to implementing HoG for images.
To go further, try these exercises:
- Track the number of people in each part of the scene (Basic)
- Use OpenCV's optical flow methods to analyze the overall motion in a sequence (Intermediate)
- Predict the trajectories of people in a scene based on their motion (Advanced)
In Part 3 of this tutorial series you will learn how to implement a deep neural network designed to estimate the posture of multiple people in an image using the Intel® AI DevCloud.
Other Parts of This Series
How Can a Spacecraft Use AI?
If you missed Part 1: Computer Vision Introduction and Face Detection you can read it here.
About our Author:
Master's student and research assistant, University of Osnabrueck, Germany
Justin is an AI Master thesis worker at Peltarion researching deep learning model introspection. He develops AI software as an Intel Software Innovator and demos his projects at Intel’s booths at NIPS, ICML, and CVPR. He previously worked as a neuroscientist in the US.
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.