With the Intel® Distribution of OpenVINO™ toolkit, the Intel® Neural Compute Stick 2 (Intel® NCS 2) lets deep learning developers profile, tune, and deploy convolutional neural networks (CNNs) in low-power applications that require real-time inferencing. It supports rapid development of high-performance computer vision solutions and fast, efficient deep learning workloads on Intel® platforms.
The Intel® NCS 2 is a USB stick with a dedicated neural network inference accelerator. With the Intel® Distribution of OpenVINO™ toolkit, the Intel® NCS 2 is aimed at the following:
- Easing throughput or affordability challenges
- Enabling the next wave of innovation
The Intel® Distribution of OpenVINO™ toolkit supports Half Precision Floating Point (FP16).
Half-precision floating point (FP16)
Small, compact hardware form factors for running computer vision applications have begun to emerge.
Intel has released the USB-stick-based Intel® NCS 2, which is essentially a Vision Processing Unit (VPU).
While CPUs, GPUs, and APIs support single-precision (FP32) instructions natively, the extra precision of this representation does not necessarily yield notably higher classification accuracy than half precision (FP16). FP16, on the other hand, cuts the number of bits required for storage in half, reducing the exponent from 8 bits to 5 and the mantissa from 23 bits to 10.
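The bit layout described above can be inspected directly. The following is a minimal sketch, assuming Python 3.6 or later, where the standard `struct` module supports the half-precision `e` format. The helper names `fp16_fields` and `fp32_fields` are hypothetical and are not part of the OpenVINO™ toolkit.

```python
import struct


def fp16_fields(value):
    """Split the FP16 encoding of value into (sign, exponent, mantissa) fields."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    sign = bits >> 15                # 1 sign bit
    exponent = (bits >> 10) & 0x1F   # 5 exponent bits
    mantissa = bits & 0x3FF          # 10 mantissa bits
    return sign, exponent, mantissa


def fp32_fields(value):
    """Split the FP32 encoding of value into (sign, exponent, mantissa) fields."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31                # 1 sign bit
    exponent = (bits >> 23) & 0xFF   # 8 exponent bits
    mantissa = bits & 0x7FFFFF       # 23 mantissa bits
    return sign, exponent, mantissa


# Storage per value in bytes: FP16 needs half as much as FP32.
print(struct.calcsize("<e"), struct.calcsize("<f"))  # 2 4

print(fp16_fields(1.0))   # (0, 15, 0)
print(fp32_fields(1.0))   # (0, 127, 0)
print(fp16_fields(-2.5))  # (1, 16, 256)
```

The stored exponents 15 and 127 are the biased encodings of an exponent of zero in each format, which is why 1.0 has an all-zero mantissa in both.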
Additional information about FP16 and FP32 is found in the table below.
Using FP16 enables developers to train and run inference on deep learning models faster.
Table 1. FP16 and FP32
| FP16 | FP32 |
| --- | --- |
| Most weights and gradients fall in the 16-bit floating-point range. | For deep learning, in most cases, the full precision and magnitude of FP32 are not needed. |
| For gradients that do not fall in the 16-bit range, scaling the gradient up works to achieve convergence. | The range can represent numbers smaller and larger than what you need, with enough precision to distinguish numbers. |
| FP16 cuts the number of bits in half: the exponent (magnitude) goes from 8 bits to 5 and the mantissa (precision) from 23 bits to 10. | FP32 reserves 8 bits for the magnitude and 23 bits for the precision. Most neural networks do not need all that precision or magnitude. |
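The gradient scaling mentioned in the table can be illustrated with a small sketch. It rounds values to FP16 with the standard `struct` module (Python 3.6 or later). The scale factor of 1024 and the helper name `to_fp16` are assumptions chosen for illustration, not values taken from the toolkit.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest FP16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


SCALE = 1024.0  # hypothetical scale factor; a power of two keeps the scaling exact

grad = 1e-8                      # a gradient below the smallest FP16 magnitude
unscaled = to_fp16(grad)         # underflows: the gradient is lost
scaled = to_fp16(grad * SCALE)   # scaled up, it lands inside the FP16 range
recovered = scaled / SCALE       # unscale again in higher precision

print(unscaled)   # 0.0
print(recovered)  # close to 1e-8, within FP16 rounding error
```

Without scaling, the gradient rounds to zero and contributes nothing to the weight update. With scaling, it survives the FP16 round trip with a small rounding error.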
Why is this important?
One challenge with computer vision, especially on prototype boards, is having enough processing power to develop machine vision applications. Vision accelerators such as the Intel® NCS 2 enable developers to bring products to market quickly.
- Rapid prototyping with an accelerator.
- Low power consumption: The Intel® NCS 2 is a low-power device designed to run on USB 2.0 or 3.0. The host board (a Raspberry Pi*, for example) supplies power to the stick through its USB port, while the Pi itself is powered over micro USB.
- Small form factor: Developers can add this accelerator to development boards such as the UP Squared* board and use the Intel® Distribution of OpenVINO™ toolkit with out-of-the-box FP16 pre-trained models to prototype solutions involving detection, recognition, and segmentation.
- Low cost of hardware: Intel® NCS 2.
Use Cases for Intel® Neural Compute Stick 2 half-precision Floating Point
There are several pre-trained models optimized to use FP16. For more information about available pre-trained models, visit the Pre-trained Models page.
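Loading an FP16 pre-trained model onto the stick might look like the sketch below. It assumes the older Inference Engine Python API (`openvino.inference_engine.IECore` with `read_network` and `load_network`) and the `MYRIAD` device name used for the Intel® NCS 2. Newer toolkit releases use a different API, so check the documentation for the installed version. The directory layout, the model name, and the helper names `ir_paths` and `load_on_ncs2` are illustrative assumptions.

```python
from pathlib import Path


def ir_paths(model_dir, model_name, precision="FP16"):
    """Build the .xml and .bin paths of an IR model.

    Assumes a <model_dir>/<model_name>/<precision>/ layout (an assumption;
    adjust to wherever the model files were actually downloaded).
    """
    base = Path(model_dir) / model_name / precision
    return base / f"{model_name}.xml", base / f"{model_name}.bin"


def load_on_ncs2(xml_path, bin_path, device="MYRIAD"):
    """Load an IR model onto the stick (requires OpenVINO and the hardware)."""
    # Imported here so that ir_paths() works without OpenVINO installed.
    from openvino.inference_engine import IECore

    ie = IECore()
    # With several sticks attached, device names carry a suffix, hence startswith.
    if not any(name.startswith(device) for name in ie.available_devices):
        raise RuntimeError(f"No {device} device found")
    net = ie.read_network(model=str(xml_path), weights=str(bin_path))
    return ie.load_network(network=net, device_name=device)


# Hypothetical download location and example model name.
xml_path, bin_path = ir_paths("models/intel", "face-detection-adas-0001")
print(xml_path)
print(bin_path)
# exec_net = load_on_ncs2(xml_path, bin_path)  # run on a machine with the stick attached
```

The precision is kept as a parameter because the Intel® NCS 2 runs FP16 models, so the same helper can point at FP32 files when targeting a different device.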
The table below lists half-precision floating point (FP16) reference implementations that can be deployed on the Intel® NCS 2 to address vertical use cases such as Digital Security and Surveillance (DSS), Retail, and Industrial Smart Factory.
Table 2. Pre-Built Projects: Open Source Reference Implementations
| Open Source Reference Implementations | Use Cases |
| --- | --- |
| Intruder Detector: Build an application that alerts you when someone enters a restricted area. Learn how to use models for multi-class object detection. | Record and send alerts on activity in controlled spaces |
| Machine Operator Monitor: Send notifications when an employee appears to be distracted while operating machinery. Versions: Google Go* Machine Operator Monitor | |
| Restricted Zone Notifier: Secure work areas and send alerts if someone enters the restricted space. Versions: Python Restricted Zone Notifier, Go Restricted Zone Notifier | |
| Shopper Gaze Monitor: Build a solution to analyze customer expressions and reactions to product advertising collateral that is positioned on retail shelves. | |
| Shopper Mood Monitor: Detect the mood of shoppers when looking at a retail or kiosk display. Versions: Go Shopper Mood Monitor | |
| Store Traffic Monitor: Monitor three different streams of video that count people inside and outside of a facility. This application also counts product inventory. | |
| Parking Lot Tracker: Receive or post information on available parking spaces by tracking how many vehicles enter and exit a parking lot. Versions: Go Parking Lot Counter | |
Conclusion
The Intel® Neural Compute Stick 2 is a cost-effective, low-power, portable solution for prototyping simple solutions that can be scaled. The Intel® Distribution of OpenVINO™ toolkit supports half-precision floating point (FP16), so the Intel® Neural Compute Stick 2 can be used with pre-trained FP16 models.
Further reading and experimentation
- Complete Intel® Distribution of OpenVINO™ toolkit Installation Guides for Linux*, Windows*, and Raspbian
- Five Easy Steps to Deploy the Intel® Distribution of OpenVINO™ toolkit
- Other Intel® Distribution of OpenVINO™ toolkit code samples
- OpenVINO™ toolkit Open Model Zoo
- Migrate NCSDK Applications to Intel® Distribution of OpenVINO™ toolkit
- Optimize Networks for the Intel® Neural Compute Stick (Intel® NCS 2) Device
- Community Forum and Technical Support