How does computer vision work?

Computer vision combines cameras, edge computing, software, and AI to enable systems to perform tasks such as image classification and object detection. Computer vision applications use deep learning to help teach a computer to recognize aspects of an image or video and make predictions about them.

Computer Vision AI

Computer Vision AI Takeaways

Computer vision is a type of AI that enables computers and systems to act on insights derived from images and video data.
Organizations are applying computer vision to a range of use cases to unlock improved automation, efficiency, and value.
Intel provides powerful open source software tools to help developers and data scientists protect and simplify deployments across distributed systems and heterogeneous architectures.
Choosing the right hardware for your computer vision initiative depends on several factors. In this article, find hardware recommendations for three key edge use cases.

Computer vision AI empowers businesses to train computers to act on key insights derived from data collected from multiple visual touchpoints. Successfully training and deploying inferencing models requires the right combination of hardware and software depending on your use case demands.

What Is Computer Vision?

All organizations strive for business, brand, and process improvements to gain competitive differentiation, from delivering an exceptional customer experience to eliminating production line bottlenecks. But it’s not humanly possible to watch every touchpoint and identify where and when these improvements are needed. Organizations are using computer vision technology, a type of artificial intelligence (AI), to train computers to monitor their business from multiple touchpoints and make sense of the overwhelming amounts of visual data being collected.

Computer vision AI combines a variety of components to enable systems to “see” visual data: cameras, edge computing, cloud-based computing, software, and deep learning models trained to recognize aspects of an image or video and make predictions about them. Types of computer vision models include:

Image classification for inspecting an image and assigning it a class label based on the content. For example, an image classification model can be used to predict which images contain a dog, cat, or angry customer.
Image segmentation for identifying objects and extracting them from their background, such as isolating a tumor from surrounding brain tissue in a medical scan.
Object detection for scanning images or videos and finding target objects. Object detection models commonly highlight multiple objects simultaneously and can be used for tasks such as identifying items on shelves for improved inventory management or anomalies in items on a production line.
Feature extraction for isolating useful characteristics captured in an image or video and sharing them with a second AI algorithm, such as search and retrieve image matching. For example, feature extraction can be used to automate traffic monitoring and incident detection.
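The four model types above differ mainly in the shape of their output. The following is a minimal sketch of that difference, not a real model: it swaps deep learning for a simple brightness threshold on a made-up 4x4 image, and every name in it (IMAGE, Detection, classify, segment, detect, extract_features, cosine_similarity) is hypothetical.

```python
from dataclasses import dataclass
from math import sqrt

# Toy 4x4 grayscale "image" with brightness values from 0 to 255 (made-up data).
IMAGE = [
    [10, 12, 11, 9],
    [10, 200, 210, 9],
    [11, 205, 220, 10],
    [9, 10, 12, 11],
]


@dataclass
class Detection:
    label: str
    box: tuple  # (left, top, right, bottom), inclusive pixel indices


def classify(image, threshold=128):
    """Image classification: a single label for the whole image."""
    has_bright = any(px > threshold for row in image for px in row)
    return "bright_object" if has_bright else "background_only"


def segment(image, threshold=128):
    """Image segmentation: one foreground/background decision per pixel."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def detect(mask, label="bright_object"):
    """Object detection: a labeled bounding box around the foreground pixels.

    A real detector returns many boxes; this toy version returns at most one.
    """
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return []
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return [Detection(label, (min(xs), min(ys), max(xs), max(ys)))]


def extract_features(image):
    """Feature extraction: a compact vector (here, mean brightness per row)."""
    return [sum(row) / len(row) for row in image]


def cosine_similarity(a, b):
    """Compare two feature vectors, as an image-matching step might."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    mask = segment(IMAGE)
    print(classify(IMAGE))
    print(mask)
    print(detect(mask))
    print(cosine_similarity(extract_features(IMAGE), extract_features(IMAGE)))
```

In a production system each function would be replaced by a trained deep learning model, but the outputs keep the same general shape: one label, a per-pixel mask, a list of labeled boxes, and a feature vector handed to a second algorithm.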

Computer vision is enabling a range of new use cases, helping companies across industries reduce operational costs, unlock business automation, and create new services or revenue streams. Here are just a few examples:

Medical imaging: GE Healthcare launched an application that uses computer vision AI algorithms to detect critical findings in chest X-rays, including a life-threatening lung condition called pneumothorax.
Smart retail store solutions: Smart cameras monitoring shelves in retail stores can track inventory data and instantly notify staff when an item is out of stock. Computer vision embedded in digital signage enables retailers to measure which types of customers looked at which marketing messages, allowing stores to improve the effectiveness of in-store signage. Grocery store chain Town Talk Foods used AI video analytics to optimize marketing, operations, and merchandizing, enabling them to reach their annual sales goal 17 percent faster.¹
Athletic movement tracking: Intel created an AI platform that scans videos of athletes taken on mobile phones and extracts key information about an athlete’s form and motion, helping athletes and coaches make key adjustments.

Identifying the Right Hardware and Software Technologies for Computer Vision Applications

Given the transformative benefits of computer vision, many organizations are looking to leverage this technology within multiple units of the business. Adopting computer vision solutions requires training or fine-tuning computer vision AI models—fueling them with data to enable advanced capabilities—as well as deploying the AI vision workload wherever it is needed.

Ultimately, enabling an impactful, cost-effective, scalable computer vision solution requires the right mix of AI hardware and software tools, selected carefully based on business and technology requirements. Let’s examine how you can use the Intel® portfolio to meet the needs of virtually any computer vision use case while complying with your business requirements.

Accelerating Development and Data Science with Intel® Tools and Optimizations

Computer vision deployment can entail a considerable lift from developers and data scientists. To help streamline model development and deployment and optimize performance, Intel offers end-to-end AI pipeline software, including optimizations for popular frameworks such as TensorFlow, PyTorch, and scikit-learn.

Additionally, we offer a portfolio of developer resources to help dramatically simplify deployment, including the Intel® Distribution of OpenVINO™ toolkit, which enables your team to write AI solution code once and then deploy it virtually anywhere. As an open source framework, OpenVINO™ allows you to avoid vendor lock-in and build applications that can seamlessly scale across heterogeneous hardware, from edge to cloud.
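As a rough illustration of the "write once, deploy anywhere" idea, the sketch below loads a model once and compiles it for whichever device is present. It assumes a recent OpenVINO™ release where `import openvino as ov` works (older releases use a different import path). The helper names pick_device and compile_for_best_device, the GPU-then-CPU preference order, and the model path are hypothetical choices for this example, not part of the toolkit.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first preferred device that is present, else "AUTO".

    OpenVINO reports device names such as "CPU", "GPU", or "GPU.0".
    The preference order is an assumption made for this sketch.
    """
    for name in preferred:
        if any(dev == name or dev.startswith(name + ".") for dev in available):
            return name
    # "AUTO" asks OpenVINO to choose a device itself.
    return "AUTO"


def compile_for_best_device(model_path):
    """Load a model once and compile it for the hardware that is present."""
    # Third-party package, imported lazily so pick_device works without it.
    import openvino as ov

    core = ov.Core()
    device = pick_device(core.available_devices)
    model = core.read_model(model_path)
    return core.compile_model(model, device)


if __name__ == "__main__":
    # The same application code runs on an edge box with only a CPU
    # or on a server that also exposes a GPU.
    print(pick_device(["CPU"]))
    print(pick_device(["CPU", "GPU.0"]))
```

Only the device string changes between targets; the model file and the surrounding application code stay the same.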

To make it even easier to unlock faster time to value, we’re also pioneering Intel® Geti™, an open source, enterprise-class computer vision platform that enables noncoder domain experts to collaborate with data scientists to quickly build and train AI models.

Combined with our broad portfolio of hardware, Intel open software tools can streamline the AI journey from concept to production while ensuring the performance you need and accelerating ROI. The power of a combined Intel® AI computer vision platform enables you to address all aspects of the AI pipeline with confidence.

Explore our entire portfolio of data science and developer tools.

Model Training and Deployment: Choosing Hardware Optimized for Your Needs

Because computer vision applications are diverse, infrastructure needs vary greatly based on the problem you’re trying to solve, where data is trained and analyzed, and the size of the workload. In this section, we share three key questions to consider as you select the best hardware for your use cases.

Consideration 1: Where Will Computer Vision Model Training Happen?

To identify the optimal hardware for your training workload, first consider whether your AI strategy and use case require model training or fine-tuning capabilities at the edge, in the cloud and data center, or both. Cost or security constraints may prevent you from training models in the cloud. Additionally, you may be able to take advantage of off-peak cycles available on your edge servers, allowing you to train where the data is generated.

For lightweight training at the edge, such as fine-tuning models, we recommend a server-class Intel® Xeon® Scalable processor with Intel® Data Center GPUs.

For training on large data sets, or if you need to train models quickly at the edge, your workload may require additional infrastructure or cloud-based training and inferencing. We recommend considering a distributed training approach that spreads AI workloads across on-premises servers. Using multiple Intel® Xeon® Scalable processors with built-in AI acceleration can unlock efficient and cost-effective training without relying on a GPU.

For very large workloads, you can also take an advanced deep learning accelerator approach using Habana® Gaudi® or discrete data center GPUs such as the Intel® Data Center GPU Flex 140 or 170.

For example, Mobileye applications in autonomous vehicles use computer vision to detect and respond to pedestrians, other vehicles, and traffic signals. To do so, applications need to process hundreds of images per second and must run massive models on a continual basis, making training a key contributor to operational costs. To increase training efficiency, Mobileye used Habana® Gaudi® to train models in the cloud, improving price performance by as much as 40 percent².

Consideration 2: Do You Need a Rugged Form Factor for Deployment?

Another critical consideration for selecting hardware for your computer vision solution is whether you’re deploying in a traditional IT environment or a location with unique environmental challenges. Demanding environments such as industrial factory floors, warehouses, cell towers, or watercraft require rugged devices that are protected against dust, vibration, extended temperatures, and other harsh conditions. Intel offers a range of IoT and embedded processors with integrated GPUs for rugged and small form factor devices as well as general-purpose server CPUs for standard environments and workloads that require more throughput.

Consideration 3: During Operation, Do You Need to Analyze and Process Video Data at the Edge?

If you need to process a large volume of image or video data, it may be too costly to upload data to the cloud, or regulatory or security requirements may prevent you from sending data to the cloud. You could also be facing latency requirements that prohibit cloud processing. Additionally, you may need your model to continue processing at the edge when the cloud is inaccessible or may need your model to analyze data and react quickly.
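To make the cost and latency reasoning above concrete, here is a back-of-the-envelope sketch. The 4 Mbps per-camera bitrate, the sample latencies, and the function names are illustrative assumptions, not figures from Intel; substitute your own measurements.

```python
def monthly_upload_terabytes(num_cameras, mbps_per_camera, hours_per_day=24, days=30):
    """Estimate how much video would be sent to the cloud in one month."""
    seconds = hours_per_day * 3600 * days
    megabits = num_cameras * mbps_per_camera * seconds
    # megabits -> megabytes -> terabytes (decimal units)
    return megabits / 8 / 1_000_000


def cloud_meets_latency(budget_ms, network_round_trip_ms, inference_ms):
    """Check whether a cloud round trip plus inference fits the response budget."""
    return network_round_trip_ms + inference_ms <= budget_ms


if __name__ == "__main__":
    # Ten cameras streaming continuously at an assumed 4 Mbps each.
    print(f"{monthly_upload_terabytes(10, 4):.2f} TB per month")
    # A 100 ms reaction budget with an assumed 80 ms round trip and 30 ms inference.
    print(cloud_meets_latency(100, 80, 30))
```

Under these assumptions, ten cameras alone generate about 12.96 TB of upload per month, and the cloud round trip misses the 100 ms budget, which is the kind of result that pushes processing to the edge.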

If your initiative requires edge processing to meet any of these concerns, it likely requires one of these three types of solutions: an onboard edge device or general IoT device, a device-edge video AI box, or an on-premises edge video AI server.

Use Case: On-Board Edge Device or General IoT Device for Low-Latency Deployments

Computer vision technology is commonly embedded in edge or IoT devices such as drones, robot arms, or smart cameras. These deployments are autonomous, space- or power-constrained, or require the lowest possible AI latency. Intel offers purpose-built, low-power products that meet the diverse requirements and constraints of edge and embedded devices.

Depending on the complexity of your workloads, consider one of the following hardware combinations:

Intel® Core™ processors with built-in acceleration capabilities, such as Intel® Gaussian & Neural Accelerator (Intel® GNA) for an ultralow power accelerator, and an integrated GPU.
Intel® Core™ processors and an Intel® Iris® Xᵉ Graphics discrete or integrated GPU.
Intel® Xeon® D processors with integrated Intel® Advanced Vector Extensions 512 (Intel® AVX-512) acceleration and the addition of a discrete GPU.

Read how a computer vision application built on Intel® hardware helped Signify, a lighting manufacturer, streamline quality control on the product line.

Use Case: Device-Edge Video AI Box for Medium-Complexity Deployments

Device-edge video AI box deployments feature a small number of cameras (around four to 10) that are directly attached to or stream to a single application for onboard AI processing. For example, computer vision applications in retail self-checkout stations use multiple sensors to identify which products a customer is purchasing, enabling customers to perform faster transactions and helping stores prevent theft. Since they require fewer camera streams, these deployments can be effectively supported using low-power processors.

If you need to protect against dust, grease, or other contaminants in your environment, we recommend semi-rugged or fully rugged hardware with passive or external cooling instead of an open chassis fan.

For small- to medium-complexity workloads and deployment in any environment:

Intel® Core™ processors with built-in acceleration capabilities, such as Intel® Gaussian & Neural Accelerator (Intel® GNA) for an ultralow power accelerator, and integrated GPUs.

For medium- to large-complexity workloads and deployment in standard IT environments and environments requiring a semi-rugged design, consider the following additional options:

Intel® Core™ processors and a discrete GPU, such as Intel® Arc™ graphics
4th Gen Intel® Xeon® Scalable processors for IoT Edge with built-in acceleration from Intel® Advanced Matrix Extensions (Intel® AMX) and a discrete data center GPU such as the Intel® Data Center GPU Flex 140 or 170

Use Case: On-Premises Edge Video AI Server for Advanced Deployments

For some advanced use cases, such as a medical imaging application that applies computer vision AI algorithms to detect diseases in X-ray results, your deployment may feature many remote cameras—sometimes 300 or more—that stream to a single, on-premises device for AI processing. As these deployments may support many cameras and run several computer vision models, you may need to consider hardware that delivers significant processing power.
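A quick way to see why camera count drives hardware choice is to estimate the inference rate the device must sustain. This is a minimal sketch: the 15 frames per second, the three-model pipeline, and the function name are assumptions chosen for illustration, and only the camera counts (10 and 300) come from this article.

```python
def required_inferences_per_second(num_cameras, fps_per_camera, models_per_frame=1):
    """Every frame from every camera passes through every model once."""
    return num_cameras * fps_per_camera * models_per_frame


if __name__ == "__main__":
    # Device-edge video AI box: 10 cameras, one model, assumed 15 fps.
    print(required_inferences_per_second(10, 15))
    # On-premises video AI server: 300 cameras, three models, assumed 15 fps.
    print(required_inferences_per_second(300, 15, models_per_frame=3))
```

With these assumed numbers, the video AI box needs about 150 inferences per second while the on-premises server needs 13,500, a 90-fold difference that explains the step up to server-class processors and discrete GPUs.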

You’ll also need to consider environmental conditions. If the video AI server will be located in a standard IT environment, you can choose hardware suited for a standard data center server or enterprise workstation. However, for environments with harsh conditions, you’ll need a rugged modular server.

For deployment in a standard IT environment, depending on your workload, consider:

4th Gen Intel® Xeon® Scalable processors
12th Gen Intel® Core™ processors plus third-party accelerators
4th Gen Intel® Xeon® Scalable processors and a discrete data center GPU, such as the Intel® Data Center GPU Flex 140 or 170

When deployment requires a rugged design, depending on your workload, consider:

4th Gen Intel® Xeon® Scalable processors for IoT Edge
4th Gen Intel® Xeon® Scalable processors and a discrete data center GPU, such as the Intel® Data Center GPU Flex 140 or 170

Read how Hellometer’s computer vision-based restaurant automation solution is helping brands improve the drive-through experience. Using Intel® Core™ mobile processors with built-in AI acceleration and OpenVINO™ software, Hellometer enables restaurant operators to improve service speed by 47 seconds on average, which equates to an estimated USD 130K in added revenue per year per location³.

Building a Future-Proof Computer Vision Application with Intel® Vision Solutions

Intel offers a broad hardware portfolio and end-to-end AI pipeline software tools that can help you build a computer vision application with the right balance of performance and cost. Diverse hardware options deliver the processing power you need to deploy computer vision in any environment. Best of all, developers and data scientists can use our open source software tools, such as the Intel® Distribution of OpenVINO™ toolkit, to develop and optimize applications that easily scale across heterogeneous devices. By changing a few lines of code, you can equip a computer vision AI model trained on thousands of deep learning accelerators to run on a drone.

Find Intel® solutions that can power AI at every stage of the journey, at any scale.

Explore the Intel® Vision portfolio for computer vision solutions

Computer Vision AI: Increase Automation and Efficiency by Seeing Data in a New Way