In this paper, we explore FPGA minifloat implementations (floating-point representations with non-standard exponent and mantissa sizes), and show the use of a block floating-point implementation that shares the exponent across many numbers, reducing the logic required to perform floating-point operations.
In this paper, we introduce a domain-specific approach to overlays that leverages both software and hardware optimizations to achieve state-of-the-art performance on the FPGA for neural network acceleration.
This paper examines flexibility and its impact on FPGA design methodology, physical design tools, and computer-aided design (CAD). We describe the degrees of flexibility required to create efficient deep learning accelerators.
This white paper examines the future of deep neural networks, including sparse networks, low precision, and ultra-low precision, and compares the performance of Intel® Arria® 10 and Intel® Stratix® 10 FPGAs against NVIDIA graphics processing units (GPUs).
- Accelerating Deep Learning with the OpenCL™ Platform and Intel® Stratix® 10 FPGAs
This white paper describes how Intel® FPGAs leverage the OpenCL™ platform to meet the image processing and classification needs of today's image-centric world.
This white paper provides a detailed look at the architecture and performance of our Deep Learning Accelerator intellectual property (IP) core.
Build high-performance computer vision applications with integrated deep learning inference
The Intel® Vision Accelerator Design with Intel Arria 10 FPGA offers exceptional performance, flexibility, and scalability for deep learning and computer vision solutions.