Deep learning networks have achieved state-of-the-art accuracies on computer vision workloads like image classification and object detection. The performant systems, however, typically involve big models with numerous parameters. Once trained, a challenging aspect for such top performing models is deployment on resource constrained inference systems -- the models (often deep networks or wide networks or both) are compute and memory intensive...
Authors
Debbie Marr
Related Content
WRPN: Wide Reduced-Precision Networks
For computer vision applications, prior works have shown the efficacy of reducing numeric precision of model parameters (network weights) in...
Mixed Precision Training of Convolutional Neural Networks Using...
The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular,...
Intel® nGraph™: An Intermediate Representation, Compiler, and Executor...
The Deep Learning (DL) community sees many novel topologies published each year. Achieving high performance on each new topology remains...
A Progressive Batching L-BFGS Method for Machine Learning...
The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent...