This paper presents incremental network quantization (INQ), a novel method for efficiently converting any pre-trained full-precision convolutional neural network (CNN) model into a low-precision version whose weights are constrained to be either powers of two or zero. Unlike existing methods, which struggle with noticeable accuracy loss, our INQ has the potential to resolve this issue, benefiting from two innovations...
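The projection at the heart of this constraint maps each weight to the nearest value in a set of the form {0, ±2^n2, ..., ±2^n1}. The sketch below illustrates that step in NumPy under the paper's bit-width convention (one bit reserved for the zero level); the function name and code are illustrative, not the authors' implementation, and the full INQ method additionally partitions weights into groups, quantizes one group at a time, and re-trains the remaining full-precision weights to recover accuracy.

```python
import numpy as np

def quantize_pow2(weights, bit_width=5):
    """Project weights onto the set {0, +/-2^n2, ..., +/-2^n1}.

    A minimal sketch of the power-of-two projection step; the name
    `quantize_pow2` and its arguments are illustrative, not from the paper.
    """
    s = float(np.max(np.abs(weights)))
    assert s > 0, "expects at least one nonzero weight"
    n1 = int(np.floor(np.log2(4.0 * s / 3.0)))   # largest exponent, set by max |w|
    n2 = n1 + 1 - (2 ** (bit_width - 1)) // 2    # smallest exponent for the bit budget
    signs = np.sign(weights)
    mags = np.abs(weights)
    # floor(log2(4|w|/3)) rounds |w| to the nearest power of two in linear space.
    with np.errstate(divide="ignore"):
        exps = np.clip(np.floor(np.log2(4.0 * mags / 3.0)), n2, n1)
    q = signs * np.power(2.0, exps)
    q[mags < 2.0 ** (n2 - 1)] = 0.0              # too small to represent: prune to zero
    return q

# Example: quantize a random 3x3 weight matrix to 5-bit power-of-two levels.
w = 0.1 * np.random.randn(3, 3)
print(quantize_pow2(w))
```

With a bit-width of 5 this yields 16 nonzero power-of-two levels plus zero, so every multiplication by a quantized weight can in principle be replaced by a bit shift.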
Authors
Yurong Chen
Senior Research Director & Principal Research Scientist, Cognitive Computing Lab, Intel Labs China
Aojun Zhou
Lin Xu