Deep Neural Network Compression

Use of dynamic programming to better quantize deep neural networks

Exact optimal clustering in 1-dimension can be done in polynomial time. As such, during the training of a deep neural network, an additional loss term that incentivizes parameters to be quantization and compression friendly can be calculated to exact value. In experiment on well known image classification models, our quantization algorithm broke several compression record at the time, without harming model accuracy.