Sunday, March 16, 2014

Daniel Soudry: March 19th

Title: Mean Field Bayes Backpropagation: parameter-free training of multilayer neural networks with real and discrete weights

Recently, Multilayer Neural Networks (MNNs) have been trained to achieve state-of-the-art results in many classification tasks. The usual goal of training is to estimate the parameters of an MNN, its weights, so that they minimize some cost function. In theory, given a cost function, the optimal estimate can be found from the posterior distribution of the weights given the data, which can be updated through Bayes' theorem. In practice, this Bayesian approach is intractable. To circumvent this problem, we approximate the posterior using a factorized distribution and the central limit theorem. The resulting Mean Field Bayes BackPropagation algorithm is very similar to the standard BackPropagation algorithm. However, it has several advantages: (1) Training is parameter-free, given initial conditions (the prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. In numerical tests on MNIST, the algorithm achieves the same performance level as BackPropagation with the optimal constant learning rate. (2) The weights can be restricted to discrete values. This is especially useful for implementing trained MNNs in precision-limited hardware chips, where it can improve speed and energy efficiency by several orders of magnitude, thus enabling integration into small, low-power electronic devices. We show that on MNIST, the algorithm can be used to train MNNs with binary weights with only a mild reduction in performance, in contrast to weight quantization, which significantly increases the error.
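To make the combination of a factorized distribution and the central limit theorem a little more concrete, here is a minimal Python sketch for a single neuron with binary weights. It is not the algorithm from the talk: the function names, the choice of weights in {-1, +1}, the sign-threshold neuron, and the random inputs are all assumptions made for illustration. It only shows that, when the weights are independent, the neuron's summed input is close to Gaussian, so the probability that it fires can be computed in closed form rather than by summing over all weight configurations.

```python
import math
import random


def preactivation_moments(x, p):
    """Mean and variance of u = sum_i w_i * x_i for independent binary weights.

    Assumption (illustrative): w_i is +1 with probability p[i] and -1 otherwise.
    """
    mean = sum(xi * (2.0 * pi - 1.0) for xi, pi in zip(x, p))
    var = sum(xi * xi * (1.0 - (2.0 * pi - 1.0) ** 2) for xi, pi in zip(x, p))
    return mean, var


def gaussian_prob_positive(mean, var):
    """P(u > 0) when u is approximated as Gaussian (central limit theorem)."""
    if var == 0.0:
        return 1.0 if mean > 0 else 0.0
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * var)))


def monte_carlo_prob_positive(x, p, n_samples, rng):
    """Estimate P(u > 0) by sampling the binary weights directly."""
    hits = 0
    for _ in range(n_samples):
        u = sum(xi if rng.random() < pi else -xi for xi, pi in zip(x, p))
        if u > 0:
            hits += 1
    return hits / n_samples


# Hypothetical inputs and per-weight probabilities, chosen only for the demo.
rng = random.Random(0)
n_inputs = 200
x = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
p = [rng.uniform(0.3, 0.7) for _ in range(n_inputs)]

mean, var = preactivation_moments(x, p)
clt_estimate = gaussian_prob_positive(mean, var)
mc_estimate = monte_carlo_prob_positive(x, p, 5000, rng)
print(f"CLT estimate: {clt_estimate:.3f}, Monte Carlo estimate: {mc_estimate:.3f}")
```

With a few hundred inputs the two estimates should agree closely, which is the kind of approximation that makes a factorized posterior tractable to propagate through a layer.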
