Scaling Equilibrium Propagation to Deeper Neural Network Architectures
Elayedam, Srinivasan
Equilibrium propagation has been proposed as a biologically plausible alternative to the backpropagation algorithm. The local nature of gradient computations, combined with the use of convergent RNNs to reach equilibrium states, makes this approach well-suited for implementation on neuromorphic hardware. However, previous studies on equilibrium propagation have been restricted to networks containing only dense layers or relatively small architectures with a few convolutional layers followed by a final dense layer. These networks have a significant gap in accuracy compared to similarly sized feedforward networks trained with backpropagation. In this work, we introduce the Hopfield-Resnet architecture, which incorporates residual (or skip) connections in Hopfield networks with clipped $\mathrm{ReLU}$ as the activation function. The proposed architectural enhancements enable the training of networks with nearly twice the number of layers reported in prior works. For example, Hopfield-Resnet13 achieves 93.92\% accuracy on CIFAR-10, which is $\approx$3.5\% higher than the previous best result and comparable to that provided by Resnet13 trained using backpropagation.
Equilibrium Propagation (EP) has been proposed as a biologically plausible alternative to backpropagation. The local nature of its gradient computation, combined with the use of convergent RNNs to reach equilibrium states, makes this approach particularly suitable for implementation on neuromorphic hardware. However, previous research on equilibrium propagation has been limited to networks containing only dense layers, or to relatively small architectures with a few convolutional layers followed by a final dense layer. These networks exhibit a significant accuracy gap compared to similarly sized feedforward networks trained with backpropagation. This work introduces the Hopfield-Resnet architecture, which integrates residual (skip) connections within Hopfield networks and employs clipped ReLU as the activation function. The proposed architectural enhancements enable the training of networks with nearly twice the number of layers reported in previous work. For example, Hopfield-Resnet13 achieves 93.92% accuracy on CIFAR-10, approximately 3.5% higher than the previous best result and comparable to Resnet13 trained with backpropagation.
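The two architectural ingredients named above, clipped ReLU and residual connections, can be illustrated with a minimal sketch. It is a simplified feedforward picture, not the energy-based Hopfield formulation used in the paper. The upper clipping bound of 1.0 and the function names `clipped_relu`, `plain_layer`, and `residual_layer` are assumptions made for illustration only.

```python
def clipped_relu(v, upper=1.0):
    """ReLU clipped to [0, upper]; the bound 1.0 is an assumed value."""
    return min(max(v, 0.0), upper)


def plain_layer(h, w):
    """Layer without a skip connection: out_i = clip(sum_j w[i][j] * h[j])."""
    return [clipped_relu(sum(wij * hj for wij, hj in zip(row, h))) for row in w]


def residual_layer(h, w):
    """Layer with a skip connection: out_i = clip(h[i] + sum_j w[i][j] * h[j]).

    The weight matrix must be square so that the input can be added
    elementwise to the pre-activation.
    """
    return [
        clipped_relu(hi + sum(wij * hj for wij, hj in zip(row, h)))
        for hi, row in zip(h, w)
    ]


if __name__ == "__main__":
    h = [0.2, 0.5]
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    print(plain_layer(h, zero_w))     # the signal is lost when the weights are zero
    print(residual_layer(h, zero_w))  # the skip path carries the input through
```

With all-zero weights the plain layer outputs zeros while the residual layer passes its input through unchanged, which is the usual intuition for why skip connections help signals survive in deeper stacks.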
The core problem addressed in this research is the scalability of Equilibrium Propagation (EP) to deep neural networks, which shows up in three ways:
Depth Limitation: Existing EP methods can only effectively train shallow networks (≤6 layers)
Performance Gap: Networks trained with EP exhibit significant performance degradation compared to same-scale networks trained with backpropagation
Biological Plausibility Requirement: The need to maintain the biological plausibility advantages of the EP method
This work investigates how to train deep convolutional neural networks for image classification tasks using the equilibrium propagation method. The input is an image x, the output is a class label y, with the constraint of maintaining the biological plausibility and local gradient computation characteristics of the EP method.
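The local gradient computation mentioned above can be sketched with the standard two-phase EP estimator of Scellier & Bengio (2017) on a one-weight toy model. This is not the Hopfield-Resnet training procedure. The energy E = s²/2 − w·x·s, the squared-error cost, the nudging strength `beta`, and all function names are assumptions chosen so that the result can be checked by hand.

```python
def relax(w, x, y, beta, steps=200, lr=0.5):
    """Settle the state s by gradient descent on F = E + beta * C,
    with E = 0.5 * s**2 - w * x * s and C = 0.5 * (s - y)**2."""
    s = 0.0
    for _ in range(steps):
        dF_ds = (s - w * x) + beta * (s - y)
        s -= lr * dF_ds
    return s


def ep_gradient(w, x, y, beta=0.01):
    """EP estimate of dL/dw from two equilibria (free and nudged)."""
    s_free = relax(w, x, y, beta=0.0)     # free phase: no output nudging
    s_nudged = relax(w, x, y, beta=beta)  # nudged phase: output pulled toward y
    # dE/dw = -x * s uses only quantities local to the weight.
    dE_dw_free = -x * s_free
    dE_dw_nudged = -x * s_nudged
    return (dE_dw_nudged - dE_dw_free) / beta


def true_gradient(w, x, y):
    """Gradient of L = 0.5 * (w * x - y)**2 with respect to w."""
    return (w * x - y) * x


if __name__ == "__main__":
    w, x, y = 0.5, 2.0, 3.0
    print(ep_gradient(w, x, y), true_gradient(w, x, y))
```

For this toy model the estimate equals the true gradient divided by (1 + beta), so it approaches the backpropagation gradient as beta shrinks. Deep networks need many more relaxation steps and careful handling of estimator bias, which this sketch does not address.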
Experiments demonstrate that the training loss of deep networks without residual connections stagnates, while networks with residual connections converge successfully.
This paper extends equilibrium propagation to deeper networks. By adding residual connections and clipped ReLU activations to Hopfield networks, it trains networks with nearly twice the number of layers reported in prior work and reaches CIFAR-10 accuracy comparable to backpropagation. These results improve the practical utility of the EP method and contribute to the development of neuromorphic computing and bio-inspired learning algorithms.