Publication
Nature Communications

Introducing principles of synaptic integration in the optimization of deep neural networks

Abstract

Plasticity circuits in the brain are known to be influenced by the distribution of the synaptic weights through the mechanisms of synaptic integration and local regulation of synaptic strength. However, the complex interplay of stimulation-dependent plasticity with local learning signals is disregarded by most of the artificial neural network training algorithms devised so far. Here, we propose a novel biologically inspired optimizer for artificial and spiking neural networks that incorporates key principles of synaptic plasticity observed in cortical dendrites: GRAPES (Group Responsibility for Adjusting the Propagation of Error Signals). GRAPES implements a weight-distribution-dependent modulation of the error signal at each node of the network. We show that this biologically inspired mechanism leads to a substantial improvement of the performance of artificial and spiking networks with feedforward, convolutional, and recurrent architectures, it mitigates catastrophic forgetting, and it is optimally suited for dedicated hardware implementations. Overall, our work indicates that reconciling neurophysiology insights with machine intelligence is key to boosting the performance of neural networks.

Authors’ notes

We introduce GRAPES (i.e. Group Responsibility for Adjusting the Propagation of Error Signals), an optimization strategy that relies on the notion of node importance in propagating the error information during learning. The node importance identifies the neurons that have a large number of strong connections and are therefore responsible for substantially amplifying or attenuating the input signal during its propagation to the downstream layers. The underlying concept of node importance is inspired by the process of synaptic integration in biological circuits. This is a non-linear mechanism through which dendritic branches receiving input from multiple strong connections have a higher probability of boosting the incoming signal as it travels to the soma, compared to dendritic branches whose incoming connections are, on average, weak.
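As a rough illustration, node importance can be thought of as a per-node statistic of the incoming weight magnitudes, normalized across the layer. The sketch below assumes a simple formulation (summed absolute incoming weights divided by the layer mean); the exact definition used by GRAPES is given in the paper.

```python
# Minimal sketch (not the paper's exact formulation): node importance taken as
# the summed magnitude of a node's incoming weights, normalized by the layer mean.
import numpy as np

def node_importance(W):
    """W has shape (n_inputs, n_nodes); column j holds node j's incoming weights."""
    incoming_strength = np.abs(W).sum(axis=0)             # total incoming strength per node
    return incoming_strength / incoming_strength.mean()   # >1: stronger-than-average node

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(784, 256))                 # hypothetical hidden-layer weights
importance = node_importance(W)                           # shape (256,)
```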

By translating this concept to artificial neural networks, GRAPES, as its name indicates, exploits the collective information of the multiple connections of a neuron to modulate the parameter updates during learning — that is, synaptic plasticity is influenced by the distribution of the weights within layers. Our results demonstrate that GRAPES not only provides substantial improvements in the performance of deep artificial and spiking neural networks, but it also mitigates the accuracy degradation due to catastrophic forgetting, which prevents artificial systems from reproducing continual learning of the kind occurring in the human brain.
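In code, such a weight-distribution-dependent modulation could look like the following sketch, in which the per-node importance rescales the backpropagated error of a hidden layer before the weight update. This is an illustrative approximation under the assumptions above, not the exact update rule of GRAPES.

```python
# Illustrative sketch of a GRAPES-like update for one hidden layer (not the exact
# rule from the paper): each node's backpropagated error is rescaled by its importance.
import numpy as np

def grapes_like_update(W, x, delta, lr=0.1):
    """W: (n_in, n_out) weights, x: (n_in,) layer input, delta: (n_out,) backprop error."""
    strength = np.abs(W).sum(axis=0)
    importance = strength / strength.mean()     # weight-distribution-dependent factor
    modulated_delta = delta * importance        # amplify error at "important" nodes
    W -= lr * np.outer(x, modulated_delta)      # standard outer-product gradient step
    return W

rng = np.random.default_rng(1)
W = rng.normal(0.0, 0.1, size=(784, 256))
x = rng.random(784)
delta = rng.normal(0.0, 0.01, size=256)         # error arriving from the layer above
W = grapes_like_update(W, x, delta)
```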

These results open a new avenue towards narrowing the gap between backpropagation and biologically plausible learning schemes.

Catastrophic forgetting

Catastrophic forgetting refers to the phenomenon affecting neural networks by which learning a new task causes a sudden and dramatic degradation of the knowledge previously acquired by the system. This represents a key limitation of current AI systems, which struggle to reproduce continual learning. In contrast to current solutions, we show that applying GRAPES mitigates the effects of catastrophic forgetting without introducing additional training steps, such as replaying past inputs.
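Catastrophic forgetting is typically quantified by training on tasks in sequence and re-testing earlier tasks afterwards. The sketch below shows that generic evaluation protocol; `train` and `evaluate` are hypothetical placeholders and the two-task split is only an example, not the benchmark used in the paper.

```python
# Generic protocol for quantifying catastrophic forgetting (hypothetical helpers):
# train on task A, record its accuracy, train on task B, then re-test task A.
def forgetting_score(model, task_a, task_b, train, evaluate):
    train(model, task_a)
    acc_a_before = evaluate(model, task_a)   # accuracy on A right after learning A
    train(model, task_b)
    acc_a_after = evaluate(model, task_a)    # accuracy on A after learning B
    return acc_a_before - acc_a_after        # large drop = severe forgetting
```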

We suggest that these properties stem from the fact that GRAPES effectively combines, in the error signal, information about the response to the current input with information about the internal state of the network that is independent of the data sample.

Harvesting GRAPES for neuro-inspired AI

Our optimization method enables faster learning across a wide range of training algorithms and, at the same time, mitigates the phenomenon of catastrophic forgetting. It could therefore benefit applications that require fast learning when new inputs arrive with a non-uniform distribution, for example a robot learning as it explores a new environment.

Fig. 1: GRAPES
(a) In biological synapses, during the process of synaptic integration, dendritic spikes can enhance the impact of synchronous inputs from dendrites belonging to the same dendritic tree. Excitatory postsynaptic potentials (EPSPs) with the same amplitude but different locations in the dendritic tree may lead to different responses. For example, dendrites i, iv, and viii send similar signals, but only i and iv contribute to driving an action potential (AP), since their respective dendritic trees receive sufficient further excitation from other connected dendrites. In the top image, the postsynaptic neuron (dark blue) receives inputs mostly from dendrites generating strong EPSPs (orange) and only a few generating weak EPSPs (yellow). The bottom postsynaptic neuron (light blue) receives most inputs from weak-EPSP dendrites. Because of this dendritic distribution, the dark blue neuron exhibits a higher firing probability, and thus its importance is higher than that of the light blue neuron. (b) The structure of a fully connected neural network (FCNN) is much simpler than that of biological neurons with presynaptic connections arranged in dendritic trees. However, analogously to panel (a), the importance of each node arises from the distribution of the weight strengths within each layer. The dark blue node has a high node importance since most of its incoming synapses are strong. Conversely, the importance of the light blue node is lower, since its presynaptic population exhibits a weaker mean strength.

Additionally, biologically plausible training schemes such as feedback alignment benefit greatly from our optimization method. Hardware devices that cannot support backpropagation, but only feedback alignment, are therefore a promising area of application for our work.
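For reference, feedback alignment replaces the transposed forward weights in the backward pass with fixed random feedback matrices. The sketch below shows that backward step for one hidden layer, with a GRAPES-like rescaling added as an assumption of how the two could be combined; it is an illustration, not the paper's implementation.

```python
# Sketch of a feedback-alignment backward step for one hidden layer (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hidden, n_out = 784, 256, 10
W1 = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
W2 = rng.normal(0.0, 0.1, size=(n_hidden, n_out))
B2 = rng.normal(0.0, 0.1, size=(n_out, n_hidden))    # fixed random feedback matrix

x = rng.random(n_in)
h = np.tanh(x @ W1)                                   # hidden activity
y = h @ W2                                            # output (linear readout for brevity)
error = y - np.eye(n_out)[3]                          # toy error against a one-hot target

delta_h = (error @ B2) * (1.0 - h**2)                 # feedback alignment: B2, not W2.T
strength = np.abs(W1).sum(axis=0)
delta_h *= strength / strength.mean()                 # assumed GRAPES-like modulation
W1 -= 0.1 * np.outer(x, delta_h)
W2 -= 0.1 * np.outer(h, error)
```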

Furthermore, GRAPES improves the performance of spiking neural networks (SNNs). SNNs offer an energy-efficient alternative for implementing deep learning applications; however, they still lag behind artificial neural networks (ANNs) in terms of accuracy. Our work paves the way for biologically inspired algorithms to narrow the gap between the performance of SNNs and ANNs, enabling applications in the rapidly growing field of neuromorphic chips.

Conclusion

Our results demonstrate that GRAPES not only provides substantial improvements in the performance of deep artificial and spiking neural networks, but it also mitigates the accuracy degradation due to catastrophic forgetting.