Publication
NeurIPS 2024
Workshop paper

Memorization to Generalization: The Emergence of Diffusion Models from Associative Memory

Abstract

Hopfield networks are associative memory systems, designed for storing and retrieving specific patterns as local minima of an energy landscape. In the classical Hopfield model, an interesting phenomenon occurs when the model's memorization capacity reaches its critical memory load: spurious states, or unintended stable points, emerge at the end of the retrieval dynamics. These states often appear as mixtures of the stored patterns, leading to incorrect recall. In this work, we propose that these spurious states are not necessarily a negative feature of retrieval dynamics, but rather that they mark the onset of generalization. We employ diffusion models, commonly used in generative modelling, to demonstrate that their generalization stems from a phase transition which occurs as the number of training samples is increased. In the low-data regime, the model exhibits a strong memorization phase, where the network creates a distinct basin of attraction for each sample in the training set, akin to the Hopfield model below the critical memory load. In the large-data regime, a different phase appears, in which an increase in the training set size fosters the creation of new attractor states that correspond to manifolds of the generated samples. Spurious states appear at the boundary of this transition and correspond to emergent attractor states, which are absent from the training set but still have a distinct basin of attraction around them. From the perspective of the Hopfield description, these spurious states correspond to mixtures of "fundamental memories" which facilitate generalization through the superposition of underlying features, resulting in the creation of novel samples. Our findings provide a novel perspective on the memorization-generalization phenomenon in diffusion models through the lens of Hopfield networks, illuminating the previously underappreciated view of diffusion models as Hopfield networks above the critical memory load.
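
The notion of a spurious mixture state in the classical Hopfield model can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy size of 8 units, the three hand-built orthogonal patterns, the Hebbian weight rule with zero self-coupling, and the helper names `energy` and `is_fixed_point`. It does not involve diffusion models; it only shows that the sign of the sum of three stored patterns is itself a stable state that was never stored.

```python
import numpy as np

# Toy setup (hypothetical): three mutually orthogonal +/-1 patterns on 8 units.
# Unit x in 0..7 takes value +1 if bit k of x is 0, and -1 otherwise.
x = np.arange(8)
patterns = np.array([1 - 2 * ((x >> k) & 1) for k in range(3)])  # shape (3, 8)
N = patterns.shape[1]

# Hebbian weights of the classical Hopfield model, with no self-coupling.
W = patterns.T @ patterns / N
np.fill_diagonal(W, 0.0)

def energy(s):
    """Classical Hopfield energy E(s) = -1/2 * s^T W s."""
    return -0.5 * s @ W @ s

def is_fixed_point(s):
    """True if one sign update of every unit leaves the state unchanged."""
    return np.array_equal(np.sign(W @ s), s)

# Symmetric 3-mixture: the majority vote of the three stored patterns.
mixture = np.sign(patterns.sum(axis=0))

print("stored patterns stable:", all(is_fixed_point(p) for p in patterns))
print("mixture stable:", is_fixed_point(mixture))
print("mixture is a stored pattern:",
      any(np.array_equal(mixture, p) for p in patterns))
print("energy of a stored pattern:", energy(patterns[0]))
print("energy of the mixture:", energy(mixture))
```

In this toy setting the mixture is a fixed point that matches none of the stored patterns, and its energy (-1.5) lies above that of each stored pattern (-2.5), so it is a shallower minimum of the same landscape.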