Applied Research

SGD vs. Bayes in Toy Landscapes

SLT is about Bayesian learning. What can it say about SGD?

Project Details

Status: In progress
Difficulty: Hard
Type: Applied

Team & Contact

Lead: Guillaume Corlouer
Discord: guillaume5439

Update

See Corlouer and Macé’s update here.


SLT is a theory of Bayesian learning. The transitions it describes are “quasistatic”: they have no explicit time dependence and instead reflect an equilibrium distribution (the posterior) changing as a function of the number of samples. What does this say about SGD, which is non-equilibrium and inherently dynamic?

Chen et al. 2023 look at one possible link in terms of the Bayesian Antecedent Hypothesis (BAH): that dynamical transitions in SGD are “backed” by an underlying Bayesian transition. We don’t hold this hypothesis particularly strongly, and it would be interesting to look for violations.

One way to explore this question is to compare the two learning processes directly in toy settings, where both the posterior and the SGD dynamics are cheap to compute. The person to talk to about this is Guillaume Corlouer (guillaume5439 on Discord).
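As a rough illustration of what such a toy comparison could look like, here is a minimal sketch. It is not taken from Chen et al. 2023 or from Corlouer and Macé's update. Everything in it is an assumption made for illustration: the 1D loss with one regular and one degenerate minimum, the uniform prior on [-2, 2], the use of exp(-n * loss) as a stand-in for the posterior, the model of SGD as gradient descent with additive Gaussian gradient noise, and all function names and hyperparameters.

```python
import numpy as np

# Hypothetical toy landscape: local maximum at w = -1/3 separates the two basins.
BARRIER = -1.0 / 3.0


def loss(w):
    # Two zero-loss minima: w = -1 (quadratic, regular) and w = +1 (quartic, degenerate).
    return (w + 1.0) ** 2 * (w - 1.0) ** 4


def grad(w):
    # Analytic derivative of loss(w).
    return 2.0 * (w + 1.0) * (w - 1.0) ** 3 * (3.0 * w + 1.0)


def posterior_mass_degenerate(n, grid_size=20001):
    # Assumption: posterior density proportional to exp(-n * loss(w)) under a
    # uniform prior on [-2, 2], evaluated on a uniform grid.
    # Returns the posterior mass in the basin of the degenerate minimum (w = +1).
    w = np.linspace(-2.0, 2.0, grid_size)
    density = np.exp(-n * loss(w))
    return float(density[w > BARRIER].sum() / density.sum())


def sgd_fraction_degenerate(lr=1e-3, noise_std=5.0, steps=5000, runs=2000, seed=0):
    # Assumption: SGD is modelled as gradient descent plus Gaussian gradient noise.
    # Iterates are clipped to [-2, 2] to match the prior's support.
    # Returns the fraction of independent runs that end in the basin of w = +1.
    rng = np.random.default_rng(seed)
    w = rng.uniform(-2.0, 2.0, size=runs)
    for _ in range(steps):
        noisy_grad = grad(w) + noise_std * rng.standard_normal(runs)
        w = np.clip(w - lr * noisy_grad, -2.0, 2.0)
    return float(np.mean(w > BARRIER))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"n={n}: posterior mass near w=+1 is {posterior_mass_degenerate(n):.3f}")
    print(f"SGD runs ending near w=+1: {sgd_fraction_degenerate():.3f}")
```

Comparing the two numbers while varying n on the Bayesian side and lr, noise_std, and steps on the SGD side gives a crude way to see where the two processes agree and where they come apart. A real study would likely need minibatch noise from actual data rather than the additive Gaussian noise assumed here.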

Where to Begin

Before starting this project, we recommend familiarizing yourself with these resources:

Ready to contribute? Let us know in our Discord community. We'll update this listing so that other people interested in this project can find you.