Spectroscopy at Scale: Finding Interpretable Structure in Pythia-1.4B
Murfet et al.
Susceptibilities as an interpretability technique at 1B+ scale: show and tell from Pythia-1.4B.
Wang and Murfet
Mechanistic interpretability aims to understand how neural networks generalize beyond their training data by reverse-engineering their internal computations.
Gordon et al.
Spectroscopy infers the internal structure of physical systems by measuring their response to perturbations.
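For orientation, the analogy can be made quantitative (a standard statistical-mechanics identity, stated here as background rather than quoted from the paper): for a Gibbs distribution $p_\varepsilon(w) \propto \exp(-\beta(H(w) + \varepsilon V(w)))$ with probe term $V$ and observable $O$, linear response gives the susceptibility as a covariance,

$$\chi = \left.\frac{\partial \langle O \rangle_\varepsilon}{\partial \varepsilon}\right|_{\varepsilon=0} = -\beta\,\mathrm{Cov}(O, V),$$

so measuring responses to small perturbations is equivalent, at first order, to measuring fluctuations of the unperturbed system.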
Elliott et al.
We extend singular learning theory to reinforcement learning, showing that phase transitions in policy development are governed by the local learning coefficient, which detects transitions even when policies appear identical in terms of regret.
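As a concrete anchor for what "local learning coefficient" means operationally, here is a minimal sketch of the standard SGLD-based estimator on a one-dimensional toy loss (the setup and hyperparameters are ours, not the paper's RL experiments): sample from a tempered posterior localized at w_star and compare the expected loss to the loss at w_star.

import numpy as np

rng = np.random.default_rng(0)

def estimate_llc(grad_L, L, w_star, n=10_000, gamma=1.0, eps=1e-4, steps=50_000):
    # lambda_hat = n * beta * (E_posterior[L(w)] - L(w_star)), beta = 1/log n,
    # with samples drawn by SGLD from the localized tempered posterior
    # p(w) ∝ exp(-n * beta * L(w) - (gamma/2) * (w - w_star)^2).
    beta = 1.0 / np.log(n)
    w, losses = float(w_star), []
    for _ in range(steps):
        drift = n * beta * grad_L(w) + gamma * (w - w_star)
        w += -0.5 * eps * drift + np.sqrt(eps) * rng.normal()
        losses.append(L(w))
    e_loss = np.mean(losses[steps // 2:])  # discard burn-in
    return n * beta * (e_loss - L(w_star))

# Toy singular loss L(w) = w^4, whose learning coefficient is 1/4.
lam = estimate_llc(grad_L=lambda w: 4 * w**3, L=lambda w: w**4, w_star=0.0)
print(f"lambda_hat ≈ {lam:.2f}   (theory: 0.25)")

The localization term gamma keeps the chain near w_star, and beta = 1/log n is the usual inverse-temperature choice for this estimator.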
Urdshals et al.
We provide an extension of the MDL principle to singular models like neural networks and empirically test the predicted relationship between complexity and compressibility.
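The predicted relationship rests on Watanabe's free-energy asymptotics, in which the learning coefficient $\lambda$ replaces the parameter-count penalty of classical MDL/BIC; we state the standard form as background (the paper's precise extension may differ):

$$F_n = n L_n(w_0) + \lambda \log n + O(\log\log n), \qquad \text{compared with BIC's } \; n L_n(\hat w) + \tfrac{d}{2}\log n.$$

Since $\lambda \le d/2$ in singular models, a model can be more compressible than its raw parameter count suggests.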
Lee et al.
We study the BIF as a tool for developmental interpretability and show that influence can change dramatically over the course of training, contrary to the classical view of training data attribution.
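For readers new to the object: up to normalization conventions (this is our paraphrase, not the paper's exact definition), the Bayesian influence function (BIF) of a training sample $z$ on an observable $f$ is a covariance under a local tempered posterior,

$$\mathrm{BIF}(z, f) \approx -\,\mathrm{Cov}_{w \sim p_\beta}\big(\ell_z(w), f(w)\big),$$

where $\ell_z$ is the per-sample loss. Because the posterior is taken at a checkpoint, recomputing the covariance across checkpoints is what makes influence a function of training time.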
Adam et al.
We study the kernel induced by the BIF as a tool for interpretability and show that this recovers ground-truth structure in the training distribution.
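A hedged sketch of the kernel idea (the function names and toy data are ours): if you record every training sample's loss at each posterior draw, the covariance of those loss traces across draws defines a kernel over samples, and its block structure can recover latent groups in the data.

import numpy as np

def bif_kernel(per_sample_losses):
    # per_sample_losses: (n_draws, n_samples); row t holds each training
    # sample's loss at posterior draw w_t. Returns the (n_samples, n_samples)
    # covariance of loss traces across draws.
    centered = per_sample_losses - per_sample_losses.mean(axis=0, keepdims=True)
    return centered.T @ centered / (per_sample_losses.shape[0] - 1)

# Toy data with two latent groups: samples in the same group share a
# fluctuation mode, so their losses co-vary across posterior draws.
rng = np.random.default_rng(0)
n_draws, n_per_group = 2_000, 10
factors = rng.normal(size=(n_draws, 2))
losses = np.concatenate([
    factors[:, [0]] + 0.3 * rng.normal(size=(n_draws, n_per_group)),
    factors[:, [1]] + 0.3 * rng.normal(size=(n_draws, n_per_group)),
], axis=1)

K = bif_kernel(losses)
print(K[:10, :10].mean(), K[10:, 10:].mean(), K[:10, 10:].mean())
# within-group entries ≈ 1, cross-group entries ≈ 0: the kernel exposes the groups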
Baker et al.
We introduce "susceptibilities" along with a framework for applying these measurements to discovering structure inside models ("structural inference") and validate this in a small language model.
Chen and Murfet
We develop a geometric account of sequence modelling that links patterns in the data to measurable properties of the loss landscape.
Murfet and Troiani
We develop a correspondence between the structure of Turing machines and the structure of singularities of real analytic functions.
Urdshals and Urdshals · ICML SMUNN Workshop
We study how a one-layer attention-only transformer develops relevant structures while learning to sort lists of numbers.
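As a reference point for the model class (a minimal sketch; the hyperparameters and layer choices are ours, not the paper's): a one-layer attention-only transformer is token and position embeddings, a single causal self-attention block with a residual connection, and an unembedding, with no MLP.

import torch
import torch.nn as nn

class AttnOnlyTransformer(nn.Module):
    # One attention block, no MLP: embed -> causal self-attention -> unembed.
    def __init__(self, vocab_size=64, d_model=32, n_heads=4, max_len=16):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.unembed = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, tokens):  # tokens: (batch, seq) of token ids
        t = tokens.shape[1]
        x = self.tok(tokens) + self.pos(torch.arange(t, device=tokens.device))
        causal = torch.triu(torch.ones(t, t, dtype=torch.bool,
                                       device=tokens.device), diagonal=1)
        attn_out, _ = self.attn(x, x, x, attn_mask=causal)  # True = masked out
        return self.unembed(x + attn_out)  # residual stream -> logits

model = AttnOnlyTransformer()
logits = model(torch.randint(0, 64, (2, 16)))  # shape (batch=2, seq=16, vocab=64)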
Carroll et al.
Modern deep neural networks display striking examples of rich internal computational structure.