
Structural Inference: Interpreting Small Language Models with Susceptibilities

We introduce "susceptibilities" together with a framework, "structural inference", for using these measurements to discover structure inside models, and we validate the approach on a small language model.

Authors

Garrett Baker⁼, George Wang⁼, Jesse Hoogland, Vinayak Pathak, Daniel Murfet

Timaeus · ⁼ Equal contribution

Published

April 25, 2025


Build on our work

Our tools for susceptibilities, local learning coefficients, and SGMCMC sampling are open source in the devinterp library.
