Structural Inference: Interpreting Small Language Models with Susceptibilities

We introduce "susceptibilities", together with a framework ("structural inference") for using these measurements to discover structure inside models, and validate the approach on a small language model.

Authors
Garrett Baker=, George Wang=, Jesse Hoogland, Vinayak Pathak, Daniel Murfet
Timaeus · = Equal contribution
Published
April 25, 2025

Build on our work

Our tools for susceptibilities, local learning coefficients, and SGMCMC sampling are open source and available in the devinterp library.

Work with us

We're hiring Research Scientists, Engineers & more to join the team full-time.

Senior researchers can also express interest in a part-time affiliation through our new Research Fellows Program.