Embryology of a Language Model

We use susceptibilities, introduced previously as a tool for developmental interpretability, to study the embryology of a small language model over the course of training.

Authors
George Wang, Garrett Baker, Andrew Gordon, Daniel Murfet
Timaeus

Build on our work

Our tools for susceptibilities, local learning coefficients, and SGMCMC sampling are open source in the devinterp library.
