Timaeus Update October 2023
This is the first of a series of updates we plan to write monthly to keep you informed about our progress on Timaeus. We’ll share news on the Developmental Interpretability & Singular Learning Theory agenda as well as logistics and other information.
Timaeus Launch 🎉
- The main highlight is that we’ll be announcing Timaeus publicly this week.
Research
So far, our colleagues and collaborators have put out two publications:
- Quantifying Degeneracy in Singular Models via the Learning Coefficient. This paper provides a technique for scalably estimating the learning coefficient, a theoretically grounded measure of model complexity with potential applications for interpretability and mechanistic anomaly detection. We’ve put out an accompanying distillation on LessWrong.
- Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition. We’re particularly thrilled about this second paper. It confirms that we’re not totally crazy: the phase transitions predicted by singular learning theory match empirically observed transitions with remarkable accuracy. (Distillation coming soon.)
Engineering
We’ve launched a Devinterp library to make it easy to apply the techniques explored in these papers to novel settings. We’ve included comprehensive examples.
Logistics
- We’ve secured fiscal sponsorship from Ashgro.
- We’re currently onboarding three research assistants.
Events
- We organized a hackathon on October 7-8, attended by 20-30 people.
- We’re organizing a conference in Oxford in November where we’ll present and discuss the research we’re working on.
- We’re organizing a “demo day” on November 25th where junior researchers will be able to present their research projects for feedback. We’re also building out a list of small project ideas to help people get started.
Looking Ahead
We’re making quick progress on our research agenda.
- Phase transitions are ubiquitous. Every week we’re seeing new examples published, so the hypothesis that phase transitions are commonplace is on its way to being affirmed independently of us.
- This has shifted our priority towards the next phase: finding “hidden” transitions and understanding the structure that emerges through these transitions.
- If you’re interested, join the Discord and I can add you to the research-internal channel, where we write weekly updates on all of the projects we’re working on.
We’ll have more to say about the particular systems and techniques we’re using to investigate these questions soon. Stay tuned.