Project Ideas

This page outlines various project ideas to inspire you if you're interested in getting involved in DevInterp. A good place to start is with the open problems lecture from the 2023 DevInterp conference.

Starter Notebooks

Before diving into a new project, we recommend building familiarity by going through some of the starter notebooks in the devinterp repo. These notebooks can also serve as a starting point for further investigation.

Active Projects

We encourage replication but discourage scooping each other: there are enough interesting problems to solve that we shouldn't be unnecessarily duplicating effort, which slows progress in AI safety and is bad for the community. That said, if you're particularly interested in one of the active projects, please reach out to see if there's an opportunity to get involved and collaborate.

DevInterp-Flavored Projects

If you want to get involved in DevInterp, especially on the empirical side, our number one recommendation is to just go out and study the development of models that haven't been studied yet.

The easiest place to start is transformers trained on algorithmic tasks. Just choose one (or come up with your own), and start applying tools from devinterp, such as local learning coefficient estimation, essential dynamics (coming soon), and Oku-Aihara covariance analysis (ditto), as well as more "traditional" tools from mechinterp (e.g., "progress measures").
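
To give a feel for what local learning coefficient (LLC) estimation involves, here is a minimal sketch in plain NumPy. It does not use the devinterp library's API: the function name `estimate_llc` and all of its parameters are hypothetical, chosen for illustration. It assumes the commonly used estimator n * beta * (E[L(w)] - L(w*)) with inverse temperature beta = 1 / log(n), where the expectation is taken over samples from Langevin dynamics localized around w* by a strength gamma. The toy loss is a quadratic, for which the learning coefficient should come out close to d/2, so it works as a sanity check rather than as a realistic experiment.

```python
import numpy as np


def estimate_llc(loss_fn, grad_fn, w_star, n, num_steps=20_000, burn_in=2_000,
                 step_size=1e-4, gamma=1.0, seed=0):
    """Hypothetical helper: estimate the LLC at w_star with localized Langevin sampling.

    Assumes full-batch gradients, so the "SGLD" here has no minibatch noise.
    """
    rng = np.random.default_rng(seed)
    w_star = np.asarray(w_star, dtype=float)
    beta = 1.0 / np.log(n)  # assumed inverse temperature, 1 / log(n)
    w = w_star.copy()
    sampled_losses = []
    for step in range(num_steps):
        # Drift combines the tempered loss gradient with a pull back toward w_star.
        drift = n * beta * grad_fn(w) + gamma * (w - w_star)
        noise = rng.standard_normal(w.shape)
        w = w - 0.5 * step_size * drift + np.sqrt(step_size) * noise
        if step >= burn_in:
            sampled_losses.append(loss_fn(w))
    # Estimator: n * beta * (average sampled loss - loss at w_star).
    return n * beta * (np.mean(sampled_losses) - loss_fn(w_star))


# Toy check: a regular (quadratic) loss in d dimensions, minimized at the origin.
d = 4
w_star = np.zeros(d)
llc = estimate_llc(
    loss_fn=lambda w: 0.5 * float(np.dot(w, w)),
    grad_fn=lambda w: w,
    w_star=w_star,
    n=10_000,  # nominal dataset size; this toy loss has no actual data
)
print(f"estimated LLC: {llc:.2f} (regular-model value d/2 = {d / 2})")
```

In a real project you would replace the toy loss with the training loss of your model at a checkpoint, use minibatch gradients, and check that the estimate is stable across step sizes and chain lengths. The starter notebooks are the place to see how the devinterp repo itself does this.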

SLT-Flavored Projects

If you're interested in something slightly more theoretical, there are many interesting questions in the context of SLT.

Engineering Projects

Are you more of a research engineer than a research scientist? Consider filing a PR to add features or fix bugs in the devinterp repo. There's plenty to do.

Theoretical Projects

We discourage you from working on more theoretical projects unless you really, really know what you're doing. Reach out to us.

Completed Projects

Just because a project is marked as "completed" here doesn't mean that this direction is closed off. It's often very helpful to begin with replications because it gives you a clear reference to compare results against. You're also sure to run into follow-up questions that the original authors didn't address, so you can always go deeper.