Review of Complexity Measures

A comprehensive review and comparison of different notions of effective dimensionality in machine learning models.

Type: Applied
Difficulty: Hard
Status: Unstarted

Learning theorists have studied many different notions of effective dimensionality. Of these, the learning coefficient is the most theoretically well-founded. However, it is not clear how the learning coefficient relates to other notions of effective dimensionality, such as the Hessian rank or the dimensionality of the tangent space.
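To make the gap between two of these notions concrete, here is a minimal sketch comparing the Hessian rank at a minimum with the learning coefficient for three two-parameter toy losses. Everything in it is an illustrative assumption rather than part of the project: the toy losses, the helper `numerical_hessian`, and the rank tolerance are hypothetical choices, and the learning coefficients are hard-coded textbook values for these potentials (1 for w1^2 + w2^2, 3/4 for w1^2 + w2^4, 1/2 for w1^2 w2^2), not estimated from data.

```python
import numpy as np

def numerical_hessian(f, w, eps=1e-4):
    """Central finite-difference Hessian of f at the point w."""
    w = np.asarray(w, dtype=float)
    d = w.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = eps
            ej = np.zeros(d); ej[j] = eps
            H[i, j] = (f(w + ei + ej) - f(w + ei - ej)
                       - f(w - ei + ej) + f(w - ei - ej)) / (4 * eps**2)
    return H

# Hypothetical toy losses with a minimum at the origin, paired with their
# analytically known learning coefficients (assumed here, not computed).
toy_losses = {
    "regular": (lambda w: w[0]**2 + w[1]**2, 1.0),
    "mixed":   (lambda w: w[0]**2 + w[1]**4, 0.75),
    "product": (lambda w: w[0]**2 * w[1]**2, 0.5),
}

d = 2  # number of parameters
ranks = {}
for name, (loss, lam) in toy_losses.items():
    H = numerical_hessian(loss, np.zeros(d))
    # The tolerance is a judgment call: curvature below 1e-6 counts as flat.
    ranks[name] = int(np.linalg.matrix_rank(H, tol=1e-6))
    print(f"{name}: Hessian rank = {ranks[name]}, "
          f"rank/2 = {ranks[name] / 2}, learning coefficient = {lam}, d/2 = {d / 2}")
```

In the regular case, half the Hessian rank and the learning coefficient coincide. In the two degenerate cases the Hessian rank drops while the learning coefficient still registers the higher-order structure of the loss, so the two measures disagree. In these toy cases, half the Hessian rank sits at or below the learning coefficient, which in turn sits at or below d/2. Whether and how such relationships carry over to realistic models is one of the questions this review is meant to examine.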

This project aims to provide a comprehensive review of various notions of effective dimensionality in machine learning models. Key questions to address include:

  1. What are the main notions of effective dimensionality that have been studied in the literature?
  2. How do these different measures relate to one another theoretically?
  3. How do they compare empirically when applied to real-world models?
  4. What are the strengths and limitations of each measure?
  5. How does the learning coefficient from Singular Learning Theory compare to these other measures?

The review should cover both theoretical aspects and empirical comparisons. Potential measures to consider include:

  • Learning coefficient (from SLT)
  • Hessian rank
  • Tangent space dimensionality
  • VC dimension
  • Rademacher complexity
  • Intrinsic dimension
  • Effective degrees of freedom

This review would provide valuable context for the developmental interpretability agenda and help situate the learning coefficient within the broader landscape of model complexity measures.

Where to begin:

If you have decided to start working on this, please let us know in the Discord. We'll update this listing so that others interested in the project can find you.