Physics of AI
We propose an approach to the science of deep learning that roughly follows what physicists do to understand reality: (1) explore phenomena through controlled experiments, and (2) build theories based on toy mathematical models and not fully rigorous mathematical reasoning. I illustrate (1) with the LEGO study (LEGO stands for Learning Equality and Group Operations), where we observe how transformers learn to solve simple linear systems of equations. I will also briefly illustrate (2) with an analysis of the emergence of threshold units when training a two-layer neural network to solve a simple sparse coding problem. The latter analysis connects to the recently discovered Edge of Stability phenomenon.
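To make the idea of a controlled experiment on simple equation systems concrete, here is a minimal sketch of a LEGO-style synthetic task. It assumes a clause format that the abstract does not specify: each variable equals plus or minus the previous variable in a chain, the first variable equals plus or minus 1, and the clauses are presented in shuffled order. The names `make_chain`, `solve`, and `render` are hypothetical helpers for illustration, not code from the study.

```python
import random


def make_chain(n_vars, seed=0):
    """Build a chain of clauses over the sign group {+1, -1}.

    A clause (var, sign, parent) means var = sign * parent; parent None
    stands for the constant 1. This clause format is an assumption made
    for illustration, not the exact setup used in the talk.
    """
    rng = random.Random(seed)
    clauses = []
    parent = None
    for i in range(n_vars):
        name = f"x{i}"
        clauses.append((name, rng.choice([1, -1]), parent))
        parent = name
    rng.shuffle(clauses)  # present the clauses out of order
    return clauses


def solve(clauses):
    """Resolve every variable by repeatedly substituting known values."""
    values = {}
    pending = list(clauses)
    while pending:
        remaining = []
        for var, sign, parent in pending:
            if parent is None:
                values[var] = sign
            elif parent in values:
                values[var] = sign * values[parent]
            else:
                remaining.append((var, sign, parent))
        if len(remaining) == len(pending):
            raise ValueError("clauses do not form a resolvable chain")
        pending = remaining
    return values


def render(clauses):
    """Format the clauses as a single text prompt, e.g. 'x1 = -x0; x0 = +1'."""
    parts = []
    for var, sign, parent in clauses:
        symbol = "+" if sign == 1 else "-"
        parts.append(f"{var} = {symbol}{parent if parent is not None else 1}")
    return "; ".join(parts)


if __name__ == "__main__":
    chain = make_chain(5, seed=0)
    print(render(chain))
    print(solve(chain))
```

In an experiment of this kind, the rendered string would serve as the model input and the resolved values as the targets; the chain length controls how many substitution steps a variable needs before its value is known.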
- Date:
- Speaker: Sébastien Bubeck
- Affiliation: Microsoft Research
Sébastien Bubeck
Vice President, Microsoft GenAI