MIT 6.7960 Deep Learning, Fall 2024
Video 7 of 10
Lec 07. Scaling Rules for Optimization
1:20:56
This video explores neural computation from a spectral perspective, discusses feature learning and hyperparameter transfer, and presents scaling rules for transferring hyperparameters across network width and depth.