MIT 6.7960 Deep Learning, Fall 2024

Lec 07. Scaling Rules for Optimization

Duration: 1:20:56

This video explores neural computation from a spectral perspective, discusses feature learning and hyperparameter transfer, and presents scaling rules for transferring hyperparameters across network width and depth.