Volume II: At Scale
Machine Learning Systems at Scale
18 decks covering distributed infrastructure, training, deployment, and governance across GPU fleets. 529 slides, 125 SVG figures, approximately 19 hours of teaching material.
| Ch | Title | Slides | SVGs | ~Time | Active Learning | PPTX | Source | |
|---|---|---|---|---|---|---|---|---|
| 0 | ML Systems at Scale | 24 | 5 | 49 min | 9 | PPTX | Source | |
| 1 | Introduction | 34 | 10 | 76 min | 9 | PPTX | Source | |
| 2 | Compute Infrastructure | 33 | 7 | 72 min | 10 | PPTX | Source | |
| 3 | Network Fabrics | 32 | 9 | 68 min | 8 | PPTX | Source | |
| 4 | Data Storage | 33 | 7 | 69 min | 9 | PPTX | Source | |
| 5 | Distributed Training Systems | 32 | 7 | 70 min | 9 | PPTX | Source | |
| 6 | Collective Communication | 29 | 7 | 64 min | 8 | PPTX | Source | |
| 7 | Fault Tolerance and Reliability | 32 | 9 | 68 min | 8 | PPTX | Source | |
| 8 | Fleet Orchestration | 33 | 8 | 74 min | 9 | PPTX | Source | |
| 9 | Inference at Scale | 32 | 8 | 71 min | 10 | PPTX | Source | |
| 10 | Performance Engineering | 32 | 8 | 72 min | 8 | PPTX | Source | |
| 11 | Edge Intelligence | 31 | 6 | 66 min | 10 | PPTX | Source | |
| 12 | ML Operations at Scale | 31 | 8 | 66 min | 9 | PPTX | Source | |
| 13 | Security and Privacy | 33 | 10 | 71 min | 10 | PPTX | Source | |
| 14 | Robust AI | 33 | 8 | 71 min | 8 | PPTX | Source | |
| 15 | Sustainable AI | 31 | 7 | 65 min | 9 | PPTX | Source | |
| 16 | Responsible AI | 31 | 7 | 66 min | 11 | PPTX | Source | |
| 17 | Conclusion | 23 | 7 | 47 min | 10 | PPTX | Source | |
| Total | | 529 | 125 | ~19 hrs | 163 | | | |
Tip
PPTX files are image-based (300 DPI) and visually identical to the PDF. Use them for PowerPoint presenter mode and slide annotations. For editable slides, download the LaTeX source.