Customization Guide
Adapt the curriculum to your program’s format and emphasis
The 16-week syllabi are designed as complete, ready-to-use courses. But not every program has 16 weeks, and not every audience has the same emphasis. This guide shows how to adapt.
10-Week Quarter Version (Foundations)
For quarter systems, compress the 16-week Foundations syllabus:
| Quarter Week | Content (from 16-week) | What Changes |
|---|---|---|
| 1 | Weeks 1–2 | Introduction + ML Systems combined; Module 01 only |
| 2 | Week 3 | ML Workflow; Module 02 starts |
| 3 | Weeks 4–5 | Data Engineering + Neural Computation; Module 02 + 03 |
| 4 | Week 6 | NN Architectures; Module 04 |
| 5 | Weeks 7–8 | Frameworks + Training; Module 05 + 06 |
| 6 | Week 9 | Data Selection; Module 07 |
| 7 | Week 10 | Model Compression + Lab 09 (Quantization); Module 08 |
| 8 | Week 11 | HW Acceleration + Lab 10 (Roofline) |
| 9 | Weeks 13–14 | Serving + Operations; Labs 12–13 |
| 10 | Week 16 | Capstone (AI Olympics, reduced scope) |
What gets dropped: Benchmarking (Week 12), Responsible Engineering (Week 15) — assign as optional reading. Integrate key responsibility points into the capstone rubric.
What gets compressed: TinyTorch Modules 01+02 doubled up in Weeks 1+3. Labs 00, 01, and 03 become optional.
The 10-week version sacrifices breathing room. Consider reducing Decision Logs to 100 words and assigning only 2 Design Challenges instead of 4.
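If you are deciding whether Lab 09 (Quantization) earns its slot in the compressed schedule, note that its core idea fits in a few lines. Here is a minimal symmetric int8 sketch; it is illustrative only, not the lab's actual code:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: scale so max|x| maps to 127."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale

x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)
# Per-element reconstruction error is bounded by scale/2.
print(np.max(np.abs(x - x_hat)))
```

Students who see this in five minutes still need the lab to discover the hard parts (outliers, per-channel scales, accuracy cliffs), which is an argument for keeping it even in the 10-week version.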
3-Day Workshop Version
For short workshops with experienced practitioners:
| Day | Focus | Materials |
|---|---|---|
| Day 1 | The Physics of Inference | Iron Law introduction + Labs 01, 05, 09 (Magnitude Gap, Architecture Tradeoffs, Quantization) |
| Day 2 | The Optimization Frontier | Labs 10, 11 (Roofline, Benchmarking) + TinyTorch Module 08 speed-run |
| Day 3 | Production Deployment | Labs 12, 13 (Tail Latency, Drift Detection) + mini Design Challenge |
Focus exclusively on the Iron Law and Interactive Labs. Skip TinyTorch (except as demo). No formal assessment — use labs for hands-on discovery.
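For Day 1 framing, the classic architecture formulation of the Iron Law (execution time = instruction count × cycles per instruction × clock period) can be demonstrated live. A minimal sketch with illustrative numbers (not taken from the course materials):

```python
# Iron Law of performance: time = instructions * CPI / clock frequency.
# All numbers below are illustrative, not from the labs.

def execution_time(instructions: float, cpi: float, clock_hz: float) -> float:
    """Seconds to execute a workload under the Iron Law."""
    return instructions * cpi / clock_hz

# Same 1-billion-instruction workload on two hypothetical machines:
baseline = execution_time(1e9, cpi=2.0, clock_hz=2e9)  # 1.0 s
wider    = execution_time(1e9, cpi=0.5, clock_hz=1e9)  # 0.5 s
print(baseline, wider)
```

The second machine wins despite a slower clock, which is the kind of counterintuitive tradeoff the Day 1 labs are built around.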
For Software Engineers
If your audience is experienced developers, lean into TinyTorch:
| Weeks | Focus | Modules |
|---|---|---|
| 1–4 | Building the Autograd Engine | Modules 01–06 (Tensor → Autograd) |
| 5–8 | From CNNs to Transformers | Modules 09–13 (Conv → Transformer) |
| 9–12 | Production Optimization | Modules 14–19 (Profiling → Benchmarking) |
| 13–16 | Capstone: Torch Olympics | Module 20 + competition |
Use textbook chapters as background reading, not lecture material. Labs serve as validation checkpoints, not primary pedagogy.
For Computer Architects
Shift the focus toward Hardware Acceleration and mlsysim:
- Use the hardware zoo in mlsysim to compare architectures (H100, B200, edge devices)
- Spend 2 weeks on the Roofline model; have students plot multiple workloads
- Extend model compression to 2 weeks (quantization + pruning as hardware-aware optimizations)
- Use hardware kits extensively — make them mandatory, not optional
- Reduce TinyTorch to Modules 01–03 (enough to understand what frameworks do)
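The Roofline exercises above bound attainable throughput by min(peak compute, arithmetic intensity × memory bandwidth). A minimal sketch, using rough public H100-class figures rather than mlsysim's hardware zoo:

```python
def roofline(peak_flops: float, peak_bw: float, intensity: float) -> float:
    """Attainable FLOP/s = min(peak compute, arithmetic intensity * bandwidth).
    intensity is in FLOPs per byte moved from memory."""
    return min(peak_flops, intensity * peak_bw)

# Rough public figures for an H100 SXM (illustrative, not authoritative):
PEAK_FLOPS = 989e12   # ~dense BF16 tensor-core FLOP/s
PEAK_BW    = 3.35e12  # ~HBM3 bytes/s

# A workload at 4 FLOPs/byte sits under the bandwidth roof;
# one at 1000 FLOPs/byte hits the compute roof.
print(roofline(PEAK_FLOPS, PEAK_BW, 4))
print(roofline(PEAK_FLOPS, PEAK_BW, 1000))
```

Having students reproduce this for several devices and workloads, then overlay the points on one plot, is exactly the two-week Roofline exercise suggested above.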
Graduate Seminar Version
For a graduate-level seminar (assumes strong systems background):
| Week | Topic | Textbook | Paper |
|---|---|---|---|
| 1 | The Iron Law | Vol I: Intro + ML Systems | Hennessy & Patterson, “A New Golden Age” (2019) |
| 2 | Memory Hierarchy | Vol I: HW Acceleration | Williams et al., “Roofline” (2009) |
| 3 | Quantization | Vol I: Model Compression | Dettmers et al., “LLM.int8()” (2022) |
| 4 | Serving Systems | Vol I: Model Serving | Yu et al., “Orca” (2022) |
| 5 | Distributed Training | Vol II: Distributed Training | Shoeybi et al., “Megatron-LM” (2020) |
| 6 | 3D Parallelism | Vol II: Distributed Training | Narayanan et al., “Efficient Large-Scale Training” (2021) |
| 7 | Collective Comms | Vol II: Collective Comm. | Patarasuk & Yuan, “Bandwidth Optimal All-Reduce” (2009) |
| 8 | Fault Tolerance | Vol II: Fault Tolerance | Jeon et al., “Large-Scale GPU Clusters” (2019) |
| 9 | Inference at Scale | Vol II: Inference | Kwon et al., “vLLM/PagedAttention” (2023) |
| 10 | KV-Cache Optimization | Vol II: Inference | Ainslie et al., “GQA” (2023) |
| 11 | Edge Intelligence | Vol II: Edge Intelligence | Lin et al., “MCUNet” (2020) |
| 12 | Fleet Operations | Vol II: Ops at Scale | Zhao et al., “ATC’24 Fleet Analysis” (2024) |
| 13 | Sustainability | Vol II: Sustainable AI | Patterson et al., “Carbon Emissions and AI” (2021) |
| 14 | Student presentations | — | — |
Assessment: 40% paper presentations, 30% lab Decision Logs (selected labs only), 30% semester project (original system design or benchmarking study).
Mixing and Matching Components
Each component is independently adoptable:
| Pattern | Components Used | Typical Context |
|---|---|---|
| Textbook Only | Vol I or II as required reading | Supplement for existing ML course |
| Textbook + Labs | Readings + interactive labs | Active learning without coding assignments |
| TinyTorch Only | 20 modules as programming assignments | Systems programming course |
| Labs Only | Interactive labs as in-class activities | Active learning supplement for any course |
| Hardware Kits Only | Edge deployment labs | Embedded systems course |
| Full Stack | All components integrated | Dedicated ML Systems course |
If adopting for the first time, start with Textbook + Labs for one semester. Add TinyTorch the second time you teach the course, and hardware kits the third. Each component is valuable on its own.