For Instructors

Reproducible, hardware-independent exercises — paired with 35 lecture decks and 266 diagrams.

MLSYSIM provides a framework for assigning analytically grounded problem sets where every answer is deterministic and reproducible — regardless of what hardware your students have access to. Combined with the companion lecture slides, it forms a complete teaching toolkit for ML systems courses.


Why MLSYSIM for Teaching?

| Challenge | How MLSYSIM Helps |
|---|---|
| Students lack GPU access | All analysis runs on a laptop — no cloud credits needed |
| Homework answers vary by hardware | Vetted registry specs produce identical results everywhere |
| Hard to grade open-ended systems questions | Analytical solvers give deterministic, verifiable outputs |
| Specifications become stale | Registry updated from official datasheets; one update propagates everywhere |
| Students memorize without understanding | “Predict first” exercises build genuine intuition |
| No time to build slides from scratch | 35 Beamer decks with speaker notes, active learning, and SVG diagrams, ready to use |

The Teaching Ecosystem

MLSYSIM is one component of a larger open teaching toolkit:

| Resource | What It Provides | Link |
|---|---|---|
| Textbook | Two-volume open textbook — foundations (Vol I) and scale (Vol II) | mlsysbook.ai |
| Lecture Slides | 35 Beamer decks, 1,099 slides, 266 SVG diagrams, speaker notes on every slide | Slides Portal |
| MLSYSIM | 6 analytical solvers, typed hardware registry, deterministic assignments | Getting Started |
| TinyML Courseware | 4-course sequence with 178 slide decks for embedded ML | TinyML Slides |
| Teaching Guide | 16-week semester plans, active learning taxonomy, customization guide | Teaching Guide |

Course Integration Patterns

Pattern 1 — Textbook Companion (Full Semester)

Map MLSYSIM tutorials and assignments directly to textbook chapters and lecture decks. The table below shows one possible 16-week arrangement using Volume I slides.

| Week | Lecture Slides | Textbook Topic | MLSYSIM Assignment |
|---|---|---|---|
| 2 | Introduction | The Iron Law of ML Systems | Read Hello World warmup — identify bottleneck equation |
| 5 | NN Computation | FLOPs, memory footprint | Hello World — roofline analysis, batch size sweep |
| 8 | Model Training | Training memory budget | Solver Guide — TrainingStateSolver, ZeRO stages |
| 11 | HW Acceleration | Roofline model, accelerator comparison | Hardware comparison assignment (see below) |
| 13 | Model Serving | TTFT, ITL, KV-cache | LLM Serving — serving latency analysis |

For a Volume II course on distributed systems:

| Week | Lecture Slides | Textbook Topic | MLSYSIM Assignment |
|---|---|---|---|
| 3 | Compute Infrastructure | GPU clusters, interconnects | TCO analysis with EconomicsModel |
| 5 | Distributed Training | 3D parallelism, scaling | Distributed Training — parallelism strategies |
| 7 | Fault Tolerance | Checkpointing, MTBF | ReliabilityModel — Young-Daly checkpoint interval |
| 10 | Performance Engineering | Profiling, optimization | Multi-solver composition (see capstone ideas below) |
| 15 | Sustainable AI | Energy, carbon, water | Sustainability Lab — carbon footprint |
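The Week 3 TCO assignment reduces to simple arithmetic: straight-line amortized capex plus energy opex. A minimal sketch of that calculation — every price, power figure, and amortization period below is an illustrative assumption, not an mlsysim registry value or the EconomicsModel API:

```python
def cluster_tco(num_gpus, gpu_price_usd, amortization_months,
                gpu_power_kw, pue, electricity_usd_per_kwh, months):
    """Capex (straight-line amortized) plus energy opex over `months`."""
    capex = num_gpus * gpu_price_usd * (months / amortization_months)
    hours = months * 30 * 24  # approximate month as 30 days
    energy_kwh = num_gpus * gpu_power_kw * pue * hours
    opex = energy_kwh * electricity_usd_per_kwh
    return capex + opex

# Hypothetical 256-GPU cluster over 6 months
tco = cluster_tco(num_gpus=256, gpu_price_usd=30_000, amortization_months=36,
                  gpu_power_kw=0.7, pue=1.2, electricity_usd_per_kwh=0.08,
                  months=6)
print(f"6-month TCO: ${tco / 1e6:.2f}M")
```

A useful discussion point: under these assumptions, amortized capex dominates the energy bill by a wide margin, which students rarely predict.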
Tip: Semester Plans

The Teaching Guide provides complete 16-week schedules for Volume I, Volume II, and a combined 32-week sequence — with timing estimates for every deck.

Pattern 2 — Standalone Labs

Use individual tutorials as self-contained lab assignments in any systems course. Each tutorial includes exercises with clear expected outputs:

| Tutorial | Duration | Key Concepts | Pairs With Slides |
|---|---|---|---|
| Hello World | 15 min | Roofline model, memory- vs. compute-bound | HW Acceleration |
| Sustainability Lab | 20 min | Energy, carbon footprint, regional grids | Sustainable AI |
| LLM Serving | 25 min | TTFT vs. ITL, KV-cache pressure | Model Serving |
| Distributed Training | 30 min | Data/tensor/pipeline parallelism | Distributed Training |

Pattern 3 — Capstone Projects

Advanced students compose multiple solvers to answer research-style questions. See Extending MLSYSIM for the custom solver API.


Assignment Ideas

Homework: Hardware Comparison (30 min)

Using Engine.solve(), compare ResNet-50 inference latency on the A100, H100, and Jetson AGX at batch sizes 1, 32, and 256. For each configuration, state whether the workload is memory-bound or compute-bound and explain why the bottleneck shifts with batch size.

Pairs with: HW Acceleration slides (roofline model, ridge point) and Benchmarking slides (measurement methodology).
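The bottleneck shift in this homework follows from a two-roof latency model: compute time grows linearly with batch size, while memory time starts from a fixed weight-traffic floor. The peak figures below are H100 SXM datasheet values, but the ResNet-50 per-sample FLOPs and the bytes actually moved per sample are rough assumptions (real traffic depends on kernel fusion and caching) — a sketch of the reasoning, not an mlsysim result:

```python
def bound(batch, flops_per_sample=8.2e9, weight_bytes=51.2e6,
          act_bytes_per_sample=5e6, peak_flops=1979e12, peak_bw=3.35e12):
    """Which roof limits latency at this batch size?

    At batch 1 the full weight read dominates (memory-bound); as batch
    grows, compute scales linearly while weights amortize, so the
    workload crosses into the compute-bound regime.
    """
    t_compute = batch * flops_per_sample / peak_flops
    t_memory = (weight_bytes + batch * act_bytes_per_sample) / peak_bw
    return "compute-bound" if t_compute >= t_memory else "memory-bound"

for b in (1, 32, 256):
    print(b, bound(b))
```

Under these assumptions batch 1 is memory-bound and batches 32 and 256 are compute-bound; the crossover batch size itself is a good follow-up question.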

Homework: Training Memory Budget (30 min)

Using the TrainingStateSolver, calculate the memory required to train GPT-2 (1.5B parameters) in FP16 with Adam optimizer under ZeRO Stage 0, Stage 1, and Stage 3. Explain why each stage reduces memory and what trade-off it introduces.

Pairs with: Model Training slides and Distributed Training slides.
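The memory accounting behind this assignment fits in a few lines. The 2 + 2 + 12 bytes-per-parameter split is the standard mixed-precision Adam layout (FP16 params, FP16 grads, FP32 master copy plus Adam m and v); the 8-GPU count is an illustrative assumption. This sketch mirrors the arithmetic, not the TrainingStateSolver API:

```python
def zero_memory_gb(params, num_gpus, stage):
    """Per-GPU training-state memory (GB) under ZeRO stages 0-3."""
    p = 2 * params   # FP16 parameters, bytes
    g = 2 * params   # FP16 gradients, bytes
    o = 12 * params  # FP32 master params + Adam m, v, bytes
    if stage >= 1:
        o /= num_gpus  # Stage 1: shard optimizer states
    if stage >= 2:
        g /= num_gpus  # Stage 2: also shard gradients
    if stage >= 3:
        p /= num_gpus  # Stage 3: also shard parameters
    return (p + g + o) / 1e9

for stage in (0, 1, 3):
    gb = zero_memory_gb(1.5e9, num_gpus=8, stage=stage)
    print(f"ZeRO Stage {stage}: {gb:.2f} GB per GPU")
```

For GPT-2 on 8 GPUs this gives 24 GB at Stage 0, 8.25 GB at Stage 1, and 3 GB at Stage 3 — each stage trading memory for extra communication.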

Lab: Carbon-Aware Training (45 min)

Using the SustainabilityModel, calculate the carbon footprint of training GPT-3 on a 256-GPU H100 cluster in Quebec vs. US Average vs. Poland. Produce a table and a 2-paragraph analysis of why datacenter location matters more than hardware choice for carbon.

Pairs with: Sustainable AI slides (grid carbon intensity, PUE).
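The carbon arithmetic in this lab is energy (GPU power × count × time × PUE) multiplied by grid carbon intensity. The power draw, PUE, training duration, and intensity figures below are illustrative assumptions for the sketch, not mlsysim registry values or the SustainabilityModel API:

```python
# Assumed grid carbon intensities, gCO2 per kWh (roughly: hydro-heavy
# Quebec, mixed US grid, coal-heavy Poland)
GRID_GCO2_PER_KWH = {"Quebec": 30, "US Average": 380, "Poland": 650}

def training_tco2(num_gpus, gpu_power_kw, hours, pue, grid):
    """Tonnes of CO2 for a training run on the given grid."""
    energy_kwh = num_gpus * gpu_power_kw * hours * pue
    return energy_kwh * GRID_GCO2_PER_KWH[grid] / 1e6  # g -> tonnes

# Hypothetical 256-GPU run for 30 days
for grid in GRID_GCO2_PER_KWH:
    t = training_tco2(num_gpus=256, gpu_power_kw=0.7,
                      hours=30 * 24, pue=1.2, grid=grid)
    print(f"{grid}: {t:.1f} tCO2")
```

Even with these rough numbers the spread between grids exceeds 20×, which is the point students should land on: no hardware upgrade moves the needle that far.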

Lab: LLM Serving Capacity Planning (45 min)

Using the ServingModel, determine the maximum sequence length at which Llama-3.1-70B can serve a single request on an 8-GPU H100 node without exceeding memory. Then calculate TTFT and ITL at sequence lengths of 1K, 4K, and 16K tokens. At what point does KV-cache pressure dominate?

Pairs with: Model Serving slides and Inference at Scale slides.
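The KV-cache part of this lab is a closed-form calculation. The shape below matches the published Llama-3.1-70B architecture (80 layers, 8 KV heads via grouped-query attention, head dimension 128, FP16); treat it as a sketch of the arithmetic, not the ServingModel API:

```python
def kv_cache_gb(seq_len, layers=80, kv_heads=8, head_dim=128, bytes_per=2):
    """KV-cache size in GB for one request at a given sequence length.

    The leading 2 counts both keys and values.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per / 1e9

for seq in (1_000, 4_000, 16_000):
    print(f"{seq:>6} tokens: {kv_cache_gb(seq):.2f} GB")
```

The per-token cost is fixed (here ~0.33 MB/token), so cache size grows linearly with sequence length — useful for checking students' maximum-sequence-length answers against the memory left after weights.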

Exam Question: Back-of-Envelope

A GPU has 1,979 TFLOP/s peak compute (FP16) and 3.35 TB/s memory bandwidth. (a) What is the ridge point in FLOP/Byte? (b) A model layer has arithmetic intensity of 50 FLOP/Byte — is it compute-bound or memory-bound? (c) Another layer has arithmetic intensity of 400 FLOP/Byte — which regime is it in, and what does that imply about the benefit of moving to a GPU with 2x the bandwidth? Show your work.

Pairs with: HW Acceleration slides (roofline model, ridge point derivation).
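For an answer-key reference, the exam arithmetic takes a few lines (the specs quoted in the question match the H100 SXM datasheet):

```python
peak_flops = 1979e12  # FP16 peak, FLOP/s
peak_bw = 3.35e12     # memory bandwidth, bytes/s

# (a) ridge point = peak compute / peak bandwidth, ~590.7 FLOP/Byte
ridge = peak_flops / peak_bw
print(f"ridge point: {ridge:.1f} FLOP/Byte")

# (b), (c) compare arithmetic intensity against the ridge point
for ai in (50, 400):
    regime = "compute-bound" if ai >= ridge else "memory-bound"
    print(f"AI={ai}: {regime}")

# (c) doubling bandwidth halves the ridge to ~295.4 FLOP/Byte, so the
# 400 FLOP/Byte layer benefits from the extra bandwidth only until it
# crosses into the compute-bound regime
ridge_2x = peak_flops / (2 * peak_bw)
print(f"ridge at 2x bandwidth: {ridge_2x:.1f} FLOP/Byte")
```

Note the deliberate trap in (c): 400 FLOP/Byte is still below the ridge, so both layers are memory-bound on this GPU.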

Capstone: Multi-Solver Design Study (1 week)

Design a training cluster for a 70B-parameter model. Use the DistributedModel to select a parallelism strategy, the EconomicsModel for TCO over 6 months, the SustainabilityModel to compare three datacenter locations, and the ReliabilityModel to determine checkpoint frequency. Present your analysis as a 3-page technical memo with quantitative justification for each decision.

Pairs with: the full Volume II slide set — infrastructure, training, fault tolerance, and sustainability.
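The ReliabilityModel piece of the capstone rests on the Young-Daly approximation for the optimal checkpoint interval, τ = √(2 · δ · MTBF), where δ is the checkpoint write time. A minimal sketch — the checkpoint cost and cluster MTBF below are illustrative assumptions, not mlsysim outputs:

```python
import math

def young_daly_interval_s(checkpoint_s, mtbf_s):
    """Optimal checkpoint interval (seconds) per Young-Daly."""
    return math.sqrt(2 * checkpoint_s * mtbf_s)

# e.g. 5-minute checkpoint writes on a cluster with a 12-hour MTBF
tau = young_daly_interval_s(checkpoint_s=300, mtbf_s=12 * 3600)
print(f"checkpoint every {tau / 60:.0f} minutes")
```

Students should notice the square root: halving MTBF does not halve the interval, so modest reliability losses cost less checkpoint overhead than intuition suggests.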


Grading Notes

Because MLSYSIM produces deterministic output from vetted specifications:

  • Answer keys are stable — the same mlsysim version produces identical numbers for every student, every semester
  • Partial credit is straightforward — grade the reasoning (which solver, which inputs, which bottleneck explanation), not just the number
  • “Predict first” questions are easy to assess — students submit their prediction before running code; compare the two for a conceptual understanding score
Note: Version Pinning

Pin the version in your assignment instructions (pip install mlsysim==0.1.0) so answer keys remain valid even after new releases update specifications.


Reproducibility Guarantee

All specifications in the MLSys Zoo are:

  • Sourced from official manufacturer datasheets and published benchmarks
  • Typed with pint.Quantity for dimensional correctness — unit errors are caught at runtime
  • Frozen per release — mlsysim==0.1.0 always produces the same answers

This means your answer key works for every student, every semester.


Jupyter & Quarto Compatibility

All tutorials run in:

  • Jupyter Notebooks — standard .ipynb workflow
  • Quarto documents — render to HTML, PDF, or slides with quarto render
  • Google Colab — pip install mlsysim in the first cell, then go

No GPU runtime required. CPU-only environments work perfectly because MLSYSIM computes from equations, not empirical profiling.


Getting Started

  1. Point students to the Getting Started guide for installation
  2. Assign the Hello World tutorial as a warmup
  3. Browse the Solver Guide to select solvers for your course topics
  4. Pair each assignment with the relevant lecture slides for classroom context
  5. Use the MLSys Zoo for available hardware, model, and infrastructure specifications
