For Instructors

Reproducible, hardware-independent exercises — paired with 35 lecture decks and 266 diagrams.

MLSys·im provides a framework for assigning analytically grounded problem sets where every answer is deterministic and reproducible — regardless of what hardware your students have access to. Combined with the companion lecture slides, it forms a complete teaching toolkit for ML systems courses.

Why MLSys·im for Teaching?

Challenge	How MLSys·im Helps
Students lack GPU access	All analysis runs on a laptop — no cloud credits needed
Homework answers vary by hardware	Vetted registry specs produce identical results everywhere
Hard to grade open-ended systems questions	Analytical solvers give deterministic, verifiable outputs
Specifications become stale	Registry updated from official datasheets; one update propagates everywhere
Students memorize without understanding	“Predict first” exercises build genuine intuition
No time to build slides from scratch	35 Beamer decks with speaker notes, active learning, and SVG diagrams ready to use

The Teaching Ecosystem

MLSys·im is one component of a larger open teaching toolkit:

Resource	What It Provides	Link
Textbook	Two-volume open textbook — foundations (Vol I) and scale (Vol II)	mlsysbook.ai
Lecture Slides	35 Beamer decks, 1,099 slides, 266 SVG diagrams, speaker notes on every slide	Slides Portal
MLSys·im	Six primary workflows backed by typed resolver classes, a hardware registry, and deterministic assignments	Getting Started
TinyML Courseware	4-course sequence with 178 slide decks for embedded ML	TinyML Slides
Teaching Guide	16-week semester plans, active learning taxonomy, customization guide	Teaching Guide

Course Integration Patterns

Pattern 1 — Textbook Companion (Full Semester)

Map MLSys·im tutorials and assignments directly to textbook chapters and lecture decks. The table below shows selected weeks from one possible 16-week arrangement using Volume I slides.

Week	Lecture Slides	Textbook Topic	MLSys·im Assignment
2	Introduction	The Iron Law of ML Systems	Read Hello, Roofline warmup — identify bottleneck equation
5	NN Computation	FLOPs, memory footprint	Hello, Roofline — roofline analysis, batch size sweep
8	Model Training	Training memory budget	Solver Guide — `TrainingMemoryModel`, ZeRO stages
11	HW Acceleration	Roofline model, accelerator comparison	Hardware comparison assignment (see below)
13	Model Serving	TTFT, ITL, KV-cache	Two Phases of Inference — serving latency analysis

For a Volume II course on distributed systems:

Week	Lecture Slides	Textbook Topic	MLSys·im Assignment
3	Compute Infrastructure	GPU clusters, interconnects	TCO analysis with EconomicsModel
5	Distributed Training	3D parallelism, scaling	Scaling to 1000 GPUs — parallelism strategies
7	Fault Tolerance	Checkpointing, MTBF	ReliabilityModel — Young-Daly checkpoint interval
10	Performance Engineering	Profiling, optimization	Multi-solver composition (see capstone ideas below)
15	Sustainable AI	Energy, carbon, water	Geography is a Systems Variable — carbon footprint

Semester Plans

The Teaching Guide provides complete 16-week schedules for Volume I, Volume II, and a combined 32-week sequence — with timing estimates for every deck.

Pattern 2 — Standalone Labs

Use individual tutorials as self-contained lab assignments in any systems course. Each tutorial includes exercises with clear expected outputs:

Tutorial	Duration	Key Concepts	Pairs With Slides
Hello, Roofline	15 min	Roofline model, memory vs. compute bound	HW Acceleration
Geography is a Systems Variable	20 min	Energy, carbon footprint, regional grids	Sustainable AI
Two Phases of Inference	25 min	TTFT vs. ITL, KV-cache pressure	Model Serving
Scaling to 1000 GPUs	30 min	Data/tensor/pipeline parallelism	Distributed Training

Pattern 3 — Capstone Projects

Advanced students compose multiple solvers to answer research-style questions. See Writing a Custom Solver for the API.

Assignment Ideas

Homework: Hardware Comparison (30 min)

Using Engine.solve(), compare ResNet-50 inference latency on the A100, H100, and Jetson AGX at batch sizes 1, 32, and 256. For each configuration, state whether the workload is memory-bound or compute-bound and explain why the bottleneck shifts with batch size.

Pairs with: HW Acceleration slides (roofline model, ridge point) and Benchmarking slides (measurement methodology).
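The bottleneck shift this homework asks about can be previewed with plain arithmetic. The sketch below is a hand-rolled roofline check on a single FP16 dense layer, not the Engine.solve() API. The layer shape, the helper names, and the reuse of the peak figures from the exam question on this page (1,979 TFLOP/s, 3.35 TB/s) are assumptions chosen for illustration.

```python
# Illustrative roofline sweep over batch size for one FP16 dense layer.
# All names and the layer shape are hypothetical; this is not mlsysim code.

PEAK_FLOPS = 1979e12   # FLOP/s, figure quoted in the exam question on this page
PEAK_BW = 3.35e12      # bytes/s, same source
BYTES_PER_ELEM = 2     # FP16


def dense_layer_intensity(batch: int, d_in: int, d_out: int) -> float:
    """Arithmetic intensity (FLOP/byte) of y = x @ W with x of shape [batch, d_in]."""
    flops = 2 * batch * d_in * d_out                            # one multiply and one add per weight per sample
    elems_moved = d_in * d_out + batch * d_in + batch * d_out   # read W, read x, write y
    return flops / (elems_moved * BYTES_PER_ELEM)


def regime(intensity: float) -> str:
    """Classify a workload by comparing its intensity with the ridge point."""
    ridge = PEAK_FLOPS / PEAK_BW
    return "compute-bound" if intensity >= ridge else "memory-bound"


if __name__ == "__main__":
    for batch in (1, 32, 256, 4096):
        ai = dense_layer_intensity(batch, 4096, 4096)
        print(f"batch={batch:5d}  intensity={ai:8.1f} FLOP/B  {regime(ai)}")
```

The weights are read once per batch, so intensity rises with batch size until activation traffic catches up. For this toy layer and these peak figures, batch sizes 1, 32, and 256 all stay memory-bound, and only a much larger batch crosses the ridge point. Real networks and other accelerators will cross at different points, which is what the assignment asks students to explain.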

Homework: Training Memory Budget (30 min)

Using TrainingMemoryModel, calculate the memory required to train GPT-2 (1.5B parameters) in FP16 with the Adam optimizer under ZeRO Stage 0, Stage 1, and Stage 3. State the number of data-parallel GPUs you assume, because the savings from Stages 1 and 3 depend on it. Explain why each stage reduces memory and what trade-off it introduces.

Pairs with: Model Training slides and Distributed Training slides.
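As a cross-check for an answer key, the per-GPU model-state memory can be estimated by hand. The sketch below follows the commonly cited ZeRO accounting for mixed-precision Adam (2 bytes per parameter for FP16 weights, 2 for FP16 gradients, 12 for FP32 optimizer state). It is not the TrainingMemoryModel implementation, which may count activations or buffers differently. The function name and the choice of 8 data-parallel GPUs are assumptions.

```python
# Hand calculation of model-state memory per GPU under ZeRO.
# Assumes mixed-precision Adam; activations and temporary buffers are excluded.
# The function name and the GPU count used below are illustrative assumptions.


def zero_bytes_per_gpu(params: float, stage: int, n_gpus: int) -> float:
    """Return bytes of model state held by each GPU for a given ZeRO stage."""
    weights = 2 * params   # FP16 parameters
    grads = 2 * params     # FP16 gradients
    optim = 12 * params    # FP32 master weights, Adam momentum, Adam variance
    if stage == 0:         # plain data parallelism, everything replicated
        return weights + grads + optim
    if stage == 1:         # optimizer state partitioned
        return weights + grads + optim / n_gpus
    if stage == 2:         # optimizer state and gradients partitioned
        return weights + (grads + optim) / n_gpus
    if stage == 3:         # optimizer state, gradients, and weights partitioned
        return (weights + grads + optim) / n_gpus
    raise ValueError(f"unknown ZeRO stage: {stage}")


if __name__ == "__main__":
    gpt2_params = 1.5e9
    for stage in (0, 1, 3):
        gb = zero_bytes_per_gpu(gpt2_params, stage, n_gpus=8) / 1e9
        print(f"ZeRO stage {stage}: {gb:.2f} GB per GPU")
```

With 8 GPUs this gives 24 GB, 8.25 GB, and 3 GB of model state per GPU for Stages 0, 1, and 3. The trade-off students should name is the extra communication needed to gather partitioned state.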

Lab: Carbon-Aware Training (45 min)

Using the SustainabilityModel, calculate the carbon footprint of training GPT-3 on a 256-GPU H100 cluster in Quebec vs. US Average vs. Poland. Produce a table and a two-paragraph analysis of why datacenter location can matter more than hardware choice for carbon emissions.

Pairs with: Sustainable AI slides (grid carbon intensity, PUE).
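The relationship behind this lab is a product of energy, PUE, and grid carbon intensity. The sketch below shows that arithmetic in isolation. It does not call SustainabilityModel, and the run length, PUE, and the two grid intensities are placeholder values for illustration rather than registry data for any named region.

```python
# Operational carbon of a training run: IT energy x PUE x grid intensity.
# All inputs in the example are placeholders, not MLSys Zoo registry values.


def training_carbon_kg(
    num_gpus: int,
    gpu_power_w: float,
    hours: float,
    pue: float,
    grid_g_per_kwh: float,
) -> float:
    """Return kg CO2e emitted by the run, counting accelerator power only."""
    it_energy_kwh = num_gpus * gpu_power_w / 1000 * hours
    facility_energy_kwh = it_energy_kwh * pue          # add cooling and power overhead
    return facility_energy_kwh * grid_g_per_kwh / 1000  # grams to kilograms


if __name__ == "__main__":
    run = dict(num_gpus=256, gpu_power_w=700, hours=720, pue=1.1)
    grids = {"low_carbon_grid": 30, "high_carbon_grid": 700}  # gCO2e/kWh, illustrative
    for name, intensity in grids.items():
        kg = training_carbon_kg(grid_g_per_kwh=intensity, **run)
        print(f"{name:>16}: {kg:,.0f} kg CO2e")
```

Because emissions scale linearly with grid intensity, the ratio between two locations equals the ratio of their intensities, independent of the hardware. That is the observation students should arrive at in their analysis.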

Lab: LLM Serving Capacity Planning (45 min)

Using the ServingModel, determine the maximum sequence length at which Llama-3.1-70B can serve a single request on an 8-GPU H100 node without exceeding memory. Then calculate TTFT and ITL at sequence lengths of 1K, 4K, and 16K tokens. At what point does KV-cache pressure dominate?

Pairs with: Model Serving slides and Inference at Scale slides.
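For a quick check on the KV-cache part of this lab, the cache size can be computed directly from the attention shape. The sketch below assumes the published Llama 3.1 70B configuration (80 layers, 8 key/value heads under grouped-query attention, head dimension 128), FP16 cache entries, and 1K = 1,024 tokens. The function is hypothetical and independent of ServingModel, whose accounting may differ.

```python
# KV-cache size from model shape. Defaults are assumed from the published
# Llama 3.1 70B configuration; the function itself is not part of mlsysim.


def kv_cache_bytes(
    seq_len: int,
    batch: int = 1,
    n_layers: int = 80,
    n_kv_heads: int = 8,
    head_dim: int = 128,
    bytes_per_elem: int = 2,  # FP16
) -> int:
    """Return bytes needed to cache keys and values for the given context."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # 2 = K and V
    return per_token * seq_len * batch


if __name__ == "__main__":
    for seq_len in (1024, 4096, 16384):
        gib = kv_cache_bytes(seq_len) / 2**30
        print(f"{seq_len:6d} tokens: {gib:.4f} GiB")
```

Under these assumptions each token costs 320 KiB of cache, so a single 16K-token request needs 5 GiB. Students can compare that against the memory left after loading the weights to reason about when cache pressure dominates.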

Exam Question: Back-of-Envelope

A GPU has 1,979 TFLOP/s peak compute (FP16) and 3.35 TB/s memory bandwidth. (a) What is the ridge point in FLOP/Byte? (b) A model layer has arithmetic intensity of 50 FLOP/Byte — is it compute-bound or memory-bound? (c) Another layer has arithmetic intensity of 400 FLOP/Byte — which regime is it in, and what does that imply about the benefit of moving to a GPU with 2x the bandwidth? Show your work.

Pairs with: HW Acceleration slides (roofline model, ridge point derivation).
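For the answer key, the three parts reduce to one division and two comparisons. The sketch below works them out with plain Python. The helper names are hypothetical, and the interpretation of part (c) as a GPU with the same peak compute and twice the bandwidth is an assumption about the question's intent.

```python
# Worked answer for the back-of-envelope roofline question.
# Helper names are illustrative; inputs are the figures stated in the question.

PEAK_FLOPS = 1979e12   # FLOP/s (FP16)
PEAK_BW = 3.35e12      # bytes/s


def ridge_point(peak_flops: float, peak_bw: float) -> float:
    """Arithmetic intensity at which the two roofline ceilings meet."""
    return peak_flops / peak_bw


def attainable_flops(intensity: float, peak_flops: float, peak_bw: float) -> float:
    """Roofline bound: the lower of the compute and bandwidth ceilings."""
    return min(peak_flops, intensity * peak_bw)


if __name__ == "__main__":
    ridge = ridge_point(PEAK_FLOPS, PEAK_BW)
    print(f"(a) ridge point: {ridge:.1f} FLOP/byte")
    for label, ai in (("(b)", 50), ("(c)", 400)):
        kind = "compute-bound" if ai >= ridge else "memory-bound"
        print(f"{label} intensity {ai}: {kind}")
    before = attainable_flops(400, PEAK_FLOPS, PEAK_BW)
    after = attainable_flops(400, PEAK_FLOPS, 2 * PEAK_BW)
    print(f"(c) speedup from 2x bandwidth: {after / before:.2f}x")
```

The ridge point is about 590.7 FLOP/byte, so both layers are memory-bound. Doubling the bandwidth halves the ridge point to about 295 FLOP/byte, which moves the 400 FLOP/byte layer into the compute-bound regime. Its speedup is therefore capped at roughly 1.48x rather than 2x, while the 50 FLOP/byte layer would see the full 2x.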

Capstone: Multi-Solver Design Study (1 week)

Design a training cluster for a 70B-parameter model. Use the DistributedModel to select a parallelism strategy, the EconomicsModel for TCO over 6 months, the SustainabilityModel to compare three datacenter locations, and the ReliabilityModel to determine checkpoint frequency. Present your analysis as a 3-page technical memo with quantitative justification for each decision.

Pairs with: the full Volume II slide set — infrastructure, training, fault tolerance, and sustainability.
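The checkpoint-frequency step of the capstone rests on the Young-Daly approximation, which students can verify by hand. The sketch below uses the first-order form, interval = sqrt(2 × checkpoint cost × MTBF), and assumes independent node failures so that cluster MTBF is node MTBF divided by node count. The function names and example inputs are hypothetical, and ReliabilityModel may use a more refined formula.

```python
# First-order Young-Daly checkpoint interval.
# Function names and example inputs are illustrative, not mlsysim API.
import math


def cluster_mtbf_s(node_mtbf_s: float, n_nodes: int) -> float:
    """Cluster MTBF assuming independent, identically distributed node failures."""
    return node_mtbf_s / n_nodes


def young_daly_interval_s(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Checkpoint interval that approximately minimizes expected wasted time."""
    return math.sqrt(2 * checkpoint_cost_s * mtbf_s)


if __name__ == "__main__":
    # Hypothetical inputs: 128 nodes, each failing once per 1,000 hours on
    # average, and a checkpoint that takes 60 seconds to write.
    mtbf = cluster_mtbf_s(node_mtbf_s=1000 * 3600, n_nodes=128)
    interval = young_daly_interval_s(checkpoint_cost_s=60, mtbf_s=mtbf)
    print(f"cluster MTBF: {mtbf / 3600:.1f} h, checkpoint every {interval / 60:.1f} min")
```

The square root is the key design insight: quadrupling the cluster size only halves the optimal interval, so checkpoint overhead grows more slowly than the failure rate.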

Grading Notes

Because MLSys·im produces deterministic output from vetted specifications:

Answer keys are stable — the same mlsysim version produces identical numbers for every student, every semester
Partial credit is straightforward — grade the reasoning (which solver, which inputs, which bottleneck explanation), not just the number
“Predict first” questions are easy to assess — students submit their prediction before running code; compare the two for a conceptual understanding score

Version Pinning

Pin the exact version in your assignment instructions (pip install mlsysim==<course-version>) so answer keys remain valid even after new releases update specifications.
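A notebook can also confirm the pin before students start work. The sketch below uses the standard-library importlib.metadata module to compare the installed version with the one the course expects. The helper name pin_matches is hypothetical, and 0.1.2 is only the example version that appears elsewhere on this page.

```python
# Check that the installed mlsysim matches the version the course pinned.
# pin_matches is an illustrative helper, not part of mlsysim.
from importlib import metadata


def pin_matches(package: str, expected: str, lookup=metadata.version) -> bool:
    """Return True only if `package` is installed at exactly `expected`."""
    try:
        return lookup(package) == expected
    except metadata.PackageNotFoundError:
        return False   # not installed at all


if __name__ == "__main__":
    COURSE_VERSION = "0.1.2"   # replace with the version your answer key was built on
    if pin_matches("mlsysim", COURSE_VERSION):
        print("mlsysim version matches the course pin")
    else:
        print(f"please run: pip install mlsysim=={COURSE_VERSION}")
```

Placing this in the first cell of an assignment notebook gives students an immediate, readable message instead of a silent mismatch with the answer key.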

Reproducibility Guarantee

All specifications in the MLSys Zoo are:

Sourced from official manufacturer datasheets and published benchmarks
Typed with pint.Quantity for dimensional correctness — unit errors are caught at runtime
Frozen per release — an exact pin such as mlsysim==0.1.2 always produces the same answers

Together with a pinned version, this means one answer key works for every student, every semester.

Jupyter & Quarto Compatibility

All tutorials run in:

Jupyter Notebooks — standard .ipynb workflow
Quarto documents — render to HTML, PDF, or slides with quarto render
Google Colab — pip install mlsysim in the first cell, then go

No GPU runtime is required. CPU-only environments are sufficient because MLSys·im evaluates analytical equations rather than profiling real hardware.

Getting Started

Point students to the Getting Started guide for installation
Assign the Hello, Roofline tutorial as a warmup
Browse the Solver Guide to select solvers for your course topics
Pair each assignment with the relevant lecture slides for classroom context
Use the MLSys Zoo for available hardware, model, and infrastructure specifications
