For Students

Build intuition for ML systems – without needing GPU hardware.

Whether you are taking your first ML systems course or preparing for industry interviews, MLSys·im lets you experiment with real hardware specifications and see why systems behave the way they do. Hardware registry specs come from datasheets where available; outputs are derived from documented equations.

What You Will Learn

By working through the MLSys·im tutorials and exercises, you will:

Identify bottlenecks – Determine whether a workload is memory-bound or compute-bound on any hardware, and understand why
Reason quantitatively – Use real datasheet numbers (not made-up examples) to calculate latency, throughput, and cost
Build systems intuition – See how batch size, precision, parallelism strategy, and datacenter location each affect performance
Think across the stack – Connect workload characteristics to hardware specs to infrastructure constraints
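
The first two goals can be previewed by hand. The sketch below is a minimal, standalone roofline calculation in plain Python, not the MLSys·im API: the function name classify_roofline and all four input values are hypothetical round numbers chosen for illustration, not datasheet figures. It compares the workload's arithmetic intensity (FLOPs per byte moved) with the hardware's ridge point (peak FLOP/s divided by memory bandwidth) and takes the slower of the two limits as the latency.

```python
def classify_roofline(flops, bytes_moved, peak_flops, mem_bandwidth):
    """Classify one workload with the basic roofline model.

    Returns (bound, latency_seconds), where bound is "memory" or "compute".
    """
    intensity = flops / bytes_moved            # arithmetic intensity, FLOP/byte
    ridge = peak_flops / mem_bandwidth         # ridge point, FLOP/byte
    compute_time = flops / peak_flops          # seconds if compute were the only limit
    memory_time = bytes_moved / mem_bandwidth  # seconds if memory were the only limit
    bound = "compute" if intensity >= ridge else "memory"
    return bound, max(compute_time, memory_time)


# Hypothetical round numbers for illustration only (not datasheet values).
bound, latency = classify_roofline(
    flops=8e9,           # work per inference
    bytes_moved=100e6,   # memory traffic per inference
    peak_flops=300e12,   # accelerator peak, FLOP/s
    mem_bandwidth=2e12,  # accelerator memory bandwidth, bytes/s
)
print(f"{bound}-bound, {latency * 1e3:.3f} ms")
```

With these inputs the intensity is 80 FLOP/byte against a ridge point of 150 FLOP/byte, so the workload lands on the memory-bound side of the roofline.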

Prerequisites

Python: Comfortable with functions, loops, and f-strings
Math: Basic algebra (no calculus required – all solver equations are arithmetic)
ML: Familiarity with terms like “model parameters,” “inference,” and “training” (the Glossary defines everything else)

No GPU, no cloud account, no special hardware required. Just:

pip install mlsysim

See the Getting Started guide for development installs. A local install is the supported path. Colab also works if you run pip install mlsysim in the first cell, but one-click Colab/Binder launch buttons are not yet available.

Quick Start

import mlsysim
from mlsysim import Engine

# Load a model and hardware from the vetted registry
model = mlsysim.Models.Vision.ResNet50
gpu   = mlsysim.Hardware.Cloud.A100

# Solve: is this workload memory-bound or compute-bound?
profile = Engine.solve(model=model, hardware=gpu, batch_size=1, precision="fp16")

print(f"Bottleneck: {profile.bottleneck}")   # → Memory
print(f"Latency:    {profile.latency.to('ms'):~.2f}")

Your Learning Path

Start at the top and work through the tutorials in order; each one builds on the one before it. The Companion Slides column links to the lecture deck that covers the same material. Use the decks for visual explanations, worked examples, and active learning exercises.

Step	Tutorial	You Will Learn	Time	Companion Slides
1	Hello, Roofline	The roofline model, memory-bound vs. compute-bound, batch size sweeps	15 min	Hardware Acceleration (Vol I, Ch 11)
2	Geography is a Systems Variable	Energy, carbon footprint, regional grid effects	20 min	Sustainable AI (Vol II, Ch 15)
3	Two Phases of Inference	Time to first token (TTFT) vs. inter-token latency (ITL), KV-cache pressure, the prefill and decode phases of LLM inference	25 min	Model Serving (Vol I, Ch 13) and Inference at Scale (Vol II, Ch 10)
4	Distributed Training	Data/tensor/pipeline parallelism, communication overhead, scaling efficiency	30 min	Distributed Training (Vol II, Ch 5) and Collective Communication (Vol II, Ch 6)
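
As a taste of step 3, KV-cache pressure can be estimated with one multiplication. The sketch below uses the standard sizing rule for a transformer decoder (one key tensor and one value tensor per layer, per token); it is plain Python rather than the MLSys·im ServingModel, and the function name kv_cache_bytes and the model shape are assumptions chosen for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_value):
    """Estimate KV-cache size in bytes for a transformer decoder."""
    # The factor 2 accounts for storing both K and V in every layer.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical 7B-class shape: 32 layers, 32 KV heads of dimension 128,
# a 4096-token context, batch size 1, fp16 values (2 bytes each).
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                      seq_len=4096, batch_size=1, bytes_per_value=2)
print(f"KV cache: {size / 2**30:.1f} GiB")
```

Because the estimate is linear in both seq_len and batch_size, doubling either one doubles the cache, which is why long contexts and large serving batches compete for the same accelerator memory.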

Predict Before You Compute

Every tutorial includes “predict first” exercises. Before running code, write down what you expect. This practice builds the mental models that make you effective at systems reasoning. The companion slide decks include the same predict-first methodology with 8–11 active learning moments per deck.

How MLSys·im Maps to the Textbook and Slides

MLSys·im is the companion framework for the Machine Learning Systems textbook. Each solver maps to specific chapters and slide decks. Use the slide links below to review the theory before (or after) running the solver.

MLSys·im Solver	What It Models	Textbook Topic	Slide Deck
SingleNodeModel	Roofline analysis, compute vs. memory bottleneck	Hardware Acceleration	Vol I, Ch 11
ServingModel	TTFT, ITL, KV-cache memory	Model Serving	Vol I, Ch 13
TrainingMemoryModel	Training memory breakdown	Training	Vol I, Ch 8
ServingCapacityModel	Replica count from QPS and P99 target	Inference at Scale	Vol II, Ch 10
DistributedModel	3D parallelism, all-reduce, pipeline bubbles	Distributed Training	Vol II, Ch 5
MoERoutingModel	Expert routing imbalance and all-to-all traffic	Distributed Training	Vol II, Ch 5
EconomicsModel	CapEx, OpEx, TCO	Compute Infrastructure	Vol II, Ch 2
SustainabilityModel	Energy, carbon, water usage	Sustainable AI	Vol II, Ch 15
ReliabilityModel	MTBF, checkpoint interval	Fault Tolerance	Vol II, Ch 7

Not using the textbook? No problem – MLSys·im is self-contained. The Math Foundations page documents the core equations, and each slide deck stands on its own with full speaker notes.
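
Many of the quantities in the table above can be sanity-checked by hand. As one example, the sketch below applies Young's first-order approximation for the checkpoint interval, sqrt(2 × checkpoint cost × MTBF). This is a common textbook estimate and is an assumption here: it is not confirmed to be the equation ReliabilityModel uses, and the function name and input values are hypothetical.

```python
import math


def young_checkpoint_interval(checkpoint_seconds, mtbf_seconds):
    """Young's approximation for the checkpoint interval, in seconds."""
    return math.sqrt(2 * checkpoint_seconds * mtbf_seconds)


# Hypothetical inputs: a 60-second checkpoint write and a 24-hour MTBF.
interval = young_checkpoint_interval(checkpoint_seconds=60,
                                     mtbf_seconds=24 * 3600)
print(f"Checkpoint roughly every {interval / 60:.1f} minutes")
```

Under this approximation the interval grows with the square root of MTBF, so a cluster that fails four times as often should checkpoint about twice as frequently.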

Recommended Study Workflow

Whether you are self-studying or following a course, this workflow maximizes retention:

Read the textbook chapter (or skim the slide deck) to get the conceptual framework
Predict what will happen before running any code – write it down
Model the setup using MLSys·im to test your prediction against real hardware specs
Explore by changing one parameter at a time (batch size, precision, hardware) and observing the effect
Reflect on where your prediction was wrong – that gap is where learning happens
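
The "change one parameter at a time" step can be sketched as a batch-size sweep. The example below is a standalone approximation in plain Python, not the MLSys·im API: it assumes weights are read once per batch while activation traffic scales with batch size, and every number in it is a hypothetical round value rather than a datasheet figure.

```python
def sweep_batch_sizes(batch_sizes, flops_per_sample, weight_bytes,
                      act_bytes_per_sample, peak_flops, mem_bandwidth):
    """Return (batch, bound, latency_s, samples_per_s) for each batch size."""
    rows = []
    for b in batch_sizes:
        flops = b * flops_per_sample
        # Assumption: weights are read once per batch, activations once per sample.
        traffic = weight_bytes + b * act_bytes_per_sample
        compute_time = flops / peak_flops
        memory_time = traffic / mem_bandwidth
        bound = "compute" if compute_time >= memory_time else "memory"
        latency = max(compute_time, memory_time)
        rows.append((b, bound, latency, b / latency))
    return rows


# Hypothetical workload and hardware numbers for illustration only.
rows = sweep_batch_sizes(
    batch_sizes=[1, 2, 4, 8, 16],
    flops_per_sample=8e9,
    weight_bytes=200e6,
    act_bytes_per_sample=10e6,
    peak_flops=300e12,
    mem_bandwidth=2e12,
)
for b, bound, latency, throughput in rows:
    print(f"batch={b:>2}  {bound:<7}  {latency * 1e3:.3f} ms  {throughput:,.0f} samples/s")
```

Before running it, predict the batch size at which the bottleneck flips from memory to compute and what happens to throughput after that point, then compare your guess with the printed rows.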

Self-Study vs. Classroom

If you are self-studying, the slide decks include speaker notes with timing guidance, teaching tips, and common misconceptions, so they remain useful without an instructor. If you are in a course, your instructor may assign specific tutorials as homework; check the Instructor Guide for the recommended tutorial and slide-deck pairings.

Slides at a Glance

The full slide collection covers both volumes of the textbook. Every deck includes speaker notes, active learning exercises, and original SVG diagrams.

Volume I: Foundations (17 decks, 570 slides)

Covers the single-machine ML stack: data engineering, neural computation, architectures, frameworks, training, compression, hardware acceleration, serving, and operations.

Browse Vol I Decks | Download All (PDF)

Volume II: At Scale (18 decks, 529 slides)

Covers distributed infrastructure: compute clusters, network fabrics, distributed training, fault tolerance, fleet orchestration, inference at scale, and governance.

Browse Vol II Decks | Download All (PDF)

Next Steps

Getting Started – Install MLSys·im and run your first analysis
Hello, Roofline Tutorial – Your first roofline analysis
Solver Guide – Deep dive into each solver’s capabilities
Glossary – Look up any unfamiliar term
Math Foundations – The equations behind every solver
All Slide Decks – 35 Beamer decks with speaker notes and active learning exercises