For Students
Build intuition for ML systems — without needing GPU hardware.
Whether you’re taking your first ML systems course or preparing for industry interviews, MLSys·im lets you experiment with real hardware specifications and see exactly why systems behave the way they do.
What You’ll Learn
By working through the MLSys·im tutorials and exercises, you will:
- Identify bottlenecks — Determine whether a workload is memory-bound or compute-bound on any hardware, and understand why
- Reason quantitatively — Use real datasheet numbers (not made-up examples) to calculate latency, throughput, and cost
- Build systems intuition — See how batch size, precision, parallelism strategy, and datacenter location each affect performance
- Think across the stack — Connect workload characteristics to hardware specs to infrastructure constraints
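To make the first two goals concrete, here is a minimal roofline sketch in plain Python. It does not use the MLSys·im API; the function name `roofline` and the hardware and workload numbers are hypothetical round figures chosen for easy arithmetic, not datasheet values.

```python
def roofline(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Classify a workload with the basic roofline model.

    flops          -- floating-point operations the workload performs
    bytes_moved    -- bytes read from and written to memory
    peak_flops     -- accelerator peak compute, in FLOP/s
    peak_bandwidth -- accelerator peak memory bandwidth, in bytes/s
    """
    intensity = flops / bytes_moved            # FLOPs per byte of traffic
    ridge_point = peak_flops / peak_bandwidth  # intensity where the two roofs meet
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bandwidth
    bottleneck = "compute-bound" if intensity >= ridge_point else "memory-bound"
    return {
        "intensity": intensity,
        "ridge_point": ridge_point,
        "latency_s": max(compute_time, memory_time),  # the slower roof wins
        "bottleneck": bottleneck,
    }


# Hypothetical accelerator: 100 TFLOP/s peak compute, 1 TB/s memory bandwidth.
# Hypothetical workload: 2 GFLOP of work that moves 100 MB of data.
result = roofline(flops=2e9, bytes_moved=1e8, peak_flops=100e12, peak_bandwidth=1e12)
print(result["bottleneck"], f"{result['latency_s'] * 1e6:.0f} us")
```

The workload's arithmetic intensity (20 FLOPs per byte) is below the ridge point (100 FLOPs per byte), so memory traffic rather than compute sets the latency.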
Your Learning Path
Start at the top and work through in order. Each tutorial builds on the one before it.
| Step | Tutorial | You’ll Learn | Time |
|---|---|---|---|
| 1 | Hello, Roofline | The roofline model, memory-bound vs. compute-bound, batch size sweeps | 10 min |
| 2 | The Memory Wall | Why 3.2× more FLOPS ≠ 3.2× speedup, bandwidth as the binding constraint | 15 min |
| 3 | Two Phases, One Request | TTFT vs. ITL, the two phases of LLM inference | 15 min |
| 4 | KV-Cache: The Hidden Tax | KV-cache pressure, concurrent serving limits, PagedAttention | 20 min |
| 5 | Starving the GPU | CPU preprocessing bottlenecks, when the GPU starves | 15 min |
| 6 | Quantization: Not a Free Lunch | Regime-dependent speedup from quantization | 20 min |
| 7 | Scaling to 1000 GPUs | Data/tensor/pipeline parallelism, reliability at scale | 20 min |
| 8 | Geography is a Systems Variable | Energy, carbon footprint, regional grid effects | 15 min |
| 9 | The $9M Question | Inference-time compute scaling, cost of reasoning | 20 min |
| 10 | Sensitivity Analysis | Binding constraints, inverse synthesis, procurement decisions | 20 min |
Every tutorial includes “predict first” exercises. Before running code, write down what you expect. This practice builds the mental models that make you effective at systems reasoning.
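As a taste of what a predict-first exercise can look like, try this one in the spirit of the KV-cache tutorial: before running the code, guess how many 4096-token sequences fit in 40 GiB of accelerator memory. The sketch below uses the standard KV-cache sizing arithmetic in plain Python, not the MLSys·im API. The helper `kv_cache_bytes`, the 7B-class model shape, and the 40 GiB budget are all illustrative assumptions.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Bytes needed to cache keys and values for every layer and token."""
    # The factor 2 covers one key vector and one value vector per head.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


# Illustrative 7B-class transformer shape stored in fp16 (2 bytes per value).
# This is an assumption for the exercise, not an MLSys·im registry entry.
per_sequence = kv_cache_bytes(num_layers=32, num_kv_heads=32,
                              head_dim=128, seq_len=4096)
print(f"KV cache per 4096-token sequence: {per_sequence / 2**30:.1f} GiB")

# Hypothetical memory left over after loading the weights: 40 GiB.
budget_bytes = 40 * 2**30
max_concurrent = budget_bytes // per_sequence
print(f"Sequences that fit in the budget: {max_concurrent}")
```

If your guess was far off, that gap is the point of the exercise: cache size grows linearly with both sequence length and the number of concurrent requests, so it can cap serving concurrency well before compute does.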
How MLSys·im Pairs with the Textbook
MLSys·im is the companion framework for the Machine Learning Systems textbook. Each solver maps to specific chapters:
| Textbook Topic | MLSys·im Solver | What It Models |
|---|---|---|
| Hardware Acceleration | SingleNodeModel | Roofline analysis, compute vs. memory bottleneck |
| Model Serving | ServingModel | TTFT, ITL, KV-cache memory |
| Distributed Training | DistributedModel | 3D parallelism, all-reduce, pipeline bubbles |
| Compute Infrastructure | EconomicsModel | CapEx, OpEx, TCO |
| Sustainable AI | SustainabilityModel | Energy, carbon, water usage |
| Fault Tolerance | ReliabilityModel | MTBF, checkpoint interval |
| Scaling Physics | ScalingModel, InferenceScalingModel | Chinchilla laws, CoT cost |
| Data Engineering | DataModel, TransformationModel | I/O stalls, CPU preprocessing |
| Sensitivity Analysis | SensitivitySolver, SynthesisSolver | Binding constraints, inverse Roofline |
Not using the textbook? No problem — MLSys·im is self-contained. The Math Foundations page documents every equation.
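To give a feel for the kind of arithmetic behind one row of the table above (MTBF and checkpoint interval), here is a sketch using Young's classic approximation for the checkpoint interval. Whether `ReliabilityModel` uses this exact formula is an assumption; the function names and all numbers below are hypothetical, and the Math Foundations page remains the authoritative source for the equations MLSys·im implements.

```python
import math


def cluster_mtbf_hours(node_mtbf_hours, num_nodes):
    """Cluster MTBF, assuming independent nodes with equal failure rates."""
    return node_mtbf_hours / num_nodes


def young_checkpoint_interval_hours(checkpoint_hours, mtbf_hours):
    """Young's approximation: interval ~= sqrt(2 * checkpoint cost * MTBF)."""
    return math.sqrt(2 * checkpoint_hours * mtbf_hours)


# Hypothetical cluster: 1000 nodes, each with a 50,000-hour MTBF.
mtbf = cluster_mtbf_hours(node_mtbf_hours=50_000, num_nodes=1000)

# Hypothetical checkpoint cost: 0.01 hours (36 seconds) per checkpoint.
interval = young_checkpoint_interval_hours(checkpoint_hours=0.01, mtbf_hours=mtbf)

print(f"Cluster MTBF: {mtbf:.0f} h, checkpoint about every {interval:.2f} h")
```

Notice how scale changes the answer: a node that fails once in years still yields a cluster that fails every couple of days, and that shorter MTBF pulls the checkpoint interval down with it.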
Prerequisites
- Python: Comfortable with functions, loops, and f-strings
- Math: Basic algebra (no calculus required — all solver equations are arithmetic)
- ML: Familiarity with terms like “model parameters,” “inference,” and “training” (the Glossary defines everything else)
No GPU, no cloud account, no special hardware required. Just:
pip install mlsysim
Quick Start
import mlsysim
from mlsysim import SingleNodeModel
# Load a model and hardware from the vetted registry
model = mlsysim.Models.ResNet50
gpu = mlsysim.Hardware.Cloud.A100
# Solve: is this workload memory-bound or compute-bound?
solver = SingleNodeModel()
profile = solver.solve(model=model, hardware=gpu, batch_size=1, precision="fp16")
print(f"Bottleneck: {profile.bottleneck}") # → Memory Bound
print(f"Latency: {profile.latency.to('ms'):~.2f}")
Next Steps
- Getting Started — Install MLSys·im and run your first analysis
- Hello, Roofline — Your first roofline analysis
- Glossary — Look up any unfamiliar term
- Math Foundations — The equations behind every solver