Machine Learning Systems

TWO-VOLUME TEXTBOOK

Two volumes. One curriculum.
The physics of AI engineering.

A rigorous, principles-first treatment of how ML systems
are built, optimized, and deployed—from a single
machine to fleet-scale infrastructure.

TinyTorch · MLSys·im · Labs · Hardware Kits

GitHub · Open Collective


Volume I

Introduction to Machine Learning Systems

HTML · PDF · EPUB


Volume II

Machine Learning Systems at Scale

HTML · PDF · EPUB


TINYTORCH

Build it.
From scratch.

20 interactive modules.
Zero magic.

Understand the inner workings of modern ML frameworks by building your own tensor library, automatic differentiation engine, and neural network modules in Python.

Start Building →

class Tensor:
    def __init__(self, data):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None

A pedagogical framework for learning ML systems engineering.
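The snippet above is the starting point: reverse-mode autodiff works by having each operation record a closure that pushes gradients back to its inputs. A minimal sketch of that idea (the `__mul__`, `_parents`, and `backward` names here are illustrative, not TinyTorch's actual API):

```python
class Tensor:
    """Scalar-valued sketch: each op records a gradient closure."""
    def __init__(self, data, _parents=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._parents = _parents

    def __mul__(self, other):
        out = Tensor(self.data * other.data, (self, other))
        def _backward():
            # Chain rule for z = x * y: dz/dx = y, dz/dy = x.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then run closures in reverse.
        self.grad = 1.0
        visited, order = set(), []
        def topo(t):
            if t not in visited:
                visited.add(t)
                for p in t._parents:
                    topo(p)
                order.append(t)
        topo(self)
        for t in reversed(order):
            t._backward()

a, b = Tensor(3.0), Tensor(4.0)
c = a * b
c.backward()
print(a.grad, b.grad)  # 4.0 3.0
```

The same pattern, generalized from scalars to arrays and to more operations, is the core of every modern autodiff engine.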

MLSYS·IM

Model the
trade-offs.

One command.
Every bottleneck.

A first-principles modeling engine for reasoning about ML system performance. Evaluate training, serving, and distributed configurations before committing hardware or code.

Explore MLSys·im →

$ mlsysim serve llama-3-70b --hw h100
Roofline view: arithmetic intensity (FLOP/byte) vs FLOP/s, memory-bound vs compute-bound regions
Model: Llama-3-70B · HW: H100 (80 GB) · Batch: 32 · Precision: BF16
MFU: 48% · HBM: 81% · TTFT: 112 ms

Configure. Model. See every bottleneck before committing hardware.
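The dashboard above is roofline-style reasoning, and the core check fits in a few lines. A back-of-envelope sketch with approximate H100 numbers (the peak throughput and bandwidth figures are rounded, and the "arithmetic intensity grows with batch" rule for decode is a simplification, not mlsysim's actual model):

```python
# Approximate H100 SXM peaks (rounded; check vendor specs for exact values).
PEAK_FLOPS = 989e12   # peak BF16 throughput, FLOP/s
PEAK_BW = 3.35e12     # HBM3 bandwidth, bytes/s
RIDGE = PEAK_FLOPS / PEAK_BW  # FLOP/byte where compute overtakes memory (~295)

def regime(ai_flop_per_byte):
    """Classify a workload: below the ridge point it is memory-bound."""
    return "compute-bound" if ai_flop_per_byte >= RIDGE else "memory-bound"

# LLM decode in BF16 does ~2 FLOPs per weight and reads ~2 bytes per
# weight, so arithmetic intensity grows roughly linearly with batch size.
for batch in (1, 32, 512):
    print(f"batch {batch:>3}: {regime(batch)}")
```

This is why small-batch serving saturates memory bandwidth long before it saturates the tensor cores, and why batching is the first lever for raising MFU.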

INTERACTIVE LABS

Learn by
doing.

Jupyter and Marimo.
Coming Summer 2026.

A complete suite of interactive notebooks designed to accompany the textbook. Profile performance, optimize kernels, and explore distributed training configurations.

View Labs →

Lab 08 · Training Memory, Act II: interactive batch-size explorer. At batch 128, the HBM allocation (weights + activations + optimizer state) reaches 94.2 GB, exceeding the 80 GB capacity: OOM, infeasible. Reduce the batch size or enable activation checkpointing.

Predict, explore, and break ML systems through interactive notebooks.
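The lab's OOM verdict follows from a simple accounting identity: HBM must hold weights, gradients, optimizer state, and activations, and only the activation term scales with batch size. A rough sketch of that accounting (BF16 weights and gradients, fp32 Adam moments, and a crude one-tensor-per-layer activation formula; all constants and the 1.3B example model are illustrative, not the lab's exact numbers):

```python
def train_hbm_gb(params_b, batch, seq_len, hidden, layers):
    """Very rough training-memory estimate in GB.

    params_b is the parameter count in billions. Assumes BF16 weights
    and gradients (2 bytes each), fp32 Adam moments m and v (8 bytes),
    and one BF16 activation tensor of shape (batch, seq_len, hidden)
    saved per layer.
    """
    weights = params_b * 2
    grads = params_b * 2
    optimizer = params_b * 8
    activations = batch * seq_len * hidden * layers * 2 / 1e9
    return weights + grads + optimizer + activations

# A hypothetical 1.3B model: the fixed terms stay put while
# activations grow linearly with batch size.
for batch in (16, 64, 128, 256):
    gb = train_hbm_gb(1.3, batch, seq_len=4096, hidden=2048, layers=24)
    verdict = "OOM" if gb > 80 else "fits"
    print(f"batch {batch:>3}: {gb:6.1f} GB -> {verdict} (80 GB HBM)")
```

Reducing the batch size or checkpointing activations shrinks only the activation term, which is exactly why those are the two fixes the lab suggests.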

HARDWARE KITS

Deploy to
the edge.

Real silicon.
Real constraints.

Take your models out of the cloud and into the physical world. Hands-on deployment labs using Arduino, Raspberry Pi, and Seeed Studio hardware.

Explore Kits →

Nicla Vision

Microcontrollers, single-board computers, and specialized accelerators.

Vijay Janapa Reddi, Harvard University · MIT Press 2026

© 2024-2026 Harvard University. Licensed under CC-BY-NC-SA 4.0
