The Silicon Zoo

Vetted Specifications for AI Accelerators and Edge Devices

The Silicon Zoo is the Single Source of Truth (SSoT) for all physical hardware in mlsysim. Every specification is typed (pint.Quantity), provenance-tracked, and validated against official datasheets and MLPerf baselines—so you never have to argue about what the A100’s bandwidth actually is.

Tip: How to use this page

Reference these specs when reasoning about bottlenecks. For any device listed here, you can load it directly in Python: hw = mlsysim.Hardware.Cloud.A100. The three columns that matter most for roofline analysis are Peak Performance, Memory BW, and Capacity.

Data Center Accelerators

| Device | Year | Peak Performance | Memory BW | Capacity | TDP |
|---|---|---|---|---|---|
| NVIDIA B200 | 2024 | 2.2 PFLOP/s | 64.0 Tbit/s | 206.2 GB | 1,000 W |
| Cerebras CS-3 (WSE-3) | 2024 | 125.0 PFLOP/s | 168.0 Pbit/s | 44.0 GB | 23,000 W |
| NVIDIA GB200 NVL72 | 2024 | 720.0 PFLOP/s | 4.6 Pbit/s | 13.8 TB | 120 kW |
| NVIDIA H200 | 2023 | 989.0 TFLOP/s | 38.4 Tbit/s | 141.0 GB | 700 W |
| AMD MI300X | 2023 | 1.3 PFLOP/s | 42.4 Tbit/s | 206.2 GB | 750 W |
| Google TPU v5p | 2023 | 459.0 TFLOP/s | 22.1 Tbit/s | 102.0 GB | 300 W |
| NVIDIA H100 | 2022 | 989.0 TFLOP/s | 26.8 Tbit/s | 85.9 GB | 700 W |
| NVIDIA A100 | 2020 | 312.0 TFLOP/s | 16.3 Tbit/s | 85.9 GB | 400 W |
| NVIDIA T4 | 2018 | 65.0 TFLOP/s | 2.6 Tbit/s | 17.2 GB | 70 W |
| NVIDIA V100 | 2017 | 125.0 TFLOP/s | 7.2 Tbit/s | 34.4 GB | 300 W |

Workstations

| Device | Year | Peak Performance | Memory BW | Capacity | TDP |
|---|---|---|---|---|---|
| NVIDIA DGX Spark (GB10) | 2024 | 250.0 TFLOP/s | 4.0 Tbit/s | 128.0 GB | 250 W |
| MacBook Pro (M3 Max) | 2023 | 14.2 TFLOP/s | 3.2 Tbit/s | 128.0 GB | 100 W |

Mobile Devices

| Device | Year | Peak Performance | Memory BW | Capacity | TDP |
|---|---|---|---|---|---|
| Google Pixel 8 (Tensor G3) | 2023 | 15.0 TFLOP/s | 480.0 Gbit/s | 8.0 GB | 5 W |
| Snapdragon 8 Gen 3 | 2023 | 45.0 TFLOP/s | 616.0 Gbit/s | 12.0 GB | 5 W |
| iPhone 15 Pro (A17 Pro) | 2023 | 35.0 TFLOP/s | 800.0 Gbit/s | 8.0 GB | 5 W |

Edge & Robotics

| Device | Year | Peak Performance | Memory BW | Capacity | TDP |
|---|---|---|---|---|---|
| Edge Server | 2024 | 1.0 TFLOP/s | 800.0 Gbit/s | 128.0 GB | 300 W |
| iPhone 15 Pro (A17 Pro) | 2023 | 35.0 TFLOP/s | 800.0 Gbit/s | 8.0 GB | 5 W |
| NVIDIA Jetson Orin NX | 2023 | 25.0 TFLOP/s | 816.0 Gbit/s | 16.0 GB | 25 W |
| Intel NUC + Movidius | 2020 | 1.0 TFLOP/s | 200.0 Gbit/s | 16.0 GB | 15 W |
| Google Coral Edge TPU | 2019 | 4.0 TFLOP/s | 64.0 Gbit/s | 1.0 GB | 2 W |

TinyML Microcontrollers

| Device | Year | Peak Performance | Memory BW | Capacity | TDP |
|---|---|---|---|---|---|
| ESP32-S3 (AI) | 2022 | 500.0 MFLOP/s | 1.6 Gbit/s | 524.3 KB | 1 W |
| Himax WE-I Plus | 2020 | 200.0 MFLOP/s | 800.0 Mbit/s | 2.0 MB | 0 W |

How to Read the Silicon Zoo

The Three Numbers That Matter

For roofline analysis, focus on three columns:

  1. Peak Performance (TFLOP/s) — the compute ceiling. This determines how fast compute-bound workloads run (e.g., large-batch training, LLM pre-fill).

  2. Memory Bandwidth (TB/s) — the memory ceiling. This determines how fast memory-bound workloads run (e.g., small-batch inference, LLM token decoding).

  3. Capacity (GB) — the memory wall. If your model plus activations exceed this, the workload is infeasible on a single device.
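The three numbers above combine into the standard roofline bound: attainable throughput is the smaller of the compute ceiling and arithmetic intensity times the memory ceiling, with capacity acting as a feasibility gate. The following is a minimal sketch in plain Python, not the mlsysim API. The `Device` record and both function names are hypothetical. The A100 peak (312 TFLOP/s) and capacity (85.9 GB) come from the table on this page, while the 2.04 TB/s bandwidth is an assumed datasheet-style figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a hardware spec; not the mlsysim class."""
    name: str
    peak_flops: float   # compute ceiling, FLOP/s
    mem_bw: float       # memory ceiling, bytes/s
    capacity: float     # memory capacity, bytes


def attainable_flops(dev: Device, intensity: float) -> float:
    """Roofline bound: min(compute ceiling, intensity * memory ceiling).

    `intensity` is arithmetic intensity in FLOP per byte moved.
    """
    return min(dev.peak_flops, intensity * dev.mem_bw)


def fits(dev: Device, footprint_bytes: float) -> bool:
    """Capacity check: does the working set fit on a single device?"""
    return footprint_bytes <= dev.capacity


# Bandwidth of 2.04e12 bytes/s is an assumption, not read from this page.
a100 = Device("NVIDIA A100", peak_flops=312e12, mem_bw=2.04e12, capacity=85.9e9)

decode = attainable_flops(a100, 2.0)     # low intensity: limited by bandwidth
prefill = attainable_flops(a100, 500.0)  # high intensity: limited by compute
fits_70b_fp16 = fits(a100, 140e9)        # 70e9 params * 2 bytes, weights only

print(f"decode  bound: {decode / 1e12:.2f} TFLOP/s")
print(f"prefill bound: {prefill / 1e12:.2f} TFLOP/s")
print(f"70B FP16 weights fit on one device: {fits_70b_fp16}")
```

Under these assumed inputs, the low-intensity case is capped by memory bandwidth at a small fraction of peak, the high-intensity case reaches the compute ceiling, and the capacity check fails before either ceiling matters.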

The Ridge Point

The ratio of Peak Performance to Memory Bandwidth gives the ridge point (in FLOP/byte). Workloads with arithmetic intensity below the ridge point are memory-bound; above it, compute-bound. See the Math Foundations page for the full derivation.
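As a worked illustration of the ridge point, the sketch below divides peak performance by memory bandwidth and classifies a workload by its arithmetic intensity. It is plain Python rather than the mlsysim API, and the function names are hypothetical. The A100 bandwidth is taken as 2.04 TB/s, which is the table's 16.3 figure read as terabits per second and divided by 8; that reading of the table's units is an assumption.

```python
def ridge_point(peak_flops: float, mem_bw_bytes: float) -> float:
    """Ridge point in FLOP/byte: peak performance over memory bandwidth."""
    return peak_flops / mem_bw_bytes


def classify(intensity: float, ridge: float) -> str:
    """Below the ridge point is memory-bound; at or above it, compute-bound."""
    return "compute-bound" if intensity >= ridge else "memory-bound"


# A100: 312 TFLOP/s peak (from the table), 2.04 TB/s bandwidth (assumed).
a100_ridge = ridge_point(312e12, 2.04e12)
print(f"A100 ridge point: {a100_ridge:.1f} FLOP/byte")

# Illustrative intensities only; real values depend on model and batch size.
for label, intensity in [("token decoding", 2.0), ("large-batch matmul", 500.0)]:
    print(f"{label}: {classify(intensity, a100_ridge)}")
```

With these inputs the ridge point comes out at roughly 153 FLOP/byte, so a workload needs on the order of 150 floating-point operations per byte moved before the compute ceiling, rather than bandwidth, becomes the limit.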

Common Patterns

  • Cloud GPUs (A100, H100, H200) have 40-80+ GB of HBM with very high bandwidth (2-5 TB/s). They are designed for throughput.
  • Edge devices (Jetson) trade peak performance for lower power budgets, making watts of TDP per TFLOP/s a useful comparison metric.
  • TinyML MCUs (RP2040, nRF5340) have KB-scale memory — only the smallest quantized models fit. Use the Model Zoo to find matching workloads.
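The power comparison mentioned above can be sketched by dividing TDP by peak performance for a few devices. The peak and TDP values below are copied from the tables on this page, and the helper name is hypothetical. The tables do not state the numeric precision behind each peak figure, so the ranking is only indicative unless the peaks are known to be at matching precision.

```python
# (device, peak in TFLOP/s, TDP in W), copied from the tables on this page.
SPECS = [
    ("NVIDIA H100", 989.0, 700.0),
    ("NVIDIA A100", 312.0, 400.0),
    ("NVIDIA Jetson Orin NX", 25.0, 25.0),
    ("Google Coral Edge TPU", 4.0, 2.0),
]


def watts_per_tflops(peak_tflops: float, tdp_w: float) -> float:
    """Watts of TDP per TFLOP/s of peak performance (lower is better)."""
    return tdp_w / peak_tflops


# Sort from most to least power-efficient by this metric.
ranked = sorted(SPECS, key=lambda s: watts_per_tflops(s[1], s[2]))

for name, peak, tdp in ranked:
    print(f"{name:24s} {watts_per_tflops(peak, tdp):.2f} W per TFLOP/s")
```

Because TDP is a design power envelope rather than measured draw, this metric is best treated as a first-order screen before a more careful energy analysis.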

Textbook Connection

These specifications are used throughout Volumes 1 and 2 of the textbook. The Hardware Acceleration chapter uses them for roofline construction, and the Compute Infrastructure chapter uses them for fleet sizing and TCO analysis.


Note: Missing a device?

You can define custom hardware specs on-the-fly in Python or contribute new vetted specs to the registry. See the Contributing Guide for how to add persistent specs, or the Hardware API Reference for defining custom objects.

Note: For full technical specs and validation details, see the API Reference.
