CLI Reference

Every command, every flag, with real examples.

MLSys·im ships an automation-friendly CLI built on Typer and Rich. It follows the 3-Tier Command Mapping: eval maps to Models, optimize maps to Optimizers, and zoo maps to the registries.

Output Formats

Commands support -o json for machine-parseable output; report-oriented commands also support -o markdown, and eval and optimize additionally accept -o html. The default is text (human-readable Rich tables). Use -o json in scripts and CI jobs. You can place the output flag before the command (mlsysim -o json eval ...) or on the command itself (mlsysim eval ... -o json).
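For scripted use, JSON mode pairs naturally with a small wrapper. The Python sketch below is one possible approach, not part of MLSys·im itself: `build_command` and `run_json` are hypothetical helper names, and the sketch assumes the JSON document is written to stdout.

```python
import json
import subprocess


def build_command(*args, output="json"):
    """Build an mlsysim argv with the output flag placed before the subcommand."""
    return ["mlsysim", "-o", output, *args]


def run_json(*args):
    """Run mlsysim in JSON mode and return (exit_code, parsed_payload).

    Assumption: the JSON document arrives on stdout. If stdout is not
    valid JSON (for example, an error message), the payload is None.
    """
    proc = subprocess.run(build_command(*args), capture_output=True, text=True)
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None
    return proc.returncode, payload
```

A call such as `run_json("eval", "Llama3_8B", "H100")` would then return the exit code alongside whatever structure the CLI emitted. The sketch deliberately does not assume any particular key names in that structure.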

Quick Examples

# What's in the Zoo?
mlsysim zoo hardware
mlsysim zoo models

# Single-node roofline: is Llama-3 8B memory-bound on H100?
mlsysim eval Llama3_8B H100

# Same thing, but with batch size 32 and fp8 precision
mlsysim eval Llama3_8B H100 --batch-size 32 --precision fp8

# Full cluster evaluation from a YAML spec
mlsysim eval cluster.yaml

# Machine-readable JSON for CI/CD pipelines
mlsysim eval Llama3_8B H100 -o json

# Export JSON Schema for IDE autocompletion
mlsysim schema --type hardware > hardware.schema.json

Exit Codes

The CLI uses semantic exit codes so scripts and CI pipelines can react programmatically:

Code	Meaning	Example
`0`	Success	Analysis completed, all assertions passed
`1`	Bad input	Unknown model name, malformed YAML, missing required flag
`2`	Physics violation	A hard `OOMError` raised during evaluation
`3`	SLA violation	A `constraints.assert` check in the YAML failed

mlsysim eval NoSuchModel H100
echo $?  # → 1 (unknown model name)

mlsysim eval cluster.yaml   # with a constraints.assert block that fails
echo $?  # → 3

Feasibility failures exit 0

A quick evaluation of an infeasible configuration (e.g. mlsysim eval Llama3_70B T4) renders the scorecard with Feasibility [FAIL] and exits 0 — the analysis itself succeeded. Exit code 2 is reserved for hard out-of-memory errors raised inside the engine. To gate CI on feasibility, use a YAML plan with a constraints.assert block (exit code 3 on violation) or parse the -o json output.
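To react to these codes from a driver script, the exit-code table can be mirrored in a small enum. This is a minimal sketch: `ExitCode` and `describe_exit` are hypothetical names introduced for illustration, and only the numeric values and their meanings come from the table above.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in the Exit Codes table."""
    SUCCESS = 0
    BAD_INPUT = 1          # unknown model name, malformed YAML, missing flag
    PHYSICS_VIOLATION = 2  # hard OOMError raised during evaluation
    SLA_VIOLATION = 3      # a constraints.assert check failed


def describe_exit(code):
    """Return a short human-readable label for an mlsysim exit code."""
    try:
        return ExitCode(code).name.lower().replace("_", " ")
    except ValueError:
        # Codes outside the documented table are reported, not guessed at.
        return "unexpected exit code {}".format(code)
```

In a CI step, the return code of `subprocess.run(["mlsysim", "eval", "cluster.yaml"])` could be passed to `describe_exit` for logging. Because a Feasibility [FAIL] still exits 0, this check alone does not gate on feasibility.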

Global Options

mlsysim [OPTIONS] COMMAND [ARGS]...

Flag	Description	Default
`-o, --output`	Output format: `text`, `json`, `markdown`; `html` is available for `eval` and `optimize`	`text`
`--install-completion`	Install shell completion (bash, zsh, fish)	—
`--show-completion`	Print completion script to stdout	—
`--help`	Show help and exit	—

`mlsysim zoo`

Explore the built-in registries (the MLSys Zoo).

mlsysim zoo [CATEGORY]

Arguments:

Argument	Description
`CATEGORY`	`hardware` or `models`; omit to list both registries

Examples:

# List all hardware in the Zoo with specs
mlsysim zoo hardware

# List all models with parameter counts and FLOPs
mlsysim zoo models

# JSON output for scripting
mlsysim zoo hardware -o json

`mlsysim eval`

Evaluate the analytical physics of an ML system. This is the primary command — it runs the roofline analysis and returns bottleneck, latency, throughput, and memory usage.

mlsysim eval [OPTIONS] TARGET [HARDWARE]

Arguments:

Argument	Description	Required
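`TARGET`	Model name (e.g., `Llama3_8B`), path to a custom model YAML, or path to a cluster plan such as `mlsys.yaml`	Yes
`HARDWARE`	Hardware name (e.g., `H100`) or path to a custom hardware YAML; required when TARGET is a model rather than a cluster plan	Conditional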

Options:

Flag	Description	Default
`-b, --batch-size`	Batch size	`1`
`-p, --precision`	Numerical precision: `fp32`, `fp16`, `fp8`, `int8`, `int4`	`fp16`
`-e, --efficiency`	Model FLOPs Utilization (0.0–1.0)	`0.5`
`-o, --output`	Output format: `text`, `json`, `markdown`, or `html`	`text`

Examples:

# Quick check: is ResNet-50 memory-bound on A100?
mlsysim eval ResNet50 A100

# LLM inference at batch 1 (typical serving scenario)
mlsysim eval Llama3_8B H100 --batch-size 1 --precision fp16

# Quantized inference
mlsysim eval Llama3_8B H100 --batch-size 32 --precision int8 --efficiency 0.35

# Full cluster evaluation with SLA assertions
mlsysim eval cluster.yaml

# JSON for CI/CD; exits with code 3 if any SLA assertion fails
mlsysim eval cluster.yaml -o json

YAML Cluster Evaluation

When TARGET is a YAML file, eval runs the full 3-lens scorecard (Feasibility, Performance, Macro) including distributed training, economics, and sustainability analysis.

version: "1.0"
name: "llama70b-cluster"
workload:
  name: "Llama3_70B"
  batch_size: 4096
hardware:
  name: "H100"
  accelerators: 64
ops:
  region: "Quebec"
  duration_days: 14.0
constraints:
  assert:
    - metric: "performance.step_latency"
      max: 50.0

Both version and name are required top-level fields. Assertion metrics are addressed as <lens>.<metric>; the available keys are performance.step_latency, performance.comm_overhead, performance.fleet_throughput, performance.mfu, performance.node_mfu, performance.scaling_efficiency, macro.tco_usd, macro.carbon_footprint, macro.energy_cost, and macro.capex.
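A mistyped metric name is easy to catch before invoking the CLI. The following sketch checks a plan's assertions against the keys listed above. It assumes the plan has already been loaded into a plain dict (for example with PyYAML's `yaml.safe_load`, which is not used here), and `unknown_metrics` is a hypothetical helper, not an MLSys·im function.

```python
# Assertion keys as listed in this section.
KNOWN_METRICS = {
    "performance.step_latency",
    "performance.comm_overhead",
    "performance.fleet_throughput",
    "performance.mfu",
    "performance.node_mfu",
    "performance.scaling_efficiency",
    "macro.tco_usd",
    "macro.carbon_footprint",
    "macro.energy_cost",
    "macro.capex",
}


def unknown_metrics(plan):
    """Return the assertion metric names in `plan` that are not documented keys."""
    checks = plan.get("constraints", {}).get("assert", [])
    names = [check.get("metric", "<missing>") for check in checks]
    return [name for name in names if name not in KNOWN_METRICS]


# The plan from the example above, written as a Python dict.
plan = {
    "version": "1.0",
    "name": "llama70b-cluster",
    "constraints": {
        "assert": [{"metric": "performance.step_latency", "max": 50.0}],
    },
}
print(unknown_metrics(plan))
```

An empty list means every assertion addresses a documented `<lens>.<metric>` key. How the CLI itself reports an unknown key is not covered by this sketch.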

`mlsysim serve`

Analyze LLM serving performance directly. Use this when you care about the two-phase serving lifecycle (prefill, then decode) rather than generic single-node roofline throughput.

mlsysim serve [OPTIONS] MODEL HARDWARE

Arguments:

Argument	Description	Required
`MODEL`	Transformer model name (e.g., `Llama3_8B`)	Yes
`HARDWARE`	Hardware name (e.g., `H100`)	Yes

Options:

Flag	Description	Default
`-s, --seq-len`	Sequence length / context window	`2048`
`-b, --batch-size`	Batch size	`1`
`-p, --precision`	Numerical precision: `fp32`, `fp16`, `fp8`, `int8`, `int4`	`fp16`
`-e, --efficiency`	Compute efficiency (0.0–1.0)	`0.5`
`--prefill-chunk-tokens`	Split prefill into chunks of at most this many tokens and estimate the maximum decode stall (a proxy metric)	none
`-o, --output`	Output format: `text`, `json`, or `markdown`	`text`

Examples:

# TTFT, ITL, KV-cache, and memory feasibility
mlsysim serve Llama3_8B H100 --seq-len 4096

# Sarathi-Serve-style chunked prefill proxy for long prompts
mlsysim serve Llama3_8B H100 --seq-len 8192 --prefill-chunk-tokens 512 -o json
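Serving behaviour is usually explored across several context lengths and batch sizes rather than at a single point. The sketch below generates one `mlsysim serve` invocation per combination, using only the flags documented in the table above. `serve_sweep` is a hypothetical helper name, and the commands are built but not executed.

```python
from itertools import product


def serve_sweep(model, hardware, seq_lens, batch_sizes):
    """Return one `mlsysim serve` argv per (seq_len, batch_size) pair."""
    return [
        [
            "mlsysim", "serve", model, hardware,
            "--seq-len", str(seq_len),
            "--batch-size", str(batch_size),
            "-o", "json",
        ]
        for seq_len, batch_size in product(seq_lens, batch_sizes)
    ]


commands = serve_sweep("Llama3_8B", "H100", [2048, 8192], [1, 8])
for argv in commands:
    print(" ".join(argv))
```

Each list can be handed to `subprocess.run` as is. Keeping the arguments as a list avoids shell quoting issues.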

`mlsysim schema`

Export JSON Schema for configuration files. Feed these to your IDE or validation tooling for autocompletion and static checks.

mlsysim schema [OPTIONS]

Options:

Flag	Description	Default
`-t, --type`	Schema type: `hardware`, `workload`, or `plan`	`plan`
`-o, --output`	Accepted values: `text` or `json`; schema output is always JSON	`text`

Examples:

# Get the hardware YAML schema for IDE autocompletion
mlsysim schema --type hardware > hardware.schema.json

# Get the workload schema
mlsysim schema --type workload > workload.schema.json

# Get the full cluster plan schema (for mlsys.yaml files)
mlsysim schema --type plan > plan.schema.json
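Once a schema has been exported, a quick look at its top level shows which fields a config file must provide. This sketch assumes the exported file is a standard JSON Schema object that uses the `required` and `properties` keywords at its top level; that layout is an assumption about the export, and `schema_summary` is a hypothetical helper.

```python
import json
from pathlib import Path


def schema_summary(path):
    """Summarise the top-level `required` and `properties` of a JSON Schema file.

    Assumption: the file is a JSON Schema object. Missing keywords are
    treated as empty rather than as errors.
    """
    schema = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        "required": sorted(schema.get("required", [])),
        "properties": sorted(schema.get("properties", {})),
    }
```

Calling `schema_summary("plan.schema.json")` on the file produced by the last command above would list the plan's top-level fields, if the export follows that layout.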

`mlsysim optimize`

Search the design space for optimal configurations. Each subcommand maps to an Optimizer in the 3-Tier architecture.

mlsysim optimize COMMAND [ARGS]...

`mlsysim optimize parallelism`

Find the optimal (TP, PP, DP) split to maximize Model FLOPs Utilization.

mlsysim optimize parallelism CONFIG_FILE

Argument	Description	Required
`CONFIG_FILE`	Path to `mlsys.yaml` with fleet definition	Yes

Example:

# Find the best parallelism strategy for the fleet defined in cluster.yaml (e.g., a 70B model on 256 H100s)
mlsysim optimize parallelism cluster.yaml

`mlsysim optimize batching`

Find the maximum safe batch size that satisfies a P99 latency SLA.

mlsysim optimize batching [OPTIONS] CONFIG_FILE

Flag	Description	Required
`--sla-ms`	P99 latency SLA in milliseconds	Yes
`--qps`	Arrival rate in queries per second	Yes

Example:

# Max batch size for 50ms P99 at 100 QPS
mlsysim optimize batching cluster.yaml --sla-ms 50 --qps 100

`mlsysim optimize placement`

Find the optimal datacenter region to minimize TCO and carbon footprint.

mlsysim optimize placement [OPTIONS] CONFIG_FILE

Flag	Description	Default
`--carbon-tax`	Carbon tax penalty in $/ton CO₂	`100.0`

Example:

# Find cheapest region with $150/ton carbon penalty
mlsysim optimize placement cluster.yaml --carbon-tax 150

`mlsysim audit`

Profile a workload against the Iron Law and report which wall binds.

mlsysim audit [OPTIONS]

Flag	Description	Default
`-w, --workload`	Workload name to audit against (e.g. `Llama3_8B`, `ResNet50`)	`Llama3_8B`

Bring Your Own YAML

Instead of using registry names, you can pass custom hardware or workload YAML files directly to eval:

# Custom chip spec against a Zoo model
mlsysim eval Llama3_8B ./my_custom_chip.yaml --batch-size 32

# Both custom
mlsysim eval ./my_model.yaml ./my_chip.yaml

See Getting Started — Defining Custom Models for the model definition format.