API Reference
Core API
Primary objects and resolvers.
| Name | Description |
| --- | --- |
| hardware | |
| models | |
| infrastructure | |
| systems | |
| platforms | Platform deployment envelopes. |
| datasets | Dataset zoo — canonical data corpus profiles. |
| literature | |
| ops | |
| core | |
| engine | |
| solvers | Canonical public solver import surface. |
| core.provenance.Provenance | How we know a numeric value (package audit trail; not BibTeX). |
| core.provenance.ProvenanceKind | |
| core.provenance.Sourced | Scalar with mandatory Provenance. Subclasses float so appendix |
| engine.calibration | Parameters for analytical solvers and the roofline engine. |
| fmt.fmt | Format a Pint Quantity (or plain number) for narrative text. |
| fmt.fmt_int | Format a value as an integer for narrative text. |
| physics | Canonical physics and accounting formulas for ML systems. |
| solvers.SingleNodeModel | Resolves single-node hardware Roofline bounds and feasibility. |
| solvers.NetworkRooflineModel | Analyzes distributed performance bounds (the Network Wall). |
| solvers.EfficiencyModel | Models the gap between peak and achieved FLOPS (Wall 3: Software Efficiency). |
| solvers.ForwardModel | Forward-evaluating mechanistic engine (Y = f(X)). |
| solvers.ServingModel | Analyzes the two-phase LLM serving lifecycle: prefill vs. decode. |
| solvers.TrainingMemoryModel | Decomposes per-accelerator training memory into teachable components. |
| solvers.ServingCapacityModel | Sizes an LLM serving deployment from a QPS and tail-latency target. |
| solvers.ContinuousBatchingModel | Analyzes production LLM serving with Continuous Batching and PagedAttention. |
| solvers.WeightStreamingModel | Analyzes Wafer-Scale inference (e.g., Cerebras CS-3) using Weight Streaming. |
| solvers.TailLatencyModel | Analyzes queueing delays and P99 tail latency for deployed inference models. |
| solvers.DataModel | Analyzes the ‘Data Wall’ — the throughput bottleneck between storage and compute. |
| solvers.TransformationModel | Quantifies the CPU preprocessing bottleneck (Wall 9: Transformation). |
| solvers.TopologyModel | Models bisection bandwidth for different network topologies (Wall 10). |
| solvers.ScalingModel | Analyzes the ‘Scaling Physics’ of model training (Chinchilla scaling laws). |
| solvers.InferenceScalingModel | Models inference-time compute scaling (Wall 12: Reasoning/CoT Cost). |
| solvers.CompressionModel | Analyzes model compression trade-offs (Accuracy vs. Efficiency). |
| solvers.DistributedModel | Resolves fleet-wide communication, synchronization, and pipelining constraints. |
| solvers.MoERoutingModel | Models first-order MoE routing imbalance and expert-parallel all-to-all cost. |
| solvers.ReliabilityModel | Calculates Mean Time Between Failures (MTBF) and optimal checkpointing intervals. |
| solvers.OrchestrationModel | Analyzes Cluster Orchestration and Queueing (Little’s Law). |
| solvers.EconomicsModel | Calculates Total Cost of Ownership (TCO), including CapEx and OpEx. |
| solvers.SustainabilityModel | Calculates datacenter-scale sustainability metrics. |
| solvers.CheckpointModel | Analyzes the storage constraints and I/O burst penalties of saving model states. |
| solvers.ResponsibleEngineeringModel | Models the computational cost of responsible AI practices (Wall 20: Safety). |
| solvers.SensitivitySolver | Identifies the binding constraint via numerical sensitivity analysis (Wall 21). |
| solvers.SynthesisSolver | Given an SLA, synthesizes the required hardware specs (Wall 22: Inverse Solve). |
| solvers.ParallelismOptimizer | Searches for the optimal 3D/4D parallelism split (DP, TP, PP, EP). |
| solvers.BatchingOptimizer | Finds the maximum batch size that satisfies a P99 latency SLA. |
| solvers.PlacementOptimizer | Finds the optimal datacenter location to minimize TCO and Carbon. |
| engine.dse.DSE | Declarative Design Space Exploration (DSE) Engine. |
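
Several entries above are described in terms of Roofline bounds. As a rough, standalone illustration of that idea, the sketch below computes a first-order Roofline lower bound on execution time and reports which resource binds. It does not use the package's solvers, whose signatures are not shown on this page; the function name `roofline_time_s`, its parameters, and all numbers are hypothetical.

```python
def roofline_time_s(flops, bytes_moved, peak_flops_per_s, mem_bw_bytes_per_s):
    """Return (lower-bound time in seconds, binding resource).

    First-order Roofline: a kernel can finish no faster than the slower of
    its compute time and its memory-traffic time.
    """
    compute_s = flops / peak_flops_per_s          # time if limited only by math throughput
    memory_s = bytes_moved / mem_bw_bytes_per_s   # time if limited only by memory bandwidth
    if compute_s >= memory_s:
        return compute_s, "compute"
    return memory_s, "memory"


# Hypothetical workload and accelerator numbers, for illustration only.
t, bound = roofline_time_s(
    flops=2e12,               # 2 TFLOP of work
    bytes_moved=4e9,          # 4 GB of memory traffic
    peak_flops_per_s=1e14,    # 100 TFLOP/s peak
    mem_bw_bytes_per_s=1e12,  # 1 TB/s memory bandwidth
)
print(f"{t:.3f} s, {bound}-bound")
```

With these assumed numbers the compute term (0.02 s) exceeds the memory term (0.004 s), so the workload is compute-bound. A real solver would likely also account for achieved efficiency, precision, and overlap, none of which this sketch models.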