Capacity Planning
Predict VRAM and throughput for any model, correctly handling MoE, MLA, GQA, and hybrid architectures, with a live per-layer breakdown.
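To make the VRAM-prediction claim concrete, here is a back-of-envelope sketch of the two dominant terms, quantized weights plus KV cache. The function names and the simple two-term model are illustrative assumptions for this README, not hwLedger's actual per-layer API.

```rust
/// Hypothetical back-of-envelope planner; hwLedger's real model is
/// per-layer and architecture-aware, this only shows the two big terms.
fn weight_bytes(params: u64, bytes_per_param: f64) -> u64 {
    (params as f64 * bytes_per_param) as u64
}

/// K and V tensors: 2 * kv_heads * head_dim elements per token per layer.
fn kv_cache_bytes(layers: u64, kv_heads: u64, head_dim: u64, seq_len: u64, bytes_per_elem: u64) -> u64 {
    2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem
}

fn main() {
    // Llama-2-70B: 4-bit weights (~0.5 bytes/param), fp16 KV cache,
    // 80 layers, GQA with 8 KV heads of dim 128, 4096-token context.
    let weights = weight_bytes(70_000_000_000, 0.5);
    let kv = kv_cache_bytes(80, 8, 128, 4096, 2);
    let gib = (weights + kv) as f64 / (1u64 << 30) as f64;
    println!("~{:.1} GiB", gib);
}
```

Note how small the GQA KV cache is relative to weights here; a naive calculator that assumes full MHA would inflate it 8x.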
Desktop inference runtime with enterprise-grade fleet management
Every public VRAM calculator we tested (HF Accelerate, can-it-run-llm, LM Studio) gets MoE and MLA wrong: they undercount the KV cache and overcount MoE throughput. hwLedger's math core is architecture-keyed: it dispatches per `AttentionKind` and accounts for resident and active parameters separately for MoE.
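The architecture-keyed idea can be sketched as an enum dispatch. `AttentionKind` is named in the text above, but the variants, fields, and formulas below are hypothetical illustrations, not hwLedger's actual types.

```rust
/// Hypothetical sketch; hwLedger's real AttentionKind may differ.
enum AttentionKind {
    Mha { heads: u64 },
    Gqa { kv_heads: u64 },
    // MLA caches a compressed per-token latent instead of full K/V
    // (any decoupled RoPE key dims are omitted here for brevity).
    Mla { kv_latent_dim: u64 },
}

/// Per-token, per-layer KV-cache elements for each attention kind.
fn kv_elems_per_token(kind: &AttentionKind, head_dim: u64) -> u64 {
    match kind {
        AttentionKind::Mha { heads } => 2 * heads * head_dim,
        AttentionKind::Gqa { kv_heads } => 2 * kv_heads * head_dim,
        AttentionKind::Mla { kv_latent_dim } => *kv_latent_dim,
    }
}

/// MoE: all experts are resident in VRAM, but each token only runs
/// through the active experts, so throughput must be estimated from
/// active parameters, not the resident total.
struct MoeParams { shared: u64, per_expert: u64, n_experts: u64, active_experts: u64 }

impl MoeParams {
    fn resident(&self) -> u64 { self.shared + self.per_expert * self.n_experts }
    fn active(&self) -> u64 { self.shared + self.per_expert * self.active_experts }
}

fn main() {
    let gqa = AttentionKind::Gqa { kv_heads: 8 };
    println!("GQA KV elems/token/layer: {}", kv_elems_per_token(&gqa, 128));
}
```

Conflating resident and active parameters is exactly the failure mode described above: sizing the KV cache from `resident()`-style totals undercounts nothing, but scaling throughput by them overcounts badly.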
The result: a hobbyist-sized fleet with enterprise bones.
```sh
git clone https://github.com/KooshaPari/hwLedger.git
cd hwLedger
cargo build --release
cargo run --bin hwledger-cli -- plan --model llama-2-70b
```

| Phase | Status |
|---|---|
| P0 Foundation | in progress |
| P1 Math core | planned |
| P2 Ingestion + probe | planned |
| P3 macOS GUI MVP | planned |
| P4 Inference | planned (macOS only in MVP) |
| P5 Fleet | planned |
| P6 Windows GUI | deferred |
| P7 Linux GUI | deferred |
Tracked in AgilePlus: feature `hwledger-v1-macos-mvp` (run `agileplus status`).
Apache-2.0. See LICENSE.