# Research Briefs

Archived research briefs, produced by Haiku sub-agents, that back hwLedger's architecture decisions. Each brief answers one focused question with cited sources; the raw sub-agent transcripts live under /private/tmp/claude-501/.../tasks/ and are summarised here.

| # | Topic | ADR backlink | Date |
|---|-------|--------------|------|
| 01 | oMlx — what it is, fork viability, competitive landscape | ADR-0002 | 2026-04-18 |
| 02 | Rust ↔ MLX subprocess IPC patterns | ADR-0002 | 2026-04-18 |
| 03 | Inference engine landscape (mistral.rs / candle / llama.cpp / vLLM / TGI / SGLang / ExLlamaV2 / MLX / TensorRT-LLM / Ollama) | PLAN §7 | 2026-04-18 |
| 04 | KV / state formulas per architecture (MHA, GQA, MQA, MLA, hybrid, sliding, SSM, sinks, quantisation) | PLAN §5.1, ADR-0004 (pending) | 2026-04-18 |
| 05 | Model config ingestion (HF Hub, GGUF, safetensors, MLX, vLLM CLI, Ollama, LM Studio) | FR-PLAN-001 | 2026-04-18 |
| 06 | Rust GPU telemetry (nvml-wrapper, rocm-smi, macmon, Level-Zero) | FR-TEL-001 | 2026-04-18 |
| 07 | Rust ↔ Swift FFI (UniFFI + cargo-xcframework + Swift Package) | ADR-0001 | 2026-04-18 |
| 08 | Rust ↔ WinUI FFI (csbindgen + C# .NET 9 + WinUI 3 + Velopack) | ADR-0001 | 2026-04-18 |
| 09 | Rust ↔ Qt 6 FFI (cxx-qt + QML, Slint escape hatch) | ADR-0001 | 2026-04-18 |
| 10 | Fleet wire — Axum + mTLS vs. gRPC; russh; Tailscale; rentals | ADR-0003 | 2026-04-18 |
| 11 | Competing capacity planners (HF Accelerate, can-it-run-llm, LM Studio, vLLM internals) | PLAN §3 #11 | 2026-04-18 |

## How to re-run

All briefs were produced by Haiku agents in a single parallel swarm. See PLAN.md §3 for one-line summaries. To re-run a brief, spawn a Haiku-model general-purpose agent with the original prompt (reconstructable from the headings in each brief file).
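The "reconstructable from the headings" step can be sketched as follows. This is a hedged illustration, not tooling that exists in this repo: the function name, the assumption that each brief uses `#`-style markdown headings, and the prompt wording are all hypothetical.

```python
from pathlib import Path

def reconstruct_prompt(brief_path: str) -> str:
    """Rebuild an approximate research prompt from a brief's markdown headings.

    Hypothetical helper: assumes the brief file uses `#`-style markdown
    headings for its sections, which is not guaranteed by this repo.
    """
    headings = [
        line.lstrip("#").strip()
        for line in Path(brief_path).read_text().splitlines()
        if line.startswith("#")
    ]
    # Join the section headings into a single instruction for the Haiku agent.
    return "Research the following, one section per heading: " + "; ".join(headings)
```

The resulting string would then be handed to a Haiku-model general-purpose agent as its task prompt.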

## Notes

- Brief 11 was also dumped as RESEARCH_VRAM_PLANNERS.md in the repo root by the agent; that file will be moved here in the next housekeeping pass.
- The chatdocs.md and chatdocs2.md files are the original product conversation that motivated this research. chatdocs2.md is a byte-for-byte duplicate of chatdocs.md and will be removed.

Released under the Apache 2.0 License.