
Research Provenance — Source Verification Log

This page tracks the verification date and status of every external source cited across the hwLedger research briefs. All sources were last verified on April 19, 2026.

2026 Models & Architecture (April 2026)

| Source | Type | Status | Last Verified | Notes |
|---|---|---|---|---|
| Meta Llama 4 Multimodal Intelligence | Blog | ✅ Active | 2026-04-19 | Llama 4 Maverick: 17B active, 400B total, iRoPE, sparse MoE routing |
| Gemma 3 Technical Report | arXiv | ✅ Active | 2026-04-19 | 5:1 interleaved local (1024-token window) + global attention, 128K context |
| Mamba-3: Improved Sequence Modeling | arXiv | ✅ Active | 2026-04-19 | MIMO variant: state_size=64 (2× reduction vs Mamba-2), Mar 2026 |
| DeepSeek-V3 Model Documentation | HF Docs | ✅ Active | 2026-04-19 | MLA parameters: kv_lora_rank=512, qk_rope_head_dim=64 |
| Qwen 3.6 GitHub Repository | GitHub | ✅ Active | 2026-04-19 | Hybrid GDN + sparse MoE, 256 experts, 1M context |
| Jamba-1.5: Hybrid Transformer-Mamba at Scale | arXiv | ✅ Active | 2026-04-19 | 94B active (Large), 12B active (Mini), 256K context, hybrid layers |

Inference Engines (April 2026)

| Source | Type | Status | Last Verified | Version / Notes |
|---|---|---|---|---|
| vLLM v0.19.0 Release Notes | GitHub | ✅ Active | 2026-04-19 | v0.19.0 (Apr 2026): PagedAttention v2, MLA support, Hugging Face integration |
| oMLX Repository | GitHub | ✅ Active | 2026-04-19 | v0.3.6 (Apr 2026): SSD-paged KV cache, MLX wrapper, MLA support |
| Apple MLX Framework | GitHub | ✅ Active | 2026-04-19 | Actively maintained; MLA kernels, unified memory optimizations |
| mistral.rs Releases | GitHub | ✅ Active | 2026-04-19 | Native MLA support, CUDA + Metal, MoE-aware routing |
| llama.cpp | GitHub | ✅ Active | 2026-04-19 | Universal GGUF format, partial MLA support, ROCm/CUDA/Metal backends |

Attention Research Papers

| Source | Type | Status | Last Verified | Year | Notes |
|---|---|---|---|---|---|
| Llama 2: Open Foundation and Fine-Tuned Chat Models | arXiv | ✅ Active | 2026-04-19 | 2023 | Foundation for GQA adoption |
| Mistral 7B | arXiv | ✅ Active | 2026-04-19 | 2023 | Sliding window attention pioneer |
| Mixtral of Experts | arXiv | ✅ Active | 2026-04-19 | 2024 | Sparse MoE standard |
| GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints | arXiv | ✅ Active | 2026-04-19 | 2023 | Grouped-query attention formalization |
| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | arXiv | ✅ Active | 2026-04-19 | 2023 | SSM foundation |
| Efficient Streaming Language Models with Attention Sinks | arXiv | ✅ Active | 2026-04-19 | 2023 | Attention sink mechanism |

FFI & Platform Tools (April 2026)

| Source | Type | Status | Last Verified | Version |
|---|---|---|---|---|
| UniFFI v0.31.0 | GitHub | ✅ Active | 2026-04-19 | v0.31.0 (Jan 2026) |
| CXX-Qt v0.7 | GitHub | ✅ Active | 2026-04-19 | v0.7 (KDAB, 2026) |
| Slint 1.15.1 | GitHub | ✅ Active | 2026-04-19 | v1.15.1 (Apr 2026) |

Competitive Tools & References

| Source | Type | Status | Last Verified | Notes |
|---|---|---|---|---|
| HF Model Memory Usage Calculator | Web | ✅ Active | 2026-04-19 | Streamlit-based; no MLA/Mamba support |
| can-it-run-llm Space | Web | ✅ Active | 2026-04-19 | Community-maintained; GPU selector |
| LM Studio | Desktop | ✅ Active | 2026-04-19 | Cross-platform GUI; GGUF models |
| Claude Opus 4.7 Announcement | Blog | ✅ Active | 2026-04-19 | Released Apr 2026; 2576px (3.75MP) image support, extended thinking |

Verification Notes

  • All links verified as resolvable via HTTP HEAD requests on 2026-04-19 (a minimal check sketch follows this list).
  • arXiv papers confirmed via their abstract pages.
  • GitHub repositories confirmed as actively maintained (commits within the last 30 days).
  • Blog posts confirmed to come from official vendor sources (Meta, Anthropic, Google, etc.).
  • Withdrawn or archived papers: None detected in citation set.
  • Dead links: None. All sources remain accessible as of 2026-04-19.
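
The verification script itself is not part of this log. As a rough illustration of the checks described above, the sketch below resolves each URL with an HTTP HEAD request and, for GitHub repositories, queries the public GitHub API for a commit within the last 30 days. The `SOURCES` sample, function names, and thresholds are illustrative assumptions, not hwLedger tooling.

```python
"""Illustrative source-verification sketch (not the actual hwLedger script)."""
from datetime import datetime, timedelta, timezone

import requests  # any HTTP client works; requests is assumed here for brevity

# Hypothetical sample of cited sources; the real list lives in the briefs.
SOURCES = [
    "https://arxiv.org/abs/2310.06825",       # Mistral 7B
    "https://github.com/vllm-project/vllm",   # vLLM
]

ACTIVITY_WINDOW = timedelta(days=30)


def head_ok(url: str) -> bool:
    """Return True if the URL resolves to a non-error status via HTTP HEAD."""
    resp = requests.head(url, allow_redirects=True, timeout=10)
    return resp.status_code < 400


def recently_committed(repo_url: str) -> bool:
    """Return True if the GitHub repo has a commit inside ACTIVITY_WINDOW."""
    owner_repo = repo_url.split("github.com/", 1)[1].rstrip("/")
    api = f"https://api.github.com/repos/{owner_repo}/commits?per_page=1"
    latest = requests.get(api, timeout=10).json()[0]
    committed = datetime.fromisoformat(
        latest["commit"]["committer"]["date"].replace("Z", "+00:00")
    )
    return datetime.now(timezone.utc) - committed < ACTIVITY_WINDOW


if __name__ == "__main__":
    for url in SOURCES:
        status = "Active" if head_ok(url) else "DEAD"
        if status == "Active" and "github.com/" in url and not recently_committed(url):
            status = "Active (stale: no commits in 30 days)"
        print(f"{datetime.now(timezone.utc):%Y-%m-%d}  {status:<40} {url}")
```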

Next Refresh Cycle

Expected: June 15, 2026 (8-week interval)

Tasks:

  1. Re-verify all URLs for accessibility.
  2. Check for new model releases (Llama 5, DeepSeek-V4, Gemma 4, Qwen 4.x).
  3. Update vLLM/mistral.rs/llama.cpp version pins (see the release-check sketch after this list).
  4. Audit new arXiv papers on attention mechanisms, quantization, and MoE scaling.
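
Task 3 can be partially automated. The sketch below compares each pinned version against the latest GitHub release tag via the public releases API. The repository slugs and placeholder pins are assumptions for illustration (llama.cpp publishes rolling build tags rather than semantic versions), not the actual hwLedger configuration; only the vLLM pin comes from this log.

```python
"""Illustrative release-pin check (assumed repo slugs and placeholder pins)."""
import requests

# Pinned versions as of 2026-04-19, keyed by assumed GitHub repository slug.
PINS = {
    "vllm-project/vllm": "v0.19.0",        # pin taken from this log
    "EricLBuehler/mistral.rs": "v0.x.y",   # placeholder; see the briefs for the real pin
    "ggml-org/llama.cpp": "b-xxxx",        # placeholder; llama.cpp uses rolling build tags
}

for repo, pinned in PINS.items():
    api = f"https://api.github.com/repos/{repo}/releases/latest"
    latest = requests.get(api, timeout=10).json().get("tag_name", "unknown")
    flag = "OK" if latest == pinned else "UPDATE"
    print(f"{flag:<7} {repo}: pinned {pinned}, latest {latest}")
```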

Generated: 2026-04-19 UTC
Last Verified: 2026-04-19 UTC
Brief Coverage: All 12 research briefs + KV-cache math page

Released under the Apache 2.0 License.