Bit-Axon¶
Minimal Bits, Maximal Impulse
A 3.2B-parameter hybrid small language model engine built entirely for Apple Silicon. No GPU. No cloud. Full training, inference, and deployment on a fanless MacBook Air M4.
Why Bit-Axon?¶
Most LLMs assume you have a data center. Bit-Axon assumes you have a MacBook.
Built with Python + Apple MLX (not PyTorch), Bit-Axon runs a 24-layer hybrid architecture with Q4 quantization in roughly 1.76GB of weights. It fits in 16GB of RAM, trains without thermal throttling, and deploys as a native macOS app.
Zero infrastructure
No CUDA drivers. No cloud billing. No rented GPUs. A single Apple Silicon machine handles the entire lifecycle from data prep to inference.
Features¶
Hybrid Sandwich Design
24-layer architecture stacking Axon-SSM, Sliding Window Attention, and Mixture-of-Experts in a single forward pass. Each layer type handles what it does best.
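The exact layer ordering is not documented here, but the "sandwich" idea can be sketched as a repeating schedule of the three layer types. This is an illustrative assumption, not Bit-Axon's actual ordering:

```python
# Hypothetical sketch of a 24-layer sandwich schedule. The repeating
# pattern (Axon-SSM -> sliding-window attention -> MoE) is an
# assumption; the real Bit-Axon ordering may differ.

def sandwich_schedule(n_layers: int = 24) -> list[str]:
    """Interleave the three layer types in a fixed repeating pattern."""
    pattern = ["axon_ssm", "swa", "moe"]
    return [pattern[i % len(pattern)] for i in range(n_layers)]

layers = sandwich_schedule()
# With 24 layers and a 3-layer pattern, each type appears 8 times.
```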
Q4 Quantization
~1.76GB of weights in 4-bit precision. Runs comfortably on 16GB RAM with room for context windows and KV caches.
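The size claim is easy to sanity-check with back-of-the-envelope arithmetic. The group size and scale/bias precision below are assumptions (MLX-style group quantization), so the result is a ballpark, not the exact on-disk figure:

```python
# Rough size estimate for 3.2B parameters in 4-bit precision.
# Assumptions: group size 64 with 16-bit scale and bias per group
# (MLX-style); the real figure also depends on which layers, e.g.
# embeddings, are kept in higher precision.

PARAMS = 3.2e9
GROUP = 64

bits_per_weight = 4 + (16 + 16) / GROUP   # payload + per-group overhead
size_gb = PARAMS * bits_per_weight / 8 / 1e9

print(f"{bits_per_weight:.2f} bits/weight -> {size_gb:.2f} GB")
# Roughly 1.8 GB, in the same ballpark as the reported ~1.76GB.
```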
Powermetrics-Guided Training
Reads macOS powermetrics in real time. Adapts batch sizes and learning rates to stay within thermal envelopes without manual intervention.
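The control loop behind this can be sketched in a few lines. Everything here is a stand-in: the real engine parses `powermetrics` output, and the temperature thresholds and batch bounds below are invented for illustration:

```python
# Minimal sketch of thermal-envelope-guided batch sizing. The
# thresholds (70C/85C) and batch bounds are assumptions; the real
# engine reads live sensor data from macOS `powermetrics`.

def adjust_batch_size(batch: int, temp_c: float,
                      lo: float = 70.0, hi: float = 85.0,
                      min_batch: int = 1, max_batch: int = 64) -> int:
    """Shrink the batch when hot, grow it back when there is headroom."""
    if temp_c >= hi:
        return max(min_batch, batch // 2)   # back off before throttling
    if temp_c <= lo:
        return min(max_batch, batch * 2)    # reclaim thermal headroom
    return batch                            # inside the envelope: hold

batch = adjust_batch_size(32, temp_c=88.0)  # hot run: halves to 16
```

The same pattern applies to learning-rate scaling: read a sensor, compare against an envelope, nudge the knob, repeat each step.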
10 CLI Commands
Train, evaluate, quantize, export, chat, and more. Every stage of the model lifecycle has a first-class command.
SwiftUI macOS Application
Drop-in desktop app for inference. No terminal required. Model selection, prompt history, and generation parameters, all in a native interface.
Fully Open Source
MIT licensed. PyPI package for pip installs. HuggingFace model hub for weights and datasets. GitHub for everything else.
Quick Start¶
Install the package and run your first generation:
```shell
pip install bit-axon
bit-axon run --model skyoo2003/bit-axon --prompt "Explain quantum entanglement in one sentence."
```
Or fire up an interactive chat session:
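A plausible invocation, assuming the `chat` subcommand listed under the CLI features takes the same `--model` flag as `run`:

```shell
bit-axon chat --model skyoo2003/bit-axon
```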
Model weights
On first run, weights download automatically from HuggingFace. After that, everything runs locally with no network required.
Tech Stack¶
| Component | Choice |
|---|---|
| Language | Python 3.11+ |
| ML Framework | Apple MLX |
| Architecture | Axon-SSM + SWA + MoE |
| Parameters | 3.2B |
| Quantization | Q4 (~1.76GB) |
| Desktop App | SwiftUI (macOS) |
| License | MIT |
Documentation¶
- Getting Started
- CLI Reference
- Architecture
- API Reference
- macOS App
- Guides
- Papers
- FAQ
- Contributing