Bit-Axon

Minimal Bits, Maximal Impulse

A 3.2B-parameter hybrid small language model engine built entirely for Apple Silicon. No discrete GPU. No cloud. Full training, inference, and deployment on a fanless MacBook Air M4.


Install

pip install bit-axon



Why Bit-Axon?

Most LLMs assume you have a data center. Bit-Axon assumes you have a MacBook.

Built with Python + Apple MLX (not PyTorch), Bit-Axon runs a 24-layer hybrid architecture with Q4 quantization in roughly 1.76GB of weights. It fits in 16GB of RAM, trains without thermal throttling, and deploys as a native macOS app.

Zero infrastructure

No CUDA drivers. No cloud billing. No rented GPUs. A single Apple Silicon machine handles the entire lifecycle from data prep to inference.


Features

  Hybrid Sandwich Design

24-layer architecture stacking Axon-SSM, Sliding Window Attention, and Mixture-of-Experts in a single forward pass. Each layer type handles what it does best.
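The exact interleave is not spelled out here, but a "sandwich" layout can be sketched as a repeating schedule over the 24 layers. The four-layer pattern below is an illustrative assumption, not Bit-Axon's published ordering:

```python
def layer_schedule(n_layers=24,
                   pattern=("axon_ssm", "swa", "axon_ssm", "moe")):
    """Assign a block type to each layer by cycling a fixed pattern.

    The repeating pattern is a hypothetical example of a sandwich
    interleave; the real Bit-Axon ordering may differ.
    """
    return [pattern[i % len(pattern)] for i in range(n_layers)]

schedule = layer_schedule()
# 24 entries; in this sketch, MoE appears once per 4-layer block
```

With a schedule like this, the cheap SSM blocks carry most of the sequence mixing, attention appears often enough for local precision, and the expensive MoE blocks stay sparse.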

  Q4 Quantization

~1.76GB of weights in 4-bit precision. Runs comfortably on 16GB RAM with room for context windows and KV caches.
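The footprint follows from simple arithmetic. Assuming an MLX-style group quantization scheme (group size 64 with one fp16 scale and one fp16 bias per group; the exact scheme is an assumption), 4-bit storage costs about 4.5 bits per parameter:

```python
def q4_weight_size_gb(n_params, bits=4, group_size=64):
    # Each group of `group_size` weights stores its 4-bit values plus
    # one fp16 scale and one fp16 bias (16 + 16 = 32 extra bits).
    bits_per_param = bits + 32 / group_size   # 4.5 bits with defaults
    return n_params * bits_per_param / 8 / 1e9  # bits -> decimal GB

print(round(q4_weight_size_gb(3.2e9), 2))  # -> 1.8
```

That lands near the quoted ~1.76GB; the exact figure depends on the true parameter count and on which layers (e.g. norms, embeddings) are left unquantized.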

  Powermetrics-Guided Training

Reads macOS powermetrics in real time. Adapts batch sizes and learning rates to stay within thermal envelopes without manual intervention.
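The adaptation step can be sketched as a pure function over the thermal pressure level that macOS reports. Parsing the actual powermetrics output is omitted, and the scaling factors below are illustrative assumptions, not Bit-Axon's tuning:

```python
# Hypothetical scaling factors per thermal pressure level.
PRESSURE_SCALE = {"nominal": 1.25, "moderate": 0.75, "heavy": 0.5}

def adapt_batch_size(batch_size, pressure_level, max_batch=64):
    """Scale the training batch size based on thermal pressure.

    `pressure_level` is the kind of value a thermal monitor reports
    (e.g. "Nominal"); unknown levels leave the batch size unchanged.
    """
    factor = PRESSURE_SCALE.get(pressure_level.lower(), 1.0)
    return max(1, min(max_batch, int(batch_size * factor)))
```

Calling this once per logged sample gives a simple feedback loop: the batch grows while the machine runs cool and shrinks before throttling kicks in.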

  10 CLI Commands

Train, evaluate, quantize, export, chat, and more. Every stage of the model lifecycle has a first-class command.

  SwiftUI macOS Application

Drop-in desktop app for inference. No terminal required. Model selection, prompt history, and generation parameters in a native interface.

  Fully Open Source

MIT licensed. PyPI package for pip installs. HuggingFace model hub for weights and datasets. GitHub for everything else.


Quick Start

Install the package and run your first generation:

Terminal
pip install bit-axon
bit-axon run --model skyoo2003/bit-axon --prompt "Explain quantum entanglement in one sentence."

Or fire up an interactive chat session:

Terminal
bit-axon chat --model skyoo2003/bit-axon

Model weights

On first run, weights download automatically from HuggingFace. After that, everything runs locally with no network required.
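A cache-first loader of the kind described can be sketched like this. The cache layout and the `download_fn` hook are hypothetical; the real package presumably delegates to the Hugging Face hub client:

```python
from pathlib import Path

def resolve_weights(repo_id, cache_dir, download_fn):
    """Return a local weights path, downloading only on first use.

    `download_fn(repo_id, dest)` is a hypothetical hook standing in
    for a Hugging Face download; after the first call, everything is
    served from `cache_dir` with no network access.
    """
    dest = Path(cache_dir) / repo_id.replace("/", "--")
    if not dest.exists():
        download_fn(repo_id, dest)
    return dest
```

The second call finds the cached directory and returns immediately, which is what makes fully offline operation possible after the initial fetch.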


Tech Stack

Component      Choice
-------------  --------------------
Language       Python 3.11+
ML Framework   Apple MLX
Architecture   Axon-SSM + SWA + MoE
Parameters     3.2B
Quantization   Q4 (~1.76GB)
Desktop App    SwiftUI (macOS)
License        MIT


Documentation

Getting Started · CLI Reference · Architecture · API Reference · macOS App · Guides · Papers · FAQ · Contributing