Bit-Axon

Minimal Bits, Maximal Impulse

A 3.2B-parameter hybrid small language model engine built entirely for Apple Silicon. No discrete GPU. No cloud. Full training, inference, and deployment on a fanless MacBook Air M4.


Install

pip install bit-axon



Why Bit-Axon?

Most LLMs assume you have a data center. Bit-Axon assumes you have a MacBook.

Built with Python + Apple MLX (not PyTorch), Bit-Axon runs a 24-layer hybrid architecture with Q4 quantization in roughly 1.76GB of weights. It fits in 16GB of RAM, trains without thermal throttling, and deploys as a native macOS app.

Zero infrastructure

No CUDA drivers. No cloud billing. No rented GPUs. A single Apple Silicon machine handles the entire lifecycle from data prep to inference.


Features

  Hybrid Sandwich Design

24-layer architecture stacking Axon-SSM, Sliding Window Attention, and Mixture-of-Experts in a single forward pass. Each layer type handles what it does best.
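The exact interleave is not spelled out here, but a "sandwich" layout can be sketched as a repeating schedule over the 24 layers. The four-layer pattern below is an illustrative assumption, not Bit-Axon's published ordering:

```python
def layer_schedule(n_layers=24,
                   pattern=("axon_ssm", "swa", "axon_ssm", "moe")):
    """Assign a block type to each layer by cycling a fixed pattern.

    The repeating pattern is a hypothetical example of a sandwich
    interleave; the real Bit-Axon ordering may differ.
    """
    return [pattern[i % len(pattern)] for i in range(n_layers)]

schedule = layer_schedule()
# 24 entries; in this sketch, MoE appears once per 4-layer block
```

With a schedule like this, the cheap SSM blocks carry most of the sequence mixing, attention appears often enough for local precision, and the expensive MoE blocks stay sparse.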

  Q4 Quantization

~1.76GB of weights in 4-bit precision. Runs comfortably on 16GB RAM with room for context windows and KV caches.
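The footprint follows from simple arithmetic. Assuming an MLX-style group quantization scheme (group size 64 with one fp16 scale and one fp16 bias per group; the exact scheme is an assumption), 4-bit storage costs about 4.5 bits per parameter:

```python
def q4_weight_size_gb(n_params, bits=4, group_size=64):
    # Each group of `group_size` weights stores its 4-bit values plus
    # one fp16 scale and one fp16 bias (16 + 16 = 32 extra bits).
    bits_per_param = bits + 32 / group_size   # 4.5 bits with defaults
    return n_params * bits_per_param / 8 / 1e9  # bits -> decimal GB

print(round(q4_weight_size_gb(3.2e9), 2))  # -> 1.8
```

That lands near the quoted ~1.76GB; the exact figure depends on the true parameter count and on which layers (e.g. norms, embeddings) are left unquantized.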

  Powermetrics-Guided Training

Reads macOS powermetrics in real time. Adapts batch sizes and learning rates to stay within thermal envelopes without manual intervention.
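The adaptation step can be sketched as a pure function over the thermal pressure level that macOS reports. Parsing the actual powermetrics output is omitted, and the scaling factors below are illustrative assumptions, not Bit-Axon's tuning:

```python
# Hypothetical scaling factors per thermal pressure level.
PRESSURE_SCALE = {"nominal": 1.25, "moderate": 0.75, "heavy": 0.5}

def adapt_batch_size(batch_size, pressure_level, max_batch=64):
    """Scale the training batch size based on thermal pressure.

    `pressure_level` is the kind of value a thermal monitor reports
    (e.g. "Nominal"); unknown levels leave the batch size unchanged.
    """
    factor = PRESSURE_SCALE.get(pressure_level.lower(), 1.0)
    return max(1, min(max_batch, int(batch_size * factor)))
```

Calling this once per logged sample gives a simple feedback loop: the batch grows while the machine runs cool and shrinks before throttling kicks in.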

  10 CLI Commands

Train, evaluate, quantize, export, chat, and more. Every stage of the model lifecycle has a first-class command.

  SwiftUI macOS Application

Drop-in desktop app for inference. No terminal required. Model selection, prompt history, and generation parameters in a native interface.

  Fully Open Source

MIT licensed. PyPI package for pip installs. HuggingFace model hub for weights and datasets. GitHub for everything else.


Quick Start

Install the package and run your first generation:

Terminal
pip install bit-axon
bit-axon run --model skyoo2003/bit-axon --prompt "Explain quantum entanglement in one sentence."

Or fire up an interactive chat session:

Terminal
bit-axon chat --model skyoo2003/bit-axon

Model weights

On first run, weights download automatically from HuggingFace. After that, everything runs locally with no network required.
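A cache-first loader of the kind described can be sketched like this. The cache layout and the `download_fn` hook are hypothetical; the real package presumably delegates to the Hugging Face hub client:

```python
from pathlib import Path

def resolve_weights(repo_id, cache_dir, download_fn):
    """Return a local weights path, downloading only on first use.

    `download_fn(repo_id, dest)` is a hypothetical hook standing in
    for a Hugging Face download; after the first call, everything is
    served from `cache_dir` with no network access.
    """
    dest = Path(cache_dir) / repo_id.replace("/", "--")
    if not dest.exists():
        download_fn(repo_id, dest)
    return dest
```

The second call finds the cached directory and returns immediately, which is what makes fully offline operation possible after the initial fetch.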


Tech Stack

Component      Choice
-------------  --------------------
Language       Python 3.11+
ML Framework   Apple MLX
Architecture   Axon-SSM + SWA + MoE
Parameters     3.2B
Quantization   Q4 (~1.76GB)
Desktop App    SwiftUI (macOS)
License        MIT


Documentation

Getting Started · CLI Reference · Architecture · API Reference · macOS App · Guides · Papers · FAQ · Contributing