Contributing Guide
Thanks for your interest in contributing to Bit-Axon. This is an open-source small language model engine built for Apple Silicon, and we welcome patches, bug fixes, and new features from the community.
Before participating, please read our Code of Conduct.
Prerequisites
- macOS on Apple Silicon (M1 or later)
- Python 3.10+
- Apple MLX 0.31.0+ (installs automatically as a dependency)
- Git
A recent version of Xcode Command Line Tools is also recommended (xcode-select --install).
Development Setup
# Clone the repository
git clone https://github.com/skyoo2003/bit-axon.git
cd bit-axon
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
# Verify the installation
python -c "import bit_axon; print(bit_axon.__version__)"
The [dev] extra pulls in pytest, pytest-xdist, ruff, and pre-commit. All project configuration lives in pyproject.toml.
Tip
Pre-commit hooks run automatically on every commit. They catch trailing whitespace, missing newlines at end of file, YAML/TOML syntax errors, and Ruff lint issues before they reach CI.
Development Commands
| Command | Description |
|---|---|
| pytest tests/ | Run the full test suite |
| pytest -n auto tests/ | Run tests in parallel |
| pytest tests/test_model.py | Run a single test file |
| pytest -k TestAxonSSM | Run tests matching a name pattern |
| ruff check src/ tests/ | Lint for errors and style issues |
| ruff check src/ tests/ --fix | Auto-fix lint issues where possible |
| ruff format src/ tests/ | Format code |
| ruff format --check src/ tests/ | Check formatting without writing |
Project Structure
bit-axon/
├── src/bit_axon/          # Package source
│   ├── config.py          # Model configuration
│   ├── model.py           # Top-level model definition
│   ├── layers/            # Model layers (SSM, block, MoE, norms, attention)
│   ├── quantization/      # Quantization schemes (NF4, ternary, TurboQuant)
│   ├── training/          # Training adapters (LoRA, DoRA)
│   └── utils/             # Utilities (KV cache, helpers)
├── tests/                 # Test suite, mirrors src layout
│   ├── conftest.py        # Shared fixtures
│   ├── test_config.py
│   ├── test_model.py
│   └── ...
└── docs/                  # MkDocs Material documentation
MLX-Specific Conventions
Bit-Axon is built on Apple's MLX framework, not PyTorch. This is a deliberate choice to run natively on Apple Silicon with minimal overhead.
Warning
Do not import or depend on PyTorch anywhere in the codebase. All tensor operations use mx.array from MLX.
Key differences from PyTorch
Use __call__ for forward passes, not forward(). MLX modules use __call__ as the primary entry point. Avoid defining a separate forward method.
Call mx.eval() explicitly. MLX uses lazy evaluation. When you need a concrete value (for instance in tests or when returning to Python), call mx.eval() on the result.
Subclass nn.Module. All model components inherit from mlx.nn.Module. Parameters and buffers are registered by assigning mx.array attributes directly.
import mlx.core as mx
import mlx.nn as nn


class MyLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.weight = mx.zeros((dim, dim))  # Auto-registered as parameter

    def __call__(self, x: mx.array) -> mx.array:
        return x @ self.weight
Info
When writing new layers or model components, always use mx.array for all tensor types. This includes parameters, activations, and any intermediate computations. Never reference torch.Tensor or jax.Array.
Code Style
We use Ruff for both linting and formatting. No other linters or formatters are needed.
Formatting
Ruff enforces these rules (configured in pyproject.toml):
- Line length: 160 characters
- Double quotes for all strings
- Spaces for indentation (no tabs)
- Target Python version: 3.10
Linting
Enabled rule sets: E, W, F, I, N, UP, B, SIM, C4, DTZ, RUF. A few specific rules are intentionally ignored (E501, B008, N802, N803). First-party imports (bit_axon) are sorted to the top.
Type Hints
Use type hints for all public function signatures and class attributes. Prefer from __future__ import annotations for forward references.
Docstrings
Use Google-style docstrings for all public classes and functions:
import mlx.core as mx


def compute_attention_scores(query: mx.array, key: mx.array, scale: float) -> mx.array:
    """Compute scaled dot-product attention scores.

    Args:
        query: Query tensor of shape (batch, heads, seq_len, dim).
        key: Key tensor of shape (batch, heads, seq_len, dim).
        scale: Scaling factor applied before softmax.

    Returns:
        Attention weights of shape (batch, heads, seq_len, seq_len).
    """
Import Order
Ruff's I rules handle this automatically. First-party (bit_axon) imports come before third-party imports.
Tip
Run ruff check src/ tests/ --fix and ruff format src/ tests/ before committing. The pre-commit hooks do this too, but running them locally lets you fix issues faster.
Testing
Tests live in tests/ and mirror the src/bit_axon/ directory structure. If you add a new module at src/bit_axon/layers/my_layer.py, its tests go in tests/layers/test_my_layer.py. Shared fixtures go in tests/conftest.py.
# Run everything
pytest tests/
# Run in parallel (uses all cores)
pytest -n auto tests/
# Run a specific test file
pytest tests/test_model.py
# Run tests matching a name pattern
pytest -k TestAxonSSM
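The mirroring rule described above can be written as a small path mapping. This is a minimal sketch and not part of the project: the helper name expected_test_path and the constants SRC_ROOT and TESTS_ROOT are hypothetical, and it assumes POSIX-style paths relative to the repository root.

```python
from pathlib import PurePosixPath

# Assumed roots, taken from the layout described in this guide.
SRC_ROOT = PurePosixPath("src/bit_axon")
TESTS_ROOT = PurePosixPath("tests")


def expected_test_path(module_path: str) -> PurePosixPath:
    """Map a source module to the test file that should cover it."""
    # Raises ValueError if the module is not under src/bit_axon/.
    relative = PurePosixPath(module_path).relative_to(SRC_ROOT)
    # Keep the sub-directory and prefix the file name with "test_".
    return TESTS_ROOT / relative.parent / f"test_{relative.name}"


print(expected_test_path("src/bit_axon/layers/my_layer.py"))
print(expected_test_path("src/bit_axon/model.py"))
```

A check like this could be useful in review to spot a new module that has no matching test file.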
Tip
When writing new tests, follow the existing patterns in the test suite. If you add shared fixtures, put them in tests/conftest.py so they're available across all test modules.
Note
All test files must be discoverable by pytest. Make sure files are named test_*.py and test functions are named test_* or are methods inside classes named Test*.
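The naming rules in the note above can be checked mechanically. The sketch below encodes only the patterns this guide asks for; the function names are hypothetical, and pytest's own defaults are slightly more permissive (for example, it also collects files named *_test.py).

```python
from fnmatch import fnmatchcase


def is_discoverable_file(filename: str) -> bool:
    """True if the file name follows the test_*.py convention."""
    return fnmatchcase(filename, "test_*.py")


def is_discoverable_function(name: str) -> bool:
    """True if the function name follows the test_* convention."""
    return fnmatchcase(name, "test_*")


def is_discoverable_class(name: str) -> bool:
    """True if the class name follows the Test* convention."""
    return fnmatchcase(name, "Test*")


for candidate in ("test_model.py", "model_tests.py"):
    print(candidate, is_discoverable_file(candidate))
```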
CLI Development
The CLI is built with Typer and Rich.
Adding a new CLI command
- Create src/bit_axon/cli/<command>.py with a function implementing the command logic
- Register in src/bit_axon/cli/main.py using @app.command():

  @app.command()
  def mycommand(...):
      from bit_axon.cli.mycommand import mycommand_impl
      mycommand_impl(...)

- Add tests in tests/cli/test_<command>.py using typer.testing.CliRunner
- All imports must be lazy (inside functions) to avoid loading MLX for --help
CLI conventions
- Use the --config-small flag for testing without real models
- Use Rich console for output (spinners, progress bars, tables)
- Lazy import all bit_axon modules inside command functions
Warning
Never import MLX or bit_axon model code at the module level in CLI files. All imports of MLX-dependent code must be lazy (inside function bodies) so that bit-axon --help works without MLX installed.
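The effect of a lazy import can be observed through sys.modules. The following is a minimal sketch using only the standard library, not the project's Typer code: colorsys is an arbitrary stand-in for an MLX-dependent module, and describe and run are hypothetical names for a cheap path (such as --help) and a command body.

```python
import sys

# Stand-in for a heavy, MLX-dependent module (an assumption for this demo).
STAND_IN = "colorsys"


def describe() -> str:
    """Cheap path, comparable to --help: imports nothing heavy."""
    return "demo: convert a hue to an RGB triple"


def run(hue: float) -> tuple[float, float, float]:
    """Command body: the heavy import is deferred until this is called."""
    import colorsys  # lazy import, executed on first call only

    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


loaded_before = STAND_IN in sys.modules
describe()
loaded_after_help = STAND_IN in sys.modules  # unchanged by the cheap path
run(0.0)
loaded_after_run = STAND_IN in sys.modules  # the import has now happened
print(loaded_before, loaded_after_help, loaded_after_run)
```

The same idea carries over to a CLI test: invoking only the help path should leave the set of loaded modules unchanged.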
Documentation Contributions
The documentation is built with MkDocs Material and lives in the docs/ directory.
Previewing locally
The MkDocs preview command (mkdocs serve, by default) starts a local dev server at http://localhost:8000. Changes to markdown files are reflected immediately on reload.
Editing guidelines
- Write in English. Do not write Korean in the source markdown files.
- Use Material admonitions (!!! tip, !!! warning, !!! note, !!! info) for callouts.
- Use proper heading hierarchy. Don't skip levels (for example, don't jump from ## to ####).
- Add code blocks with language tags (```python, ```bash, etc.).
Internationalization (Korean translations)
Bit-Axon supports Korean documentation alongside English. The i18n workflow uses a file suffix convention:
- English source: docs/some-page/index.md
- Korean translation: docs/some-page/index.ko.md
To add or update a Korean translation, create or edit the .ko.md file alongside the English source. The build system picks up both versions automatically.
Tip
Only edit the English source files directly. Korean translations go in the matching .ko.md files. Never mix languages within a single file.
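The suffix convention amounts to a simple path transformation. The sketch below is illustrative only: korean_translation_path is a hypothetical helper, not something the build system exposes, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath


def korean_translation_path(english_path: str) -> PurePosixPath:
    """Return the .ko.md path that sits alongside an English .md source."""
    path = PurePosixPath(english_path)
    if path.suffix != ".md" or path.name.endswith(".ko.md"):
        raise ValueError(f"not an English markdown source: {english_path}")
    # index.md -> index.ko.md, in the same directory.
    return path.with_name(f"{path.stem}.ko.md")


print(korean_translation_path("docs/some-page/index.md"))
```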
Commit Messages
We follow Conventional Commits. The format is type(scope): description, where the scope is optional.
Types: feat, fix, docs, chore, test, refactor, perf
Scopes in use: layers, model, training, quantization, utils, ci
Examples from the project's history:
feat(layers): add selective scan wrapper for AxonSSM
feat(model): implement sparse MoE router with top-k gating
feat(quantization): add TurboQuant mixed-precision quantization
fix(training): correct LoRA gradient accumulation for batched inputs
docs: update README with new benchmark results
chore(ci): add parallel test execution to GitHub Actions
Info
Pre-commit hooks validate commit message format automatically. If your commit message doesn't match the conventional commits pattern, the hook will reject it with a suggestion.
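To check a message before committing, the format can be approximated with a regular expression. This is a rough sketch and not the hook the project actually runs: the function name is hypothetical, and restricting scopes to the list of scopes currently in use is an assumption, since the real hook may accept others.

```python
import re

# Types and scopes as listed in this guide.
TYPES = ("feat", "fix", "docs", "chore", "test", "refactor", "perf")
SCOPES = ("layers", "model", "training", "quantization", "utils", "ci")

# type, optional (scope), then ": " and a non-empty description.
PATTERN = re.compile(
    rf"^(?P<type>{'|'.join(TYPES)})"
    rf"(?:\((?P<scope>{'|'.join(SCOPES)})\))?"
    r": (?P<description>\S.*)$"
)


def is_valid_commit_subject(subject: str) -> bool:
    """True if the subject line matches type(scope): description."""
    return PATTERN.match(subject) is not None


for subject in (
    "feat(layers): add selective scan wrapper for AxonSSM",
    "docs: update README with new benchmark results",
    "update readme",
):
    print(f"{subject!r}: {is_valid_commit_subject(subject)}")
```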
Pull Request Process
- Fork the repository and create a feature branch from main.
- Make your changes following the style guide above.
- Write tests for new functionality. Place test files in tests/, mirroring the src/bit_axon/ structure. Add shared fixtures to tests/conftest.py.
- Run linting and tests locally before pushing (see Development Commands above).
- Open a PR with a clear description of the change and motivation.
- AI-assisted contributions: If any part of your PR was generated or significantly assisted by an AI tool, please note this in the PR description. You don't need to disclose which tool or provide prompts, just flag it so reviewers are aware.
- Address review feedback and push updates to the same branch. The PR will be merged once approved.
Reporting Bugs
Found a bug? Open an issue with a clear title and include:
- A minimal reproducible example
- Your Python version, macOS version, and MLX version
- The expected vs. actual behavior
- Any relevant logs or stack traces
If you're unsure whether something is a bug, open an issue anyway. We'd rather triage it than miss it.
See also
- Installation: Development environment setup
- Architecture Overview: Model design and layer structure
- API Reference: Public Python API documentation
- Code of Conduct: Community guidelines
License
Bit-Axon is released under the MIT License. By contributing, you agree that your contributions will be licensed under the same terms.