# Testing Guide for Developers
This guide explains how testing works in NuMojo and how to add/maintain tests with consistent quality.
## Goals
NuMojo tests should:
- validate numerical correctness against trusted references (primarily NumPy),
- catch regressions quickly,
- be easy for new contributors to run and extend,
- keep behavior consistent across modules (`core`, `routines`, `science`).
## Test layout
Current test structure:
- `tests/core/*`: core containers, indexing, shape/stride behavior, matrix/core semantics
- `tests/routines/*`: user-facing functional routines (math, linalg, io, sorting, etc.)
- `tests/science/*`: higher-level scientific modules
- `tests/utils_for_test.mojo`: helper assertions and NumPy comparison utilities
Keep new tests in the correct bucket. If a test spans layers, place it where the user-facing behavior is asserted.
## Running tests locally
From repo root:
```sh
pixi run test
pixi run test_core
pixi run test_routines
pixi run test_science
```
Run one file directly:
```sh
pixi run mojo run -I tests/ tests/routines/test_math.mojo
```
Or use the helper task:
```sh
pixi run run-test TEST_FILE=tests/routines/test_math.mojo
```
`pixi run final` runs formatting and the full test suite.
## Test entrypoint pattern
Every test file should expose test functions and have the standard discovery runner:
```mojo
from testing import TestSuite


def test_example():
    pass


def main() raises:
    # Collect every test_* function in this module and run it.
    TestSuite.discover_tests[__functions_in_module()]().run()
```
Use `def test_*` naming so discovery picks tests automatically.
---
## Assertion strategy
Use explicit assertions with clear failure messages. Prefer one conceptual assertion per check block.
### Numerical array checks
Use helpers from `tests/utils_for_test.mojo`:
- `check(...)` for exact equality against NumPy
- `check_with_dtype(...)` for values + dtype agreement
- `check_is_close(...)` for approximate float checks
- `check_values_close(...)` for scalar tolerance checks
Example pattern:
```mojo
from python import Python
import numojo as nm
from numojo.prelude import *  # assumed source of the f32 alias used below
from utils_for_test import check_is_close


def test_sin_basic() raises:
    var np = Python.import_module("numpy")
    var a = nm.linspace[f32](0, 3.14159, 20)
    var got = nm.sin(a)
    # NumPy result on the same input is the reference.
    var exp = np.sin(a.to_numpy())
    check_is_close(got, exp, "sin should match numpy within tolerance")
```
## What to test for every new function
When adding a new API function, include tests for:
- Happy path
  - standard shape(s)
  - common dtype(s)
- Edge cases
  - small sizes (`0`, `1`)
  - degenerate shape when meaningful
  - negative axis handling where applicable
- Error behavior
  - invalid shapes
  - out-of-bounds axis
  - unsupported dtype
- Layout sensitivity
  - contiguous and non-contiguous/view-like paths where relevant
- Parity
  - compare behavior against NumPy equivalent (or clearly documented intentional deviation)
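As a rough illustration of how the categories above map onto individual tests, here is a small sketch in plain Python rather than Mojo. `normalize_axis` is a hypothetical helper written only for this example and is not part of NuMojo. Layout sensitivity and NumPy parity need real arrays and are not covered here.

```python
def normalize_axis(axis: int, ndim: int) -> int:
    """Map a possibly negative axis into [0, ndim). Hypothetical helper."""
    if ndim < 1:
        raise ValueError(f"ndim must be >= 1, got {ndim}")
    if not -ndim <= axis < ndim:
        raise IndexError(f"axis {axis} is out of bounds for ndim {ndim}")
    return axis + ndim if axis < 0 else axis


def test_happy_path():
    assert normalize_axis(1, 3) == 1


def test_edge_single_dimension():
    assert normalize_axis(0, 1) == 0


def test_negative_axis():
    assert normalize_axis(-1, 3) == 2
    assert normalize_axis(-3, 3) == 0


def test_out_of_bounds_axis_raises():
    for bad_axis in (3, -4):
        try:
            normalize_axis(bad_axis, 3)
        except IndexError:
            continue
        raise AssertionError(f"axis {bad_axis} should have raised IndexError")


if __name__ == "__main__":
    test_happy_path()
    test_edge_single_dimension()
    test_negative_axis()
    test_out_of_bounds_axis_raises()
```

Each test covers one category, so a failure points directly at the kind of behavior that regressed.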
## Error tests
For invalid inputs, assert that errors are raised and messages are meaningful.
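```mojo
from testing import assert_raises
import numojo as nm
from numojo.prelude import *


def test_sum_invalid_axis_raises() raises:
    # The arange arguments and the body of the assert_raises block were lost
    # in the source; the calls below are an assumed reconstruction.
    var a = nm.arange[f32](0, 6).reshape(Shape(2, 3))
    with assert_raises():
        _ = nm.sum(a, axis=2)  # axis 2 does not exist for a 2-D array
```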
If exact message matching is brittle, at least assert the error type and key context.
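One way to assert "error type and key context" without pinning the full message is sketched below in plain Python. `checked_reshape_size` is a hypothetical validator invented for this example; it does not mirror any NuMojo function.

```python
def checked_reshape_size(size: int, shape: tuple[int, ...]) -> tuple[int, ...]:
    """Hypothetical validator: reject a shape whose element count differs from size."""
    count = 1
    for dim in shape:
        count *= dim
    if count != size:
        raise ValueError(f"cannot reshape array of size {size} into shape {shape}")
    return shape


def test_reshape_error_mentions_size_and_shape():
    try:
        checked_reshape_size(6, (4, 2))
    except ValueError as err:
        message = str(err)
        # Check the key context, not the whole sentence, so that rewording
        # the message does not break the test.
        assert "6" in message
        assert "(4, 2)" in message
    else:
        raise AssertionError("expected ValueError for mismatched size")


if __name__ == "__main__":
    test_reshape_error_mentions_size_and_shape()
```

The test still fails if the wrong exception type is raised or if the message drops the offending values, which are the two things a user needs to debug the call.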
## Tolerance guidance
Use strict tolerances where possible:
- `f64`: tighter (`1e-8` to `1e-12` depending on op)
- `f32`: moderate (`1e-4` to `1e-6`)
- complex/math-heavy transforms: choose practical tolerance and document why
Avoid very loose tolerances unless truly necessary. If a loose tolerance is required, add a short rationale in the test message.
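The ranges above follow from floating-point precision: storing a value as `f32` already introduces a relative rounding error of up to 2^-24 (about 6e-8), so an `f64`-style tolerance cannot pass for `f32` results. The sketch below shows this with the Python standard library only; `to_f32` is a helper written for this example and is not a NuMojo or NumPy function.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


reference = math.sin(1.0)   # reference value computed in f64
as_f32 = to_f32(reference)  # the same value after being stored as f32

rel_err = abs(as_f32 - reference) / abs(reference)

# Rounding to f32 costs at most 2**-24 relative error for normal values.
assert rel_err <= 2.0 ** -24

# A tolerance in the f32 range accepts the value; an f64-style one rejects it.
assert math.isclose(as_f32, reference, rel_tol=1e-6)
assert not math.isclose(as_f32, reference, rel_tol=1e-12)
```

Operations that accumulate many rounding steps (reductions, transforms) typically need more headroom than this single-rounding bound, which is why the `f32` range extends to `1e-4`.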
## Determinism guidance
For random-based tests:
- prefer fixed inputs over random where possible,
- if randomness is needed, use deterministic generation patterns or compare invariant properties,
- avoid flaky tests caused by unstable random assumptions.
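A minimal sketch of both ideas (seeded generation and invariant checks), written in plain Python with the standard library. `make_inputs` is a hypothetical helper, and `sorted` stands in for whatever routine is under test.

```python
import random
from collections import Counter


def make_inputs(seed: int, n: int) -> list[float]:
    """Generate reproducible pseudo-random inputs from a fixed seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def test_inputs_are_reproducible():
    assert make_inputs(seed=1234, n=8) == make_inputs(seed=1234, n=8)


def test_sort_invariants():
    data = make_inputs(seed=1234, n=50)
    result = sorted(data)  # stand-in for the routine under test
    # Invariant 1: the output is non-decreasing.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # Invariant 2: the output is a permutation of the input.
    assert Counter(result) == Counter(data)


if __name__ == "__main__":
    test_inputs_are_reproducible()
    test_sort_invariants()
```

The invariant checks hold for any input, so the test stays valid even if the seed or the generator changes.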
## Performance-sensitive tests
Unit tests should prioritize correctness, not microbenchmarking.
For performance checks:
- keep benchmark-style code in `benchmark/`,
- avoid strict timing assertions in CI tests,
- if needed, test algorithmic behavior (e.g., output shape, monotonicity, complexity-safe constraints) rather than elapsed time.
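As an illustration of checking algorithmic behavior instead of elapsed time, the Python sketch below asserts output length and monotonicity for a running total. `cumulative_sum` is a stand-in written for this example, not a NuMojo routine.

```python
def cumulative_sum(values: list[float]) -> list[float]:
    """Stand-in for the routine under test: running total of the inputs."""
    total = 0.0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


def test_cumulative_sum_structure():
    # Non-negative inputs that are exactly representable in binary floating point.
    data = [0.5, 0.0, 2.0, 1.25, 3.0]
    result = cumulative_sum(data)
    # Output shape: one entry per input.
    assert len(result) == len(data)
    # Monotonicity: a running total of non-negative values never decreases.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # The last entry equals the full sum.
    assert result[-1] == sum(data)


if __name__ == "__main__":
    test_cumulative_sum_structure()
```

None of these assertions depends on machine speed, so the test behaves the same on a laptop and on a loaded CI runner.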
## CI expectations
CI runs tests by category and expects:
- formatting clean,
- all test scripts pass,
- no dependency on local-only files/paths.
Before opening a PR:
- `pixi run format`
- `pixi run test`
- verify your new tests are discovered and executed
## Common mistakes to avoid
- adding tests without the `def main()` discovery runner,
- writing tests that depend on local machine state,
- comparing floats with exact equality when approximation is required,
- skipping negative/error cases,
- adding huge tests that are slow without necessity,
- placing tests in the wrong test directory.
## Recommended checklist for PR authors
- Added tests for new behavior
- Added tests for invalid/error path
- Compared against NumPy reference when applicable
- Covered at least one edge case
- Ran `pixi run final`
- Ensured test file follows discovery pattern
## Future improvements (tracked as follow-up work)
- add dedicated matrix comparison helpers (`check_matrix`, `check_matrix_is_close`),
- make tolerance configurable in helper functions,
- improve diagnostics for failed comparisons (shape/dtype diff summary),
- add a lightweight testing FAQ for common contributor issues.