Your tests pass.
But do they test?
Mutineer mutates your source one change at a time, runs your suite against each mutant, and reports the ones your tests failed to catch — the gaps where coverage is a green lie. Built to gate CI and to give AI coding agents an objective signal of test value.
$ mutineer run lib/calculator.rb --test test/calculator_test.rb Mutineer — Mutation Results ========================= total 26 killed 24 survived 1 no-coverage 1 mutation score: 92.3% ── survivor ────────────────────────────────────── Calculator#discount lib/calculator.rb:14 (comparison) --- a/lib/calculator.rb +++ b/lib/calculator.rb - return 0 if qty >= 10 + return 0 if qty > 10 ↳ no test asserts the boundary at exactly 10
Mutate. Run. Report.
A mutant that your suite still passes is a test that isn't testing anything. Mutineer finds them by breaking your code on purpose.
Mutate
Mutineer parses your source with Prism and applies one change per mutant — flip a < to <=, swap && for ||, delete a statement. Each mutant is re-parsed to confirm it's still valid Ruby.
Run
Every mutant runs your suite in a forked, isolated process — across all your cores. A coverage map means each mutant only runs the test files that actually reach it.
Report
Killed = your tests caught it. Survived = they didn't. You get a mutation score and a unified diff for every survivor, pinned to the exact line. Set a --threshold to fail CI.
A test-value oracle for your pipeline.
Line coverage tells you which code ran under test. It can't tell you whether a test would notice if that code broke — exactly where AI-generated tests are weakest. Mutineer answers that, programmatically: versioned JSON, stable mutant ids, structured exit codes, and diff-scoped runs.
Close the loop on test quality
An agent writes code and tests, then runs Mutineer on just the diff. Each survivor ships a ready-made diff to feed back: "write a test that fails under this change." Stop when survivors hit zero.
--since origin/main --format json
Fail only when a PR makes tests worse
Diff the run against a stored baseline by stable id. Exit 1 on any new survivor or a score drop — adopt mutation testing on a legacy suite without fixing everything first.
--baseline prior.json --threshold 90
One step in your workflow
A composite action wraps the CLI: point it at your sources, a baseline, and a threshold. Reads the diff, posts the report, sets the exit code.
uses: davidteren/mutineer@main
Built lean, on purpose.
No agent, no boot hooks, no monkey-patching your test framework. Just Ruby's own parser and standard library.
Zero runtime deps
Prism ships with Ruby ≥ 3.4 and everything else is stdlib. Nothing to pin, nothing to break on upgrade.
Valid mutants only
Every mutation is re-parsed before it runs. Syntactically broken mutants are skipped, not counted against your score.
Fork isolation
Each mutant runs in its own forked process on Linux & macOS — no state bleed between runs, parallel by default.
Coverage-mapped
A per-mutant coverage map runs only the test files that reach the mutated line. Digest-keyed cache skips unchanged work.
CI & agent ready
Stable, sorted --format json with a versioned schema and stable mutant ids. --threshold and --baseline gate the build.
Rails-aware
--rails boots config/environment once, then forks per mutant with fork-safe ActiveRecord — dogfooded against a real Rails app.
Eight ways to break your code.
Five run by default. Three Tier-2 operators are off until you ask for them with --operators or .mutineer.yml — they're noisier, but they catch deeper gaps.
| Operator | Mutation rule | Tier |
|---|---|---|
| comparison | < ↔ <= , > ↔ >= , == ↔ != | default |
| arithmetic | + ↔ − , * ↔ / , % → * , ** → * | default |
| boolean_connector | && ↔ || | default |
| boolean_literal | true ↔ false , nil → true | default |
| statement_removal | replace a non-final statement with nil | default |
| return_nil | replace a return / final expression with nil | tier 2 |
| literal_mutation | integer → 0, 1, n+1 ; string → empty | tier 2 |
| condition_negation | wrap if/unless/ternary condition in !( … ) | tier 2 |
Run mutineer --list-operators to see the live set for your installed version.
Install & run.
Ruby ≥ 3.4 is the only requirement. Point Mutineer at your source and the tests that cover it.
Install
Run against a file and its test
Gate a PR against a baseline
Preview mutations without running tests
Configure once in .mutineer.yml
Key flags: --operators, --threshold, --baseline, --since, --rails, --only, --jobs, --format human|json, --output FILE, --dry-run. Typed flags override .mutineer.yml.