Auditing Non-C/C++ Targets¶
The harness is language-agnostic. C/C++ targets stay the headline case because AddressSanitizer is the highest-signal tool we have, but the same pipeline works for many other ecosystems:
- Rust, Go, Swift;
- Java, Kotlin;
- Python, Ruby, PHP;
- JavaScript / TypeScript (Node);
- Perl, R;
- any other ecosystem with an explicit [runner] command.
This guide collects the moving parts in one place.
Decision tree¶
Does the target have a sanitizer build?
├── Yes → [sanitizer] enabled = ["asan", …]
│         crashes/ keeps memory-safety crashes
│         findings/ keeps non-crash security issues
│
└── No  → [sanitizer] enabled = []
          crashes/ is unused; runtime crashes auto-demoted to findings/
          findings/ also keeps non-crash security issues
bin/setup-target picks a conservative default by introspecting the
source tree (Cargo.toml, go.mod, pyproject.toml,
package.json, …). For non-native ecosystems with no ASan binary
detected, that default is findings-only:
- [sanitizer] enabled = [];
- a starter [runner].
Opt into race or another sanitizer by editing
output/<target>/target.toml.
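The introspection step can be pictured with a minimal Python sketch. It assumes a simple first-match lookup over marker files; the MARKERS table and both function names are hypothetical and are not taken from bin/setup-target, which may weigh more signals than this.

```python
from pathlib import Path

# Hypothetical marker-file table (an assumption for illustration);
# the real detection logic lives in bin/setup-target.
MARKERS = [
    ("Cargo.toml", "cargo"),
    ("go.mod", "go"),
    ("pyproject.toml", "python"),
    ("package.json", "npm"),
]


def guess_build_system(source_root):
    """Return the build system of the first marker file found, else None."""
    root = Path(source_root)
    for marker, build_system in MARKERS:
        if (root / marker).is_file():
            return build_system
    return None


def default_sanitizers(has_asan_binary):
    """Conservative default: findings-only unless an ASan binary was detected."""
    return ["asan"] if has_asan_binary else []
```

With no ASan binary detected, default_sanitizers returns an empty list, which corresponds to the findings-only [sanitizer] enabled = [] default described above.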
What sanitizers exist per language¶
| Language | Compile-time flag | Sanitizer slug |
|---|---|---|
| C / C++ | -fsanitize=address / undefined / memory / thread | asan, ubsan, msan, tsan |
| Rust | RUSTFLAGS="-Z sanitizer=address" (nightly) | asan; also tsan and msan on supported targets |
| Go | go build -race | race |
| Swift | swift run -Xswiftc -sanitize={SWIFT_SANITIZER} | asan, ubsan, tsan |
| Java / JVM | JFR plus JNI ASan when auditing native bindings | none; use crash_patterns |
| Python | PYTHONMALLOC=malloc plus CPython-ASan for C extensions | optional asan for native extensions |
| Node / V8 | --abort-on-uncaught-exception; native modules can link ASan | optional asan for native add-ons |
| Everything else | None; findings-only mode is the right choice | n/a |
When a sanitizer is available, treat it like ASan: set the
appropriate <name>_bin and enable the slug.
What target.toml looks like for each ecosystem¶
bin/setup-target seeds these automatically. Every findings-only
target has the same shape — [sanitizer] enabled = [] plus a
[runner] block naming the interpreter or driver. A Python target,
fully annotated:
target = "demo"
build_system = "python"
[sanitizer]
enabled = [] # findings-only mode
[runner]
bin = "python3"
args = ["{TESTCASE}"]
env = [
"PYTHONDEVMODE=1",
"PYTHONPATH={TARGET_ROOT}:{TARGET_ROOT}/src:{TARGET_ROOT}/lib",
]
crash_patterns = [ # seeded from the language registry
"Traceback \\(most recent call last\\):",
"MemoryError",
"RecursionError",
"SystemError",
"Fatal Python error:",
"==\\d+==ERROR: AddressSanitizer",
]
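To make the placeholders concrete, here is a minimal Python sketch of how a [runner] table like the one above could be turned into a command line. It assumes plain string substitution for {TESTCASE} and {TARGET_ROOT} and KEY=VALUE entries in env; the function names and the sample paths are hypothetical, and the harness's real expansion logic may differ.

```python
def expand(template, testcase, target_root):
    """Substitute the {TESTCASE} and {TARGET_ROOT} placeholders in one string."""
    return (template
            .replace("{TESTCASE}", testcase)
            .replace("{TARGET_ROOT}", target_root))


def build_command(runner, testcase, target_root):
    """Return (argv, env_overrides) for a [runner] table given as a dict."""
    argv = [runner["bin"]] + [expand(arg, testcase, target_root)
                              for arg in runner.get("args", [])]
    env = {}
    for entry in runner.get("env", []):
        # Each env entry is assumed to be KEY=VALUE; split on the first "=".
        key, _, value = expand(entry, testcase, target_root).partition("=")
        env[key] = value
    return argv, env


# The [runner] table from the Python example, written as a dict.
runner = {
    "bin": "python3",
    "args": ["{TESTCASE}"],
    "env": [
        "PYTHONDEVMODE=1",
        "PYTHONPATH={TARGET_ROOT}:{TARGET_ROOT}/src:{TARGET_ROOT}/lib",
    ],
}
# "probe.py" and "/src/demo" are made-up sample values.
argv, env = build_command(runner, "probe.py", "/src/demo")
```

The returned argv and env could then be handed to subprocess.run, with env merged over a copy of os.environ.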
The other ecosystems differ only in build_system and the [runner] fields:
| Ecosystem | build_system | bin | args | Notable env |
|---|---|---|---|---|
| Python | python | python3 | ["{TESTCASE}"] | PYTHONDEVMODE=1 |
| Go | go | go | ["run", "{TESTCASE}"] | GORACE=halt_on_error=1 |
| Rust | cargo | cargo | ["run", "--quiet", "--manifest-path", "{TARGET_ROOT}/Cargo.toml", "--", "{TESTCASE}"] | (none) |
| Swift | swift | swift | ["run", "--quiet", "-c", "release", "-Xswiftc", "-sanitize={SWIFT_SANITIZER}", "-Xswiftc", "-O", "--package-path", "{TARGET_ROOT}", "{TARGET_SLUG}", "{TESTCASE}"] | (none) |
| Ruby | bundler | ruby | ["{TESTCASE}"] | (none) |
| Java / JVM | maven or gradle | java | ["{TESTCASE}"] | (none) |
| Kotlin | kotlin | kotlinc | ["-script", "{TESTCASE}"] | (none) |
| Node | npm | node | ["{TESTCASE}"] | (none) |
| PHP | composer | php | ["{TESTCASE}"] | (none) |
The same shape applies to rlang and perl; bin/setup-target
writes a starter [runner] block for each.
A few ecosystem notes:
- Go seeds findings-only go run. To use the runtime race detector, set [sanitizer] enabled = ["race"] and args = ["run", "-race", "{TESTCASE}"].
- Rust can opt into a sanitizer build later: set [sanitizer] enabled = ["asan"] once you have a nightly+sanitizer build.
- Java: single-file Java is supported (JEP 330), so java <file.java> compiles and runs in one shot. This is the seeded default. When seeding, bin/setup-target prefers a working JDK from AUDIT_JAVA_HOME or JAVA_HOME, then a working java on PATH.
- Kotlin: the seeded default is for script-style .kts probes. Plain .kt sidecar harnesses compile through kotlinc -include-runtime. Gradle-driven Kotlin apps should keep the generated gradle build system and runner.
Crash vs finding routing¶
Once the runtime is wired up, the triager decides where each artifact lands:
| Signal in asan.txt | Sanitizer enabled? | Destination |
|---|---|---|
| ERROR: AddressSanitizer: ... | yes / no | crashes/CRASH-* |
| WARNING: ThreadSanitizer: data race | yes (tsan) | crashes/CRASH-* |
| WARNING: MemorySanitizer: ... | yes (msan) | crashes/CRASH-* |
| WARNING: DATA RACE (Go runtime) | yes (race) | crashes/CRASH-* |
| Python traceback | no | demoted to findings/FIND-* |
| Go panic: runtime error: | no | demoted to findings/FIND-* |
| Java Exception in thread "main" | no | demoted to findings/FIND-* |
| Node allocation fatal error | no | demoted to findings/FIND-* |
| Rust thread 'main' panicked at | no | demoted to findings/FIND-* |
| PHP PHP Fatal error: | no | demoted to findings/FIND-* |
| None of the above | n/a | crashes-rejected/ |
When a target has a sanitizer enabled
([sanitizer] enabled = ["asan", …]) but a particular crash
directory does not have a sanitizer signal, it goes to
crashes-rejected/. Demote-to-findings is reserved for the
[sanitizer] enabled = [] case, where the lack of an ASan trace is
expected.
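The routing rules can be summarised as a small decision function. The sketch below is an illustration in Python, not the triager's actual code: the regexes are transcribed from the table's banners, the function name and return values are hypothetical, and the Node row is left out because the table does not give its exact banner text. How a sanitizer banner is handled when its slug is not enabled is also an assumption here (it falls through to the remaining rules).

```python
import re

# Sanitizer banners, keyed by the slug that must be enabled (None = always routed).
SANITIZER_SIGNALS = [
    (None, re.compile(r"ERROR: AddressSanitizer:")),
    ("tsan", re.compile(r"WARNING: ThreadSanitizer: data race")),
    ("msan", re.compile(r"WARNING: MemorySanitizer:")),
    ("race", re.compile(r"WARNING: DATA RACE")),
]

# Runtime banners that are demoted to findings in findings-only mode.
RUNTIME_SIGNALS = [
    re.compile(r"Traceback \(most recent call last\):"),
    re.compile(r"panic: runtime error:"),
    re.compile(r'Exception in thread "main"'),
    re.compile(r"thread 'main' panicked at"),
    re.compile(r"PHP Fatal error:"),
]


def route(report, enabled):
    """Return 'crashes', 'findings', or 'crashes-rejected' for one report."""
    for slug, pattern in SANITIZER_SIGNALS:
        if (slug is None or slug in enabled) and pattern.search(report):
            return "crashes"
    # Demotion is reserved for the [sanitizer] enabled = [] case.
    if not enabled and any(p.search(report) for p in RUNTIME_SIGNALS):
        return "findings"
    return "crashes-rejected"
```

Under these assumptions, a Python traceback is routed to findings when enabled is empty, and to crashes-rejected when the target has "asan" enabled.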
Writing harnesses in non-C/C++ languages¶
Use // HARNESS: (or # HARNESS: for languages whose comment
delimiter is #) in the harness header. Matching is permissive:
bin/probe ignores everything before the HARNESS: label as long as it
contains no letters, so any comment syntax works: //, #, ;, --,
<!-- … -->, or /* … */. The same rule applies to the other
header fields (TARGET:, HYPOTHESIS-ID:, CATEGORY:, MODE:).
The file extension picks the build/interpret path.
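The "no letters before the label" rule for header fields maps naturally onto a regular expression. The following Python sketch is one possible reading of that rule, not the parser in bin/probe: the function name is hypothetical, the field values in any input are whatever follows the colon, and stripping a trailing --> or */ is an added assumption so that block-comment styles yield a clean value.

```python
import re

FIELDS = ("HARNESS", "TARGET", "HYPOTHESIS-ID", "CATEGORY", "MODE")

# Any run of non-letter characters may precede the label, so //, #, ;, --,
# <!-- and /* are all accepted without listing them one by one.
HEADER_RE = re.compile(
    r"^[^A-Za-z]*(" + "|".join(re.escape(f) for f in FIELDS) + r"):\s*(.*)$"
)


def parse_header(source):
    """Collect header fields from harness source text (first occurrence wins)."""
    fields = {}
    for line in source.splitlines():
        match = HEADER_RE.match(line)
        if match:
            value = match.group(2).strip()
            # Assumption: drop a trailing block-comment closer such as --> or */.
            value = re.sub(r"\s*(-->|\*/)$", "", value)
            fields.setdefault(match.group(1), value)
    return fields
```

A line such as `int x; // HARNESS: foo` is not treated as a header in this sketch, because letters appear before the label.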
The supported set is the registry in lib/languages.py; run
python3 lib/languages.py list for the authoritative table (one row per
language, with its harness extensions and build systems). The harness
extensions split into two buckets:
# Compiled (cached binary): .c .cc .cpp .cxx .C .go .kt .rs .swift
# Interpreted (no build step): .py .rb .pl .php .js .mjs .ts .tsx
# .java .kts .r .R .sh .bash
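As a rough illustration of how the extension selects a path, the two buckets above can be written as a lookup. This is a sketch only; the authoritative registry is lib/languages.py, and the function name and return values here are hypothetical.

```python
from pathlib import Path

# The two buckets, copied from the listing above.
COMPILED = {".c", ".cc", ".cpp", ".cxx", ".C", ".go", ".kt", ".rs", ".swift"}
INTERPRETED = {".py", ".rb", ".pl", ".php", ".js", ".mjs", ".ts", ".tsx",
               ".java", ".kts", ".r", ".R", ".sh", ".bash"}


def harness_kind(path):
    """Classify a harness file as 'compiled', 'interpreted', or None if unknown."""
    # The comparison is case-sensitive; .C and .R are listed explicitly.
    suffix = Path(path).suffix
    if suffix in COMPILED:
        return "compiled"
    if suffix in INTERPRETED:
        return "interpreted"
    return None
```

Note that .kt and .kts land in different buckets, which matches the Kotlin note earlier: scripts run directly, while plain .kt harnesses are compiled first.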
Crash patterns¶
If your target has a project-specific runtime banner (for example,
[BUG] in a custom panic handler, ASSERTION FAILED: from a debug
build), add it under [runner].crash_patterns:
[runner]
bin = "python3"
args = ["{TESTCASE}"]
crash_patterns = [
"^Internal compiler error:",
"^=== ABORT ===",
]
These patterns layer on top of the built-in language-agnostic
markers (Traceback, panic:, Exception in thread, …) that
lib/triage.sh already recognises.
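The layering can be sketched in Python as follows. This is an approximation under stated assumptions: lib/triage.sh is a shell script whose regex dialect may differ from Python's re module, BUILTIN_PATTERNS holds only the three markers named above rather than the full built-in list, and both function names are hypothetical.

```python
import re

# Only the markers named in the text; the full list lives in lib/triage.sh.
BUILTIN_PATTERNS = [r"Traceback", r"panic:", r"Exception in thread"]


def compile_patterns(custom):
    """Compile the built-in markers followed by the project-specific ones."""
    # re.MULTILINE lets a leading ^ anchor at the start of any output line.
    return [re.compile(p, re.MULTILINE)
            for p in BUILTIN_PATTERNS + list(custom)]


def first_match(output, patterns):
    """Return the source text of the first pattern that matches, else None."""
    for pattern in patterns:
        if pattern.search(output):
            return pattern.pattern
    return None
```

An anchored pattern such as "^=== ABORT ===" only matches mid-output when anchoring is per line, so it is worth checking how the real matcher treats ^ before relying on it.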
reproduce.sh templates¶
bin/export-repro emits a runnable reproduce.sh for every
language with a build_system entry. The maintainer runs
./reproduce.sh /path/to/upstream-src and the script:
1. Clones or checks out upstream at the pinned revision.
2. Runs the language's canonical build step (cargo build, go build, npm install, mvn package, …).
3. Invokes the captured testcase via the recorded runner.
If the language has no compile step (Python, Ruby, …), step 2 is a no-op or a virtual-env / dependency install.
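Steps 2 and 3 can be pictured as a small planning function. The Python sketch below is illustrative only: bin/export-repro emits a shell script, not Python, and the BUILD_STEPS mapping covers just the four commands quoted above, keyed by build_system values that are assumed to match the earlier table. The function name is hypothetical, and the checkout in step 1 is omitted.

```python
# Build steps quoted in the text; other build systems would need their own entry.
BUILD_STEPS = {
    "cargo": ["cargo", "build"],
    "go": ["go", "build"],
    "npm": ["npm", "install"],
    "maven": ["mvn", "package"],
}


def reproduce_plan(build_system, runner_argv):
    """Return the commands to run, in order, after the source checkout."""
    plan = []
    build = BUILD_STEPS.get(build_system)
    if build:
        # Step 2: canonical build; skipped when no build step is registered.
        plan.append(build)
    # Step 3: replay the captured testcase through the recorded runner.
    plan.append(runner_argv)
    return plan
```

For a build system with no registered step, such as python in this sketch, the plan contains only the runner invocation.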
See also¶
- Target config reference: the full target.toml schema.
- Configure a target: the operator review workflow.
- AGENTS.md (repository root): the agent-facing audit workflow, covering both browser and generic targets.