32 Test Strategy and Property Testing

panproto uses a layered testing strategy: fast unit tests for individual functions, integration tests that exercise cross-crate workflows, property-based tests for algebraic invariants, and TypeScript tests for the SDK. VCS workflow tests, CLI binary tests, and benchmarks round out the suite.

32.1 Test pyramid

Layer	Tool	Scope	Run Command
Unit tests	`cargo test`	Per-module, per-crate	`cargo test -p panproto-schema`
Integration tests	`cargo nextest`	Cross-crate workflows	`cargo nextest run -p panproto-integration`
Property tests	`proptest`	Algebraic invariants	`cargo test -p panproto-mig` (included in unit tests)
TypeScript tests	`pnpm test`	SDK + WASM round-trips	`cd sdk/typescript && pnpm test`
VCS workflow tests	`cargo nextest`	Cross-module VCS workflows	`cargo nextest run -p panproto-vcs --test workflows`
CLI binary tests	`assert_cmd`	CLI argument parsing + output	`cargo nextest run -p panproto-cli --test cli_workflows`
Benchmarks	`divan`	Performance regression	`cargo bench -p panproto-schema`

Tip

Use cargo nextest run instead of cargo test for the integration suite. Nextest runs each test as a separate process, which avoids thread-local state contamination from the WASM slab allocator.

32.2 Unit tests

Each crate has inline #[cfg(test)] mod tests blocks adjacent to the code they test. The convention:

Test modules are annotated with #[allow(clippy::unwrap_used, clippy::expect_used)] since test assertions naturally use .unwrap()
Helper functions for constructing test fixtures are private to the test module
Test names describe the property being verified: constraint_obstruction_detected, duplicate_vertex_rejected
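
As a rough illustration of these conventions, the sketch below shows an inline test module with the lint allowance, a private fixture helper, and a property-style test name. VertexSet, add_vertex, vertex_count, and set_with are hypothetical stand-ins invented for this example; they are not panproto's actual types.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a schema builder; not panproto's real type.
#[derive(Debug, Default)]
pub struct VertexSet {
    ids: BTreeSet<String>,
}

impl VertexSet {
    /// Adds a vertex id, returning an error if the id is already present.
    pub fn add_vertex(&mut self, id: &str) -> Result<(), String> {
        if self.ids.insert(id.to_string()) {
            Ok(())
        } else {
            Err(format!("duplicate vertex: {id}"))
        }
    }

    /// Number of distinct vertex ids stored so far.
    pub fn vertex_count(&self) -> usize {
        self.ids.len()
    }
}

#[cfg(test)]
#[allow(clippy::unwrap_used, clippy::expect_used)]
mod tests {
    use super::*;

    /// Fixture helper, private to the test module.
    fn set_with(ids: &[&str]) -> VertexSet {
        let mut set = VertexSet::default();
        for id in ids {
            set.add_vertex(id).unwrap();
        }
        set
    }

    /// The test name states the property being verified.
    #[test]
    fn duplicate_vertex_rejected() {
        let mut set = set_with(&["post", "post.text"]);
        assert!(set.add_vertex("post").is_err());
        assert_eq!(set.vertex_count(), 2);
    }
}
```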

Run a single crate’s tests:

cargo test -p panproto-schema
cargo test -p panproto-mig
cargo test -p panproto-gat

32.3 Integration tests

The tests/integration crate contains 14 integration test files that exercise end-to-end workflows. Each file focuses on a specific scenario:

Test File	Description
`self_description.rs`	Verifies ThGAT (the theory of GATs) is itself a well-formed GAT
`atproto_recursive.rs`	Recursive ATProto schema with nested threadViewPost projection
`atproto_roundtrip.rs`	Parse JSON lexicon, apply identity migration, verify round-trip fidelity
`sql_migration.rs`	SQL add-column, FK migration, and set-valued functor restrict
`cross_protocol.rs`	ATProto-to-SQL interop via shared ThGraph sub-theory
`lens_laws.rs`	GetPut and PutGet lens laws for identity and projection lenses
`performance.rs`	Projection lift throughput exceeds baseline threshold
`wasm_boundary.rs`	MessagePack serialization fidelity across the WASM boundary
`breaking_change.rs`	ATProto lexicon breaking changes detected by diff/classify pipeline
`cql_subsumption.rs`	CQL Sigma/Delta/Pi expressed as theory morphisms
`cambria_subsumption.rs`	Cambria-style combinators expressed as lens compositions
`hypergraph_fan.rs`	SQL FK as 4-ary hyperedge, column drop, fan reconstruction
`theory_composition.rs`	colimit(ThGraph, ThConstraint) produces ThConstrainedGraph
`custom_protocol.rs`	Define a new protocol from scratch, build schema, lift records

Run the full integration suite:

cargo nextest run -p panproto-integration

Run a single integration test:

cargo nextest run -p panproto-integration -- atproto_roundtrip

32.4 Property-based testing with `proptest`

Property-based tests verify algebraic invariants that must hold for all valid inputs. The proptest crate generates random inputs and shrinks failing cases to minimal counterexamples.

32.4.1 Heavily property-tested functions

The existence checker, check_existence in crates/panproto-mig/src/existence.rs, is a prime example of a function with rich algebraic properties:

pub fn check_existence(
    protocol: &Protocol,
    src: &Schema,
    tgt: &Schema,
    migration: &Migration,
    theory_registry: &HashMap<String, Theory>,
) -> ExistenceReport {
    // ... body omitted from this excerpt ...
}

Properties tested for check_existence include:

Identity migration is always valid: mapping every vertex to itself with consistent edges must produce a valid report
Kind-preserving maps are consistent: if all mapped vertices have matching kinds, no KindInconsistency errors appear
Constraint monotonicity: loosening a constraint never introduces ConstraintTightened errors
Composition preserves validity: if migration A is valid and migration B is valid, their composition should pass the same checks
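
To make two of these properties concrete, here is a toy model that uses only the standard library. It is a sketch under heavy assumptions: ToySchema, ToyMigration, kind_inconsistencies, compose, and sample_schema are invented for illustration and are far simpler than the real check_existence. Inputs come from a small hand-rolled generator rather than proptest, so there is no shrinking.

```rust
use std::collections::BTreeMap;

/// Toy schema: vertex id -> kind. A simplified stand-in, not panproto's `Schema`.
type ToySchema = BTreeMap<String, String>;
/// Toy migration: source vertex id -> target vertex id.
type ToyMigration = BTreeMap<String, String>;

/// Source ids whose mapped target is missing or has a different kind.
fn kind_inconsistencies(src: &ToySchema, tgt: &ToySchema, mig: &ToyMigration) -> Vec<String> {
    mig.iter()
        .filter(|(from, to)| match (src.get(*from), tgt.get(*to)) {
            (Some(a), Some(b)) => a != b,
            _ => true,
        })
        .map(|(from, _)| from.clone())
        .collect()
}

/// Maps every vertex to itself.
fn identity_migration(schema: &ToySchema) -> ToyMigration {
    schema.keys().map(|id| (id.clone(), id.clone())).collect()
}

/// Applies `first`, then `second`; vertices that `second` does not map are dropped.
fn compose(first: &ToyMigration, second: &ToyMigration) -> ToyMigration {
    first
        .iter()
        .filter_map(|(a, b)| second.get(b).map(|c| (a.clone(), c.clone())))
        .collect()
}

/// Deterministic pseudo-random schema with `n` vertices (hand-rolled, not proptest).
fn sample_schema(seed: u64, n: usize) -> ToySchema {
    const KINDS: [&str; 4] = ["object", "string", "integer", "record"];
    let mut state = seed;
    (0..n)
        .map(|i| {
            // Linear congruential step using the constants of Knuth's MMIX generator.
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let kind = KINDS[(state >> 33) as usize % KINDS.len()];
            (format!("v{i}"), kind.to_string())
        })
        .collect()
}

/// Checks "identity is valid" and "composition preserves validity" on `cases` schemas.
fn properties_hold(cases: u64) -> bool {
    (0..cases).all(|seed| {
        let schema = sample_schema(seed, 1 + (seed % 10) as usize);
        let id = identity_migration(&schema);
        let identity_ok = kind_inconsistencies(&schema, &schema, &id).is_empty();
        // Composing two valid migrations (here: identity with itself) stays valid.
        let composed = compose(&id, &id);
        let composition_ok = kind_inconsistencies(&schema, &schema, &composed).is_empty();
        identity_ok && composition_ok
    })
}
```

In a real test the generator and the loop would be replaced by a proptest strategy and a proptest! block, which also provide shrinking of failing inputs.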

32.4.2 Writing a `proptest` strategy for a new type

To property-test a new type, define a proptest strategy that generates valid instances. The pattern:

use proptest::prelude::*;

/// Strategy for generating a valid Vertex.
fn arb_vertex() -> impl Strategy<Value = Vertex> {
    // Generate an id (a lowercase letter followed by up to 19 characters from
    // a-z, 0-9, '.', ':', '-') and a kind from a fixed set.
    let id = "[a-z][a-z0-9.:-]{0,19}";
    let kind = prop_oneof![
        Just("object".to_string()),
        Just("string".to_string()),
        Just("integer".to_string()),
        Just("record".to_string()),
    ];
    (id, kind).prop_map(|(id, kind)| Vertex {
        id,
        kind,
        nsid: None,
    })
}

/// Strategy for generating a valid Schema with 1-10 vertices.
fn arb_schema() -> impl Strategy<Value = Schema> {
    // Inclusive range: a half-open `1..10` would generate at most 9 vertices.
    prop::collection::vec(arb_vertex(), 1..=10)
        .prop_flat_map(|vertices| {
            // Generate edges between existing vertices...
            // Build the schema...
        })
}

proptest! {
    #[test]
    fn identity_migration_is_valid(schema in arb_schema()) {
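        // `protocol`, `registry`, and `build_identity_migration` are assumed to be
        // fixtures and helpers defined elsewhere in the test module.
        let migration = build_identity_migration(&schema);
        let report = check_existence(&protocol, &schema, &schema, &migration, &registry);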
        prop_assert!(report.valid, "identity migration should always be valid");
    }
}

Note

Strategies should generate structurally valid inputs (well-formed schemas with consistent vertices and edges). Testing with completely random bytes isn’t useful; the interesting properties live in the space of valid structures.

32.4.3 Shrinking

When proptest finds a failing case, it automatically shrinks the input to a minimal counterexample. For complex types like Schema, this means reducing the number of vertices and edges to the smallest set that still triggers the failure. The shrunk counterexample is printed in the test output and saved to a regression file (by default in a proptest-regressions/ directory), which should be committed so the case is replayed on later runs.
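
The idea behind shrinking can be sketched with a single greedy pass that drops elements as long as the failure still reproduces. This is a simplified illustration only; proptest's own shrinker works differently and handles far more than list length. The names shrink_vec and has_duplicate are hypothetical.

```rust
use std::collections::HashSet;

/// Greedily removes elements while `fails` still returns true for the smaller input.
/// One left-to-right pass; a simplified illustration, not proptest's algorithm.
fn shrink_vec<T: Clone>(input: &[T], fails: impl Fn(&[T]) -> bool) -> Vec<T> {
    let mut current = input.to_vec();
    let mut i = 0;
    while i < current.len() {
        let mut candidate = current.clone();
        candidate.remove(i);
        if fails(&candidate) {
            // Still failing without element i: keep the smaller input.
            current = candidate;
        } else {
            // Element i is needed to reproduce the failure: move on.
            i += 1;
        }
    }
    current
}

/// Example failure condition: the list of vertex ids contains a duplicate.
fn has_duplicate(ids: &[&str]) -> bool {
    let mut seen = HashSet::new();
    ids.iter().any(|id| !seen.insert(*id))
}
```

Applied to the ids a, b, a, c with has_duplicate as the failure condition, the pass discards b and c and leaves only the two colliding ids, which is the kind of reduced counterexample a shrinker aims to report.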

32.5 TypeScript tests

The TypeScript SDK has its own test suite:

cd sdk/typescript && pnpm test

TypeScript tests exercise the SDK’s public API, including WASM round-trips. They verify:

Schema builder fluent API produces valid schemas
Migration compile + lift produces correct output
Lens get/put round-tripping (GetPut and PutGet laws)
Error classes are thrown with correct types and messages
WasmHandle disposal and FinalizationRegistry safety net
MessagePack encoding matches Rust expectations

Type checking is also verified:

cd sdk/typescript && pnpm exec tsc --noEmit

32.6 Benchmarks with `divan`

Per-crate benchmarks use the divan framework. Each crate with performance-sensitive code has a benches/ directory:

cargo bench -p panproto-schema    # Schema building benchmarks
cargo bench -p panproto-mig       # Migration compile + lift benchmarks
cargo bench -p panproto-inst      # Instance parsing benchmarks

divan provides:

Statistical analysis (mean, median, standard deviation)
Automatic iteration count tuning
Comparison between groups via #[divan::bench(args = [...])]

32.6.1 CI regression detection

The bench.yml workflow runs on every PR against main. It:

Checks out both the main branch and the PR branch
Runs cargo bench --workspace on both
Compares results using benchmark-action/github-action-benchmark
Posts a comment on the PR with a comparison table
Alerts (but doesn’t fail) if any benchmark result exceeds 120% of the baseline, i.e., regresses by more than 20%

Important

Benchmark results are inherently noisy on shared CI runners. A 120% threshold means a benchmark must be more than 20% slower than baseline to trigger an alert. If you see a spurious alert, re-run the workflow; true regressions will be consistent.
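
The threshold arithmetic can be written out as a small helper. This is a sketch that assumes timings in nanoseconds and a strict greater-than comparison; exceeds_threshold is a hypothetical name, not part of the CI tooling.

```rust
/// Returns true when `current_ns` is more than `threshold_percent` percent of `baseline_ns`.
/// Assumes a strict comparison; the real CI action may treat the boundary differently.
fn exceeds_threshold(baseline_ns: u64, current_ns: u64, threshold_percent: u64) -> bool {
    // Compare current / baseline > threshold / 100 without floating point.
    // Widening to u128 keeps the products from overflowing.
    u128::from(current_ns) * 100 > u128::from(baseline_ns) * u128::from(threshold_percent)
}
```

With a 1000 ns baseline and a 120% threshold, a result of 1199 ns or 1200 ns does not alert, while 1201 ns does.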

32.7 VCS workflow tests

The panproto-vcs crate has approximately 63 tests organized into 20 groups, covering the full repository lifecycle. Each group targets a specific area, such as:

Repo lifecycle: init, add, commit, status, log, show
Branching: create, delete, force-delete, rename, list, verbose list
Merging: fast-forward, three-way, conflict detection, --no-commit, --ff-only, --no-ff, --squash, --abort
Cherry-pick: basic apply, -n (no-commit), -x (record origin), conflict handling
Rebase: linear replay, conflict stop, empty rebase
Stash: push, pop, apply, show, list, drop, clear
Reset: soft, mixed, hard modes
Blame: vertex, edge, constraint attribution
Bisect: convergence, single-step, boundary cases
GC: reachability marking, unreachable deletion, --dry-run
Reflog: entry creation, --all across refs
Tags: lightweight, annotated, force overwrite, delete
Compound workflows: branch-merge-rebase sequences, stash-across-checkout, amend after merge

Run the full VCS workflow suite:

cargo nextest run -p panproto-vcs --test workflows

32.8 CLI binary tests

The panproto-cli crate includes approximately 40 assert_cmd-based tests that exercise the schema binary end-to-end. These tests verify:

Argument parsing: correct flags are accepted, unknown flags produce errors
Output formatting: --oneline, --graph, --stat, --porcelain, --format produce expected output
Exit codes: success (0) for clean operations, non-zero for errors and conflicts
Error messages: user-facing diagnostics include actionable context (e.g., “remote operations are not yet supported”)
Remote stubs: all five remote commands (remote, push, pull, fetch, clone) exit with an error

Run the CLI test suite:

cargo nextest run -p panproto-cli --test cli_workflows

Property Tests vs. Integration Tests?

Property tests are best for algebraic invariants (commutativity, associativity, round-trip laws) where the property should hold for all valid inputs. Integration tests are best for specific workflows where you need to verify that multiple crates cooperate correctly on a concrete example.
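
As an example of a round-trip law stated as a property, the sketch below writes the GetPut and PutGet lens laws as predicates over a toy lens. Post, get, and put are hypothetical and stand in for a projection lens; they are not panproto's lens API.

```rust
/// Toy source record; the lens below focuses on its `text` field.
#[derive(Debug, Clone, PartialEq)]
struct Post {
    text: String,
    likes: u32,
}

/// Extracts the view (the text) from the source.
fn get(source: &Post) -> String {
    source.text.clone()
}

/// Writes an updated view back, keeping the parts of the source outside the view.
fn put(source: &Post, view: String) -> Post {
    Post {
        text: view,
        likes: source.likes,
    }
}

/// GetPut: putting back the view you just read leaves the source unchanged.
fn get_put_holds(source: &Post) -> bool {
    put(source, get(source)) == *source
}

/// PutGet: reading after a put returns exactly the view that was written.
fn put_get_holds(source: &Post, view: &str) -> bool {
    get(&put(source, view.to_string())) == view
}
```

Because both predicates should hold for every source and every view, they are natural targets for a property test; an integration test would instead pin down one concrete record and one concrete edit.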

32.9 Writing tests for a new feature

When adding a new feature, add tests at multiple levels:

Unit tests: in the crate where the feature lives, test individual functions
Property tests: if the feature has algebraic invariants (commutativity, associativity, round-trip laws), write proptest strategies
Integration test: if the feature involves multiple crates, add a test file in tests/integration/tests/
TypeScript test: if the feature is exposed through the SDK, add tests in sdk/typescript/
Benchmark: if the feature is on the hot path, add a divan benchmark
