32  Test Strategy and Property Testing

panproto uses a layered testing strategy: fast unit tests for individual functions, integration tests that exercise cross-crate workflows, property-based tests for algebraic invariants, workflow tests for the VCS and the CLI binary, TypeScript tests for the SDK, and benchmarks that guard against performance regressions.

32.1 Test pyramid

Layer               Tool           Scope                          Run command
Unit tests          cargo test     Per-module, per-crate          cargo test -p panproto-schema
Integration tests   cargo nextest  Cross-crate workflows          cargo nextest run -p panproto-integration
Property tests      proptest       Algebraic invariants           cargo test -p panproto-mig (included in unit tests)
TypeScript tests    pnpm test      SDK + WASM round-trips         cd sdk/typescript && pnpm test
VCS workflow tests  cargo nextest  Cross-module VCS workflows     cargo nextest run -p panproto-vcs --test workflows
CLI binary tests    assert_cmd     CLI argument parsing + output  cargo nextest run -p panproto-cli --test cli_workflows
Benchmarks          divan          Performance regression         cargo bench -p panproto-schema

Tip

Use cargo nextest run instead of cargo test for the integration suite. Nextest runs each test as a separate process, which avoids thread-local state contamination from the WASM slab allocator.

32.2 Unit tests

Each crate has inline #[cfg(test)] mod tests blocks adjacent to the code they test. The convention:

  • Test modules are annotated with #[allow(clippy::unwrap_used, clippy::expect_used)] since test assertions naturally use .unwrap()
  • Helper functions for constructing test fixtures are private to the test module
  • Test names describe the property being verified: constraint_obstruction_detected, duplicate_vertex_rejected
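The conventions above can be sketched as follows. This is a minimal illustration, not code from the panproto crates: ToySchema, ToyError, and add_vertex are hypothetical stand-ins, and only the test-module layout (the attributes, the private fixture helper, and the property-style test name) reflects the conventions described here.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a schema builder; not panproto's real type.
#[derive(Debug, Default)]
pub struct ToySchema {
    vertices: BTreeMap<String, String>, // vertex id -> kind
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyError {
    DuplicateVertex(String),
}

impl ToySchema {
    /// Adds a vertex, rejecting ids that are already present.
    pub fn add_vertex(&mut self, id: &str, kind: &str) -> Result<(), ToyError> {
        if self.vertices.contains_key(id) {
            return Err(ToyError::DuplicateVertex(id.to_string()));
        }
        self.vertices.insert(id.to_string(), kind.to_string());
        Ok(())
    }
}

#[cfg(test)]
#[allow(clippy::unwrap_used, clippy::expect_used)]
mod tests {
    use super::*;

    /// Fixture helper, private to the test module.
    fn schema_with_post() -> ToySchema {
        let mut schema = ToySchema::default();
        schema.add_vertex("post", "record").unwrap();
        schema
    }

    /// The name states the property being verified.
    #[test]
    fn duplicate_vertex_rejected() {
        let mut schema = schema_with_post();
        let err = schema.add_vertex("post", "object").unwrap_err();
        assert_eq!(err, ToyError::DuplicateVertex("post".to_string()));
    }
}
```

Keeping the fixture helper inside the test module means it is compiled only under cfg(test) and cannot leak into the crate's public surface.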

Run a single crate’s tests:

cargo test -p panproto-schema
cargo test -p panproto-mig
cargo test -p panproto-gat

32.3 Integration tests

The tests/integration crate contains 14 integration test files that exercise end-to-end workflows. Each file focuses on a specific scenario:

Test file                 Description
self_description.rs       Verifies ThGAT (the theory of GATs) is itself a well-formed GAT
atproto_recursive.rs      Recursive ATProto schema with nested threadViewPost projection
atproto_roundtrip.rs      Parse JSON lexicon, apply identity migration, verify round-trip fidelity
sql_migration.rs          SQL add-column, FK migration, and set-valued functor restrict
cross_protocol.rs         ATProto-to-SQL interop via shared ThGraph sub-theory
lens_laws.rs              GetPut and PutGet lens laws for identity and projection lenses
performance.rs            Projection lift throughput exceeds baseline threshold
wasm_boundary.rs          MessagePack serialization fidelity across the WASM boundary
breaking_change.rs        ATProto lexicon breaking changes detected by diff/classify pipeline
cql_subsumption.rs        CQL Sigma/Delta/Pi expressed as theory morphisms
cambria_subsumption.rs    Cambria-style combinators expressed as lens compositions
hypergraph_fan.rs         SQL FK as 4-ary hyperedge, column drop, fan reconstruction
theory_composition.rs     colimit(ThGraph, ThConstraint) produces ThConstrainedGraph
custom_protocol.rs        Define a new protocol from scratch, build schema, lift records

Run the full integration suite:

cargo nextest run -p panproto-integration

Run a single integration test file (the --test flag selects the test binary by file name):

cargo nextest run -p panproto-integration --test atproto_roundtrip
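The GetPut and PutGet laws checked by lens_laws.rs can be illustrated with a toy projection lens. This is a sketch under stated assumptions: Post, get, and put are hypothetical and much simpler than panproto's lens types; only the two laws themselves are taken from the table above.

```rust
/// Hypothetical source record; not a panproto type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Post {
    pub text: String,
    pub likes: u32,
}

/// Projection lens, forward direction: view only the `text` field.
pub fn get(source: &Post) -> String {
    source.text.clone()
}

/// Projection lens, backward direction: write the view back while keeping
/// the fields the view does not cover.
pub fn put(source: &Post, view: String) -> Post {
    Post {
        text: view,
        likes: source.likes,
    }
}

/// GetPut: putting back an unmodified view leaves the source unchanged.
pub fn get_put_holds(source: &Post) -> bool {
    put(source, get(source)) == *source
}

/// PutGet: reading after a write returns exactly what was written.
pub fn put_get_holds(source: &Post, view: &str) -> bool {
    get(&put(source, view.to_string())) == view
}
```

A lens that silently dropped likes in put would still satisfy PutGet but would break GetPut, which is why both laws are checked.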

32.4 Property-based testing with proptest

Property-based tests verify algebraic invariants that must hold for all valid inputs. The proptest crate generates random inputs and shrinks failing cases to minimal counterexamples.

32.4.1 Heavily property-tested functions

The existence checker is a prime example of a function with rich algebraic properties:

pub fn check_existence(
    protocol: &Protocol,
    src: &Schema,
    tgt: &Schema,
    migration: &Migration,
    theory_registry: &HashMap<String, Theory>,
) -> ExistenceReport {
    // ... body elided ...
}

Properties tested for check_existence include:

  • Identity migration is always valid: mapping every vertex to itself with consistent edges must produce a valid report
  • Kind-preserving maps are consistent: if all mapped vertices have matching kinds, no KindInconsistency errors appear
  • Constraint monotonicity: loosening a constraint never introduces ConstraintTightened errors
  • Composition preserves validity: if migration A is valid and migration B is valid, their composition should pass the same checks
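To make the kind-preservation and composition properties concrete, here is a toy model that uses only the standard library. It is an assumption-laden sketch: Kinds, VertexMap, kind_preserving, compose, and identity are hypothetical names, and the real check_existence inspects far more than vertex kinds.

```rust
use std::collections::BTreeMap;

/// Toy schema: vertex id -> kind. Far simpler than panproto's `Schema`.
pub type Kinds = BTreeMap<&'static str, &'static str>;

/// Toy migration: source vertex id -> target vertex id.
pub type VertexMap = BTreeMap<&'static str, &'static str>;

/// Returns true when every mapped vertex exists on both sides with the same kind.
pub fn kind_preserving(src: &Kinds, tgt: &Kinds, map: &VertexMap) -> bool {
    map.iter().all(|(from, to)| match (src.get(from), tgt.get(to)) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    })
}

/// Composes two vertex maps. Vertices that `first` sends outside the domain
/// of `second` are dropped from the result.
pub fn compose(first: &VertexMap, second: &VertexMap) -> VertexMap {
    first
        .iter()
        .filter_map(|(from, mid)| second.get(mid).map(|to| (*from, *to)))
        .collect()
}

/// Identity map on a schema's vertices.
pub fn identity(schema: &Kinds) -> VertexMap {
    schema.keys().map(|id| (*id, *id)).collect()
}
```

In this model, if a map from A to B and a map from B to C are both kind-preserving, their composition is kind-preserving from A to C, because equality of kinds is transitive. A property test asserts the same implication over generated schemas rather than one hand-picked triple.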

32.4.2 Writing a proptest strategy for a new type

To property-test a new type, define a proptest strategy that generates valid instances. The pattern:

use proptest::prelude::*;

/// Strategy for generating a valid Vertex.
fn arb_vertex() -> impl Strategy<Value = Vertex> {
    // Generate an id (a lowercase letter followed by up to 19 characters from
    // [a-z0-9.:-], so 1-20 characters in total) and a kind from a fixed set.
    let id = "[a-z][a-z0-9.:-]{0,19}";
    let kind = prop_oneof![
        Just("object".to_string()),
        Just("string".to_string()),
        Just("integer".to_string()),
        Just("record".to_string()),
    ];
    (id, kind).prop_map(|(id, kind)| Vertex {
        id,
        kind,
        nsid: None,
    })
}

/// Strategy for generating a valid Schema with 1-10 vertices.
fn arb_schema() -> impl Strategy<Value = Schema> {
    // The inclusive range `1..=10` matches the documented 1-10 vertices;
    // `1..10` would stop at 9.
    prop::collection::vec(arb_vertex(), 1..=10)
        .prop_flat_map(|vertices| {
            // Generate edges between existing vertices...
            // Build the schema...
        })
}

proptest! {
    #[test]
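    fn identity_migration_is_valid(schema in arb_schema()) {
        // `protocol` and `registry` are fixtures assumed to be built elsewhere
        // in the test module (for example, by private helper functions).
        let migration = build_identity_migration(&schema);
        let report = check_existence(&protocol, &schema, &schema, &migration, &registry);
        prop_assert!(report.valid, "identity migration should always be valid");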
    }
}
Note

Strategies should generate structurally valid inputs (well-formed schemas with consistent vertices and edges). Testing with completely random bytes isn’t useful; the interesting properties live in the space of valid structures.
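One way to keep generated inputs structurally valid is to generate the vertices first and then draw edge endpoints as indices into that vertex list, so an edge can never refer to a missing vertex. The sketch below shows the idea with the standard library only. ToyEdge, Lcg, and arb_edges are hypothetical, and the small linear congruential generator merely stands in for proptest's random source. In a proptest strategy, the same dependency would typically be expressed inside prop_flat_map by sampling from 0..vertices.len().

```rust
/// Hypothetical edge type: endpoints are indices into a vertex list.
/// Not panproto's real edge representation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToyEdge {
    pub src: usize,
    pub tgt: usize,
}

/// Tiny deterministic generator (a 64-bit LCG) standing in for a real RNG.
pub struct Lcg(pub u64);

impl Lcg {
    /// Returns a pseudo-random value in `0..bound`; `bound` must be non-zero.
    pub fn below(&mut self, bound: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % bound
    }
}

/// Generates `edge_count` edges whose endpoints always index existing vertices.
/// Returns no edges when there are no vertices to connect.
pub fn arb_edges(rng: &mut Lcg, vertex_count: usize, edge_count: usize) -> Vec<ToyEdge> {
    if vertex_count == 0 {
        return Vec::new();
    }
    (0..edge_count)
        .map(|_| ToyEdge {
            src: rng.below(vertex_count),
            tgt: rng.below(vertex_count),
        })
        .collect()
}
```

Because validity holds by construction, no generated case is wasted on inputs that the schema builder would reject before the property under test is even reached.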

32.4.3 Shrinking

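When proptest finds a failing case, it automatically shrinks the input to a minimal counterexample. For complex types like Schema, this means reducing the number of vertices and edges to the smallest set that still triggers the failure. The shrunk counterexample is printed in the test output, and the failing case is persisted to a regression file (by default under a proptest-regressions/ directory) so that later runs replay it before generating new inputs. Committing these files keeps the regression covered for everyone.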
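The idea behind shrinking can be sketched with a greedy loop that removes one element at a time and keeps the removal whenever the failure still reproduces. This is a deliberately simplified illustration, not proptest's actual algorithm (which shrinks through its value trees), and shrink_vec is a hypothetical helper.

```rust
/// Greedily removes elements while the property keeps failing.
/// `fails` returns true when the given input still triggers the bug.
pub fn shrink_vec<T: Clone>(input: &[T], fails: impl Fn(&[T]) -> bool) -> Vec<T> {
    let mut current = input.to_vec();
    let mut i = 0;
    while i < current.len() {
        let mut candidate = current.clone();
        candidate.remove(i);
        if fails(&candidate) {
            // The element at `i` was not needed to reproduce the failure.
            current = candidate;
        } else {
            // The element is essential; keep it and try the next one.
            i += 1;
        }
    }
    current
}
```

For a bug that only appears when a list contains both 3 and 7, the loop reduces an input such as 1, 3, 5, 7, 9 to just the two essential elements. For a Schema, the analogous step is dropping vertices and edges until removing anything more makes the test pass.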

32.5 TypeScript tests

The TypeScript SDK has its own test suite:

cd sdk/typescript && pnpm test

TypeScript tests exercise the SDK’s public API, including WASM round-trips. They verify:

  • Schema builder fluent API produces valid schemas
  • Migration compile + lift produces correct output
  • Lens get/put round-tripping (GetPut and PutGet laws)
  • Error classes are thrown with correct types and messages
  • WasmHandle disposal and FinalizationRegistry safety net
  • MessagePack encoding matches Rust expectations

Type checking is also verified:

cd sdk/typescript && pnpm exec tsc --noEmit

32.6 Benchmarks with divan

Per-crate benchmarks use the divan framework. Each crate with performance-sensitive code has a benches/ directory:

cargo bench -p panproto-schema    # Schema building benchmarks
cargo bench -p panproto-mig       # Migration compile + lift benchmarks
cargo bench -p panproto-inst      # Instance parsing benchmarks

divan provides:

  • Statistical analysis (mean, median, standard deviation)
  • Automatic iteration count tuning
  • Comparison between groups via #[divan::bench(args = [...])]

32.6.1 CI regression detection

The bench.yml workflow runs on every PR against main. It:

  1. Checks out both the main branch and the PR branch
  2. Runs cargo bench --workspace on both
  3. Compares results using benchmark-action/github-action-benchmark
  4. Posts a comment on the PR with a comparison table
  5. Alerts (but doesn’t fail) if any benchmark result exceeds 120% of the baseline, that is, if it regresses by more than 20%
Important

Benchmark results are inherently noisy on shared CI runners. A 120% threshold means a benchmark must run more than 20% slower than the baseline to trigger an alert. If you see a spurious alert, re-run the workflow; true regressions will reproduce consistently.
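The threshold arithmetic can be written out as a small function. This is a sketch of the comparison only: exceeds_threshold is a hypothetical helper, timings are assumed to be lower-is-better, and the use of a strict comparison is an assumption rather than something this chapter specifies.

```rust
/// Returns true when `current_ns` exceeds `threshold_percent` of `baseline_ns`.
/// With a 120% threshold, anything more than 20% slower than baseline alerts.
pub fn exceeds_threshold(baseline_ns: f64, current_ns: f64, threshold_percent: f64) -> bool {
    // Lower is better for timings, so the ratio grows as performance degrades.
    let ratio_percent = current_ns / baseline_ns * 100.0;
    ratio_percent > threshold_percent
}
```

With a baseline of 100 ns, a result of 119 ns stays under the threshold, while 125 ns triggers an alert. Results that are faster than the baseline never alert.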

32.7 VCS workflow tests

The panproto-vcs crate has approximately 63 tests organized into 20 groups, covering the full repository lifecycle. Each group targets a specific area, including:

  • Repo lifecycle: init, add, commit, status, log, show
  • Branching: create, delete, force-delete, rename, list, verbose list
  • Merging: fast-forward, three-way, conflict detection, --no-commit, --ff-only, --no-ff, --squash, --abort
  • Cherry-pick: basic apply, -n (no-commit), -x (record origin), conflict handling
  • Rebase: linear replay, conflict stop, empty rebase
  • Stash: push, pop, apply, show, list, drop, clear
  • Reset: soft, mixed, hard modes
  • Blame: vertex, edge, constraint attribution
  • Bisect: convergence, single-step, boundary cases
  • GC: reachability marking, unreachable deletion, --dry-run
  • Reflog: entry creation, --all across refs
  • Tags: lightweight, annotated, force overwrite, delete
  • Compound workflows: branch-merge-rebase sequences, stash-across-checkout, amend after merge
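As an example of what a convergence check for bisect can look like, the sketch below models a linear history as a slice of booleans and counts how many commits must be tested. It is a toy under stated assumptions: the bisect function here is hypothetical and unrelated to the panproto-vcs API, which this chapter does not show.

```rust
/// Finds the first bad commit in a linear history in which every commit
/// before it is good and every commit from it onward is bad. Returns the
/// index of the first bad commit and the number of commits tested.
///
/// Assumed preconditions: the first commit is good and the last one is bad.
pub fn bisect(is_bad: &[bool]) -> (usize, u32) {
    assert!(is_bad.len() >= 2, "need at least one good and one bad commit");
    let mut good = 0; // index of a known-good commit
    let mut bad = is_bad.len() - 1; // index of a known-bad commit
    let mut steps = 0;
    while bad - good > 1 {
        let mid = good + (bad - good) / 2;
        steps += 1;
        if is_bad[mid] {
            bad = mid;
        } else {
            good = mid;
        }
    }
    (bad, steps)
}
```

The three cases named in the list map onto this model directly: convergence means the number of steps stays logarithmic in the history length, single-step covers a three-commit history, and the boundary case is two adjacent commits, where no test is needed at all.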

Run the full VCS workflow suite:

cargo nextest run -p panproto-vcs --test workflows

32.8 CLI binary tests

The panproto-cli crate includes approximately 40 assert_cmd-based tests that exercise the schema binary end-to-end. These tests verify:

  • Argument parsing: correct flags are accepted, unknown flags produce errors
  • Output formatting: --oneline, --graph, --stat, --porcelain, --format produce expected output
  • Exit codes: success (0) for clean operations, non-zero for errors and conflicts
  • Error messages: user-facing diagnostics include actionable context (e.g., “remote operations are not yet supported”)
  • Remote stubs: all five remote commands (remote, push, pull, fetch, clone) exit with an error

Run the CLI test suite:

cargo nextest run -p panproto-cli --test cli_workflows

Tip: Property Tests vs. Integration Tests?

Property tests are best for algebraic invariants (commutativity, associativity, round-trip laws) where the property should hold for all valid inputs. Integration tests are best for specific workflows where you need to verify that multiple crates cooperate correctly on a concrete example.

32.9 Writing tests for a new feature

When adding a new feature, add tests at multiple levels:

  1. Unit tests: in the crate where the feature lives, test individual functions
  2. Property tests: if the feature has algebraic invariants (commutativity, associativity, round-trip laws), write proptest strategies
  3. Integration test: if the feature involves multiple crates, add a test file in tests/integration/tests/
  4. TypeScript test: if the feature is exposed through the SDK, add tests in sdk/typescript/
  5. Benchmark: if the feature is on the hot path, add a divan benchmark