10 Schema Version Control
You already use version control for code. panproto extends that idea to schemas themselves: every schema, migration, and commit gets a blake3 hash based on its structure. Two schemas with the same shape produce the same hash, even if created by different teams. Identity comes from content, not names.
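As a rough illustration of structure-based identity, the sketch below hashes a canonical serialization of a schema graph. The `schema_hash` function and the vertex/edge encoding are assumptions made for this example, not panproto's actual format, and `hashlib.blake2b` stands in for blake3, which is not in Python's standard library.

```python
import hashlib
import json

def schema_hash(vertices, edges):
    """Hash a schema graph by structure, ignoring declaration order (hypothetical helper)."""
    canonical = {
        "vertices": sorted(vertices),
        "edges": sorted([src, label, dst] for src, label, dst in edges),
    }
    # Canonical JSON: sorted keys and no whitespace, so equal graphs serialize identically.
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # Assumption: blake2b is only a standard-library stand-in for blake3.
    return hashlib.blake2b(payload, digest_size=32).hexdigest()

# Two teams declare the same graph in a different order.
team_a = schema_hash(["post", "user"], [("post", "author", "user")])
team_b = schema_hash(["user", "post"], [("post", "author", "user")])
# A third schema differs in one edge label.
changed = schema_hash(["user", "post"], [("post", "editor", "user")])

print(team_a == team_b)   # True: same shape, same hash
print(team_a == changed)  # False: different structure, different hash
```

The point of the canonical form is that anything not part of the structure (file name, declaration order, whitespace) never reaches the hash function.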
Each commit stores a schema graph \(G\), the morphism \(f: G_{\mathrm{parent}} \to G\) from its parent, and the complement \(C\) for backward migration.
10.1 Thinking like git
panproto’s version control uses a familiar vocabulary:
| Git concept | panproto equivalent |
|---|---|
| Blob (file content) | Schema (graph structure) |
| Diff/patch | Migration (graph morphism \(f: G_1 \to G_2\)) |
| Three-way merge | Structural merge (pushout of schemas) |
| Content-addressing (SHA) | Content-addressing (blake3) |
| Commit DAG | Schema evolution DAG |
| Branch | Named pointer to a schema version |
| Conflict | Structural incompatibility |
The key difference: git merges text and hopes the result compiles. panproto merges structure and guarantees data integrity.
10.2 The object store
Storage mirrors git’s layout.
.panproto/
  objects/<hex[0..2]>/<hex[2..]>   # content-addressed objects
  refs/heads/main                  # branch pointers
  refs/tags/v1.0                   # tag pointers
  HEAD                             # current branch
  logs/                            # reflog (audit trail)
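The `objects/<hex[0..2]>/<hex[2..]>` pattern can be sketched as a small path helper. The function name `object_path` and the shortened example ID are hypothetical; only the two-character sharding comes from the layout above.

```python
from pathlib import PurePosixPath

def object_path(hex_digest, root=".panproto"):
    """Map a hex object ID to <root>/objects/<first two chars>/<remaining chars>."""
    if len(hex_digest) < 3:
        raise ValueError("object ID too short to shard")
    return PurePosixPath(root) / "objects" / hex_digest[:2] / hex_digest[2:]

# Shortened, made-up ID for illustration only.
print(object_path("ab12cd34"))  # .panproto/objects/ab/12cd34
```

Sharding by the first two hex characters keeps any single directory from accumulating every object in the store, which is the same reason git uses this layout.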
10.3 Core commands
The CLI borrows git’s vocabulary where concepts align.
10.3.1 schema init
Creates a .panproto/ directory with the object store, refs, and HEAD pointing to main. Unlike git, there is no working tree; the repository tracks one schema at a time.
schema init
schema init my-project
schema init -b develop # choose a different initial branch name
10.3.2 schema add
Stages a schema for the next commit. The command loads the schema JSON, computes a structural diff against HEAD’s schema, auto-derives a migration (the graph morphism), and validates it through the existence checker.
As of v0.6, schema add also type-checks all equations in the schema’s theory and verifies that the auto-derived migration preserves them. Invalid schemas are caught before they enter the repository.
If the diff involves vertex renames or edge contractions that can’t be auto-derived, supply an explicit migration file:
schema add schema.json
schema add schema.json --migration explicit-migration.json
schema add --dry-run schema.json # preview without writing to the index
schema add --force schema.json # stage even if validation fails
10.3.3 schema commit
Creates a commit object storing the schema’s content hash, the migration’s content hash, parent commit IDs, and metadata. It advances the current branch ref and appends a reflog entry. Every commit carries enough information to migrate instance data forward or backward.
schema commit -m "add verification status field"
schema commit --amend -m "add verification status field (v2)"
schema commit --skip-verify -m "experimental: partial theory"
Equation verification runs again at commit time as a final safety check. The --skip-verify flag bypasses type-checking for experimental work; use it sparingly.
10.3.4 schema diff
Computes a structural diff between two schema files. It reports added and removed vertices and edges, kind changes, constraint changes, variant changes, and ordering changes.
Git’s diff operates on lines of text. schema diff operates on the schema graph: it knows that adding a vertex differs from adding an edge, and that removing a coproduct variant is a type error while removing an optional field is lossy.
schema diff old-schema.json new-schema.json
schema diff --stat old-schema.json new-schema.json # summary statistics
schema diff --staged # diff staged against HEAD
schema diff --detect-renames old-schema.json new-schema.json
The --detect-renames flag uses structural similarity heuristics to identify renamed vertices and edge labels. Detected renames are shown with confidence scores.
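To make the graph-level diff concrete, here is a minimal sketch over a toy schema encoding (a vertex-to-kind map plus a set of labelled edges). The encoding, the function names, and the Jaccard-similarity heuristic with its threshold are all assumptions for illustration; panproto's real diff covers more element types, and its rename heuristics are not specified here.

```python
def diff_schemas(old, new):
    """Structural diff over vertices (name -> kind) and edges (src, label, dst)."""
    old_v, new_v = old["vertices"], new["vertices"]
    return {
        "added_vertices": sorted(new_v.keys() - old_v.keys()),
        "removed_vertices": sorted(old_v.keys() - new_v.keys()),
        "kind_changes": sorted(
            (v, old_v[v], new_v[v])
            for v in old_v.keys() & new_v.keys()
            if old_v[v] != new_v[v]
        ),
        "added_edges": sorted(new["edges"] - old["edges"]),
        "removed_edges": sorted(old["edges"] - new["edges"]),
    }

def signature(schema, vertex):
    """Fingerprint of a vertex: its kind plus labels and directions of incident edges."""
    sig = {("kind", schema["vertices"][vertex])}
    for src, label, dst in schema["edges"]:
        if src == vertex:
            sig.add(("out", label))
        if dst == vertex:
            sig.add(("in", label))
    return sig

def detect_renames(old, new, threshold=0.6):
    """Pair each removed vertex with the most similar added vertex (Jaccard score)."""
    d = diff_schemas(old, new)
    renames = []
    for removed in d["removed_vertices"]:
        best = None
        for added in d["added_vertices"]:
            a, b = signature(old, removed), signature(new, added)
            score = len(a & b) / len(a | b)
            if score >= threshold and (best is None or score > best[2]):
                best = (removed, added, score)
        if best is not None:
            renames.append(best)
    return renames

old = {"vertices": {"user": "object", "status": "string"},
       "edges": {("user", "status", "status")}}
new = {"vertices": {"user": "object", "user_status": "string"},
       "edges": {("user", "status", "user_status")}}

print(diff_schemas(old, new)["removed_vertices"])  # ['status']
print(detect_renames(old, new))                    # [('status', 'user_status', 1.0)]
```

A plain diff reports the change as one removal plus one addition; the similarity pass is what turns that pair into a rename candidate with a score.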
10.3.5 schema merge
Merges a branch into the current branch by finding the merge base (lowest common ancestor in the DAG), computing diffs from the base to both tips, and applying non-conflicting changes.
Merge is where panproto diverges most from git. Git operates on lines of text and produces conflict markers. panproto computes the structural merge of the two divergent schemas over their common base.[1]
One-sided changes are accepted. Identical changes from both sides are deduplicated. Incompatible changes produce typed conflicts: not text markers, but structured objects like BothModifiedVertex, BothModifiedConstraint, DeleteModifyVertex. Fast-forward merges work as in git.
As of v0.6, the merge algorithm uses pullback-based overlap detection to identify shared structure between divergent schemas.
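The acceptance and conflict rules above can be sketched as a three-way merge over vertex-to-kind maps. This is a simplified model under stated assumptions: `merge_vertices` is a hypothetical name, only vertices are handled, and the conflict objects are reduced to `(type, name)` tuples borrowing the type names mentioned above. It does not reproduce the pullback-based overlap detection.

```python
def merge_vertices(base, ours, theirs):
    """Three-way merge of vertex -> kind maps. Returns (merged, conflicts)."""
    merged, conflicts = {}, []
    for name in sorted(base.keys() | ours.keys() | theirs.keys()):
        # None means "vertex absent on this side".
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:        # unchanged on both sides, or identical change (deduplicated)
            result = o
        elif o == b:      # only theirs changed: accept their side
            result = t
        elif t == b:      # only ours changed: accept our side
            result = o
        else:             # both sides changed, and differently
            if o is None or t is None:
                conflicts.append(("DeleteModifyVertex", name))
            else:
                conflicts.append(("BothModifiedVertex", name))
            continue
        if result is not None:
            merged[name] = result
    return merged, conflicts

base = {"user": "object", "bio": "string", "age": "integer"}
ours = {"user": "object", "bio": "text", "age": "integer", "email": "string"}
theirs = {"user": "object", "bio": "markdown"}

merged, conflicts = merge_vertices(base, ours, theirs)
print(merged)     # {'email': 'string', 'user': 'object'}
print(conflicts)  # [('BothModifiedVertex', 'bio')]

# Swapping the branches gives the same outcome.
print(merge_vertices(base, theirs, ours) == (merged, conflicts))  # True
```

Here the one-sided deletion of `age` and the one-sided addition of `email` are accepted, while the two different edits to `bio` surface as a typed conflict rather than a text marker.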
schema merge feature
schema merge --verbose feature # show overlap computation details
schema merge --no-commit feature # leave the result staged
schema merge --ff-only feature # only allow fast-forward
schema merge --no-ff feature # force a merge commit
schema merge --squash feature # squash into a single commit
schema merge --abort # abort a conflicted merge
schema merge -m "merge feature into main" feature
10.3.6 schema lift
Migrates instance data from one schema version to another. The command finds the path between two refs in the DAG, composes all migrations along the path into a single morphism \(f: G_1 \to G_n\), and applies the functorial lift to the record.
Git tracks file history but can’t transform file contents based on structural changes. panproto can, because migrations are graph morphisms with a well-defined action on data.
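schema lift --migration mig.json --src-schema v1.json --tgt-schema v2.json record.json
If the DAG has two distinct paths from commit \(A\) to commit \(B\), does schema lift produce the same result on both paths? What guarantees this?
Functoriality, together with a commutativity condition on the DAG. Migration morphisms compose associatively: if \(f: A \to B\), \(g: B \to C\), and \(h: C \to D\), then \(h \circ (g \circ f) = (h \circ g) \circ f\), so the composite along a single path does not depend on how the composition is parenthesized. Associativity alone says nothing about two different paths. That is the job of the DAG’s commutativity constraint, which ensures that for any two paths from \(A\) to \(B\), the composed morphisms are equal. Since the lift is determined by the composed morphism, both paths transform a record the same way. This is a coherence condition on the category of schemas.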
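A toy model may help separate the two properties. Below, a migration is reduced to a vertex-renaming map (a Python dict) and the lift renames record fields along it. This encoding and the names `compose` and `lift` are assumptions for illustration; real migrations carry more than renames.

```python
def compose(g, f):
    """Return g ∘ f: apply f first, then g. Both are vertex maps (dicts)."""
    return {v: g[f[v]] for v in f}

def lift(migration, record):
    """Toy lift: rename each field of the record along the vertex map."""
    return {migration[field]: value for field, value in record.items()}

# One path: A -> B -> C -> D.
f = {"name": "display_name", "mail": "email"}
g = {"display_name": "displayName", "email": "email"}
h = {"displayName": "handle", "email": "contact"}

# Associativity: parenthesization along a single path does not matter.
path_1 = compose(h, compose(g, f))
print(path_1 == compose(compose(h, g), f))  # True

# A second path: A -> B' -> D. Equality with path_1 is the commutativity condition.
f2 = {"name": "full_name", "mail": "mail"}
g2 = {"full_name": "handle", "mail": "contact"}
path_2 = compose(g2, f2)
print(path_1 == path_2)  # True

record = {"name": "Ada", "mail": "ada@example.org"}
print(lift(path_1, record) == lift(path_2, record))  # True

# Functoriality: lifting step by step equals lifting the composite.
print(lift(g, lift(f, record)) == lift(compose(g, f), record))  # True
```

In this toy, nothing forces `path_1 == path_2`; it holds because the example maps were chosen to commute, which is exactly the condition the DAG is said to enforce.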
10.4 Inspecting history
10.4.1 schema log
Walks the commit DAG from HEAD backwards, printing each commit’s ID, author, timestamp, message, and schema hash. For merge commits, shows all parent IDs.
schema log
schema log -n 5
schema log --oneline # compact one-line format
schema log --graph # visualize branch topology
schema log --all # show all branches
10.4.2 schema show
Inspects any object in the store by ref name or object ID. For commits, shows the schema ID, parent IDs, migration ID, protocol, author, and message. For schemas, shows vertex and edge counts.
schema show main
schema show v1.0
schema show --stat main # change summary statistics
10.4.3 schema blame
Shows which commit introduced a specific schema element: a vertex, an edge, or a constraint. It walks the DAG backwards from HEAD, comparing each commit’s schema to its parent’s. You can ask “who added the user_status vertex?” or “when was the maxLength constraint on text changed?”
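The backward walk can be sketched for a linear history as follows. The `blame` function, the commit IDs, and the representation of a schema as a set of element names are hypothetical simplifications; a real DAG walk would also have to handle merge commits with several parents.

```python
def blame(history, element):
    """Find the commit that most recently introduced `element`.

    history: list of (commit_id, elements) ordered oldest first; the last entry is HEAD.
    Returns None if the element is not present at HEAD.
    """
    if element not in history[-1][1]:
        return None
    # Walk from HEAD towards the root, comparing each commit with its parent.
    for i in range(len(history) - 1, -1, -1):
        commit_id, elements = history[i]
        parent_elements = history[i - 1][1] if i > 0 else set()
        if element in elements and element not in parent_elements:
            return commit_id
    return None

history = [
    ("c1", {"user"}),
    ("c2", {"user", "post"}),
    ("c3", {"user", "post", "user_status"}),
    ("c4", {"user", "post", "user_status"}),
]

print(blame(history, "user_status"))  # c3
print(blame(history, "user"))         # c1
print(blame(history, "comment"))      # None
```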
schema blame --element-type vertex user_status
schema blame --element-type constraint "text:maxLength"
10.5 Branching and merging
10.5.1 schema branch
Creates, lists, or deletes branches. Branches are lightweight pointers to commit IDs.
schema branch # list all branches
schema branch feature # create a branch
schema branch -d feature # delete (safe: must be merged)
schema branch -D feature # force-delete
schema branch -m old-name new-name # rename
10.5.2 schema checkout
Switches HEAD to the named branch, or detaches HEAD at a specific commit.
schema checkout feature
schema checkout -b feature # create and switch in one step
schema checkout --detach main # detach HEAD at a specific ref
10.5.3 schema rebase
Replays the current branch’s commits onto another branch. It finds the merge base, collects all commits from the base to HEAD, then replays each one on top of the target via three-way merge.
Each replay step uses structural three-way merge rather than textual merge. This detects schema-level conflicts (like incompatible kind changes) that git rebase would miss.
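The first two steps of a rebase (finding the merge base and collecting the commits to replay) can be sketched on a parent map. The function names and commit IDs are hypothetical, and `commits_to_replay` assumes a linear first-parent chain from the base to HEAD; the replay itself would then apply a structural three-way merge per commit.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit` through parent links, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def merge_base(parents, a, b):
    """A lowest common ancestor: a common ancestor that is not an ancestor of another one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    lowest = [
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    ]
    # Sort only to make the choice deterministic if several candidates exist.
    return sorted(lowest)[0] if lowest else None

def commits_to_replay(parents, base, head):
    """Commits after `base` up to `head`, oldest first (assumes a linear first-parent chain)."""
    chain, c = [], head
    while c != base:
        chain.append(c)
        c = parents[c][0]
    return list(reversed(chain))

# root <- m1 <- m2 (main), and m1 <- f1 <- f2 (feature)
parents = {"root": [], "m1": ["root"], "m2": ["m1"], "f1": ["m1"], "f2": ["f1"]}

base = merge_base(parents, "f2", "m2")
print(base)                                  # m1
print(commits_to_replay(parents, base, "f2"))  # ['f1', 'f2']
```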
schema rebase main
10.5.4 schema cherry-pick
Applies a single commit’s schema change to the current branch. It extracts the diff between the commit and its parent, then three-way merges it onto HEAD.
schema cherry-pick abc1234...
schema cherry-pick -n abc1234... # apply without committing
schema cherry-pick -x abc1234... # record source commit in message
schema merge claims the structural merge is commutative: merging branch A into B produces the same schema as merging B into A. Does this imply the migration morphisms from the base to each merge result are also identical?
The schemas are identical, but the morphism provenance can differ. Merging A into B produces pushout legs \(A \to M\) and \(B \to M\). Merging B into A produces legs \(B \to M'\) and \(A \to M'\). If \(M = M'\) (same merged schema), the legs are the same pair of morphisms, just listed in opposite order. Because the pushout square commutes, composing either leg with the morphism from the base to that tip gives the same morphism from the base to \(M\). The provenance metadata differs, but the mathematical content is identical.
10.6 Recovery
10.6.1 schema reset
Moves HEAD to a different commit. Three modes match git:
--soft: moves the ref only; the index is unchanged
--mixed (default): moves the ref and clears the index
--hard: moves the ref, clears the index, and overwrites the working schema
All modes append a reflog entry, so the old position is recoverable.
schema reset --soft v1.0
schema reset --hard HEAD~3
10.6.2 schema reflog
Shows the history of mutations to a ref. Every commit, merge, checkout, reset, rebase, and cherry-pick appends an entry recording the old and new ref values. If a reset or rebase moves HEAD past a commit you need, the reflog has the old ID.
schema reflog
schema reflog main
schema reflog --all # show reflogs for all refs
10.6.3 schema bisect
Runs a binary search for the commit that introduced a breaking change. Given a known-good and a known-bad commit, bisect finds the path between them and presents the midpoint for testing. It converges in \(O(\log n)\) steps, where \(n\) is the number of commits on that path.
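The search itself is ordinary bisection over the path between the two commits. In the sketch below the `bisect` function, the commit IDs, and the `is_bad` predicate are hypothetical; in practice the "test" at each midpoint is whatever check tells you whether the breaking change is present.

```python
def bisect(path, is_bad):
    """Find the first bad commit on a path.

    path: commits ordered oldest first; path[0] is known good, path[-1] is known bad.
    Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi, tests = 0, len(path) - 1, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(path[mid]):
            hi = mid   # breakage is at mid or earlier
        else:
            lo = mid   # breakage is after mid
    return path[hi], tests

# 17 commits c0..c16; the breaking change lands in c11.
path = [f"c{i}" for i in range(17)]
first_bad, tests = bisect(path, lambda c: int(c[1:]) >= 11)

print(first_bad)  # c11
print(tests)      # 4
```

With 16 steps between the good and bad endpoints, four tests suffice, matching the logarithmic bound.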
schema bisect v1.0 HEAD
10.7 Other commands
10.7.1 schema tag
Creates, lists, or deletes tags. Tags are immutable pointers to commits, typically used for schema releases.
schema tag v1.0
schema tag -d v1.0
schema tag -a v1.0 -m "release" # annotated tag
10.7.2 schema gc
Marks all objects reachable from branches, tags, and HEAD, then deletes everything else.
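This is a mark-and-sweep pass over the object graph. The sketch below models the store as a dict from object ID to the IDs it references; the `gc` function and the object names are hypothetical, and a real implementation would read references out of stored commit objects.

```python
def gc(objects, roots):
    """Mark everything reachable from the roots, then sweep the rest.

    objects: object ID -> list of object IDs it references.
    roots: IDs pointed at by branches, tags, and HEAD.
    Returns (kept, deleted).
    """
    marked, stack = set(), list(roots)
    while stack:
        oid = stack.pop()
        if oid in objects and oid not in marked:
            marked.add(oid)
            stack.extend(objects[oid])
    deleted = sorted(objects.keys() - marked)
    kept = {oid: refs for oid, refs in objects.items() if oid in marked}
    return kept, deleted

objects = {
    "commit2": ["commit1", "schema2", "mig2"],
    "commit1": ["schema1"],
    "schema1": [],
    "schema2": [],
    "mig2": [],
    "orphan_commit": ["orphan_schema"],
    "orphan_schema": [],
}

kept, deleted = gc(objects, roots=["commit2"])
print(deleted)  # ['orphan_commit', 'orphan_schema']
```

Returning the deleted IDs without touching the input mirrors what a dry run needs: the same marking pass, with the sweep only reported.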
schema gc
schema gc --dry-run # preview without deleting
10.8 Quick start
schema init
schema add schema.json
schema commit -m "initial ATProto schema"
schema branch add-verification
schema checkout add-verification
schema add schema-v2.json
schema commit -m "add verification status field"
schema checkout main
schema merge add-verification
Each commit stores a complement \(C\) for backward migration. If a forward migration \(f: G_1 \to G_2\) adds a required field with no default, what does \(C\) contain? Can every forward migration be reversed?
The complement \(C\) is empty for this direction. A forward migration that adds a required field supplies a value for the new field (normally via a default); it does not discard anything. The complement records what the get operation threw away, and in the forward direction nothing is thrown away. Reversal of this migration (going backward) would discard the added field, and that direction’s complement would store the discarded value. Not every forward migration can be reversed.
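The role of the complement can be sketched with a pair of functions in the style of a lens. The names `get` and `put`, the record encoding, and the field names are assumptions for illustration, not panproto's API: `get` projects a record onto a smaller schema and returns what it dropped, and `put` rebuilds the richer record from the view plus that complement.

```python
def get(record, dropped_fields):
    """Project a record onto the smaller schema; return (view, complement)."""
    view = {k: v for k, v in record.items() if k not in dropped_fields}
    complement = {k: v for k, v in record.items() if k in dropped_fields}
    return view, complement

def put(view, complement):
    """Rebuild the richer record from a view plus the stored complement."""
    return {**view, **complement}

v2_record = {"handle": "ada", "verified": True}

# Going backward drops the added field; the complement keeps its value.
view, complement = get(v2_record, {"verified"})
print(view)        # {'handle': 'ada'}
print(complement)  # {'verified': True}

# With the complement, the round trip is lossless.
print(put(view, complement) == v2_record)  # True

# Without it, the discarded value is gone.
print(put(view, {}) == v2_record)          # False

# When nothing is dropped, the complement is empty.
print(get({"handle": "ada"}, set())[1])    # {}
```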
10.9 A note on VCS inspiration
panproto’s schema version control draws on two traditions. The first is git itself: content-addressing, DAG-structured history, lightweight branches, and the reflog. The second is patch-theory version control systems like Pijul and Darcs, which model changes as first-class mathematical objects.[2]
The payoff is that structure-aware diffs compose better than text diffs. A schema migration composes with another migration to produce a valid migration. A text patch composed with another text patch might produce gibberish. That composability is what lets panproto guarantee data integrity where git’s merge can only guarantee syntactic non-conflict.
10.9.1 Remote commands (planned)
The commands schema remote, schema push, schema pull, schema fetch, and schema clone are reserved for future distributed operations. Currently they return an error indicating that schema repositories are local-only.
[1] The structural merge is a categorical pushout of the two schemas over their common ancestor. For every schema element (vertices, edges, constraints, hyper-edges, variants, orderings, recursion points, usage modes), the merge classifies each side’s change as unchanged, added, removed, or modified. The merge is commutative: swapping the two branches produces an identical result.
[2] Pijul’s patches form a category where composition is associative. panproto’s migrations are morphisms in a category of schemas. Both systems benefit from the same insight: when your “diffs” are mathematical objects with well-defined composition rules, merge conflicts become algebraic problems rather than heuristic ones.