Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Migrations as morphisms

In plain terms

A migration is what gets you from version 3 of your schema to version 4, with all your existing version-3 data carried forward into version-4 shape. Hand-written migrations are usually a script that walks each record, renames fields, fills in defaults, drops what is no longer there.

panproto represents a migration as a structured map between the two schemas: for every vertex (record kind) in the new schema, where does it come from in the old one; for every edge (field, item, variant), how is it derived. The structured map is called a morphism. Once you have a morphism, two operations follow:

  • Restrict moves the morphism backwards: it tells you what part of the old schema is needed to produce a given part of the new one.
  • Lift moves data forwards: it takes a record that conforms to the old schema and produces a record that conforms to the new one, using the morphism to know what to put where.

Lift is the operation you usually want; restrict is the operation panproto uses internally to figure out which old fields are required for the migration to succeed. Both are total functions on the things they apply to; if a migration cannot be lifted (because some required input is missing), panproto-check says so up front rather than failing partway through.

The formal picture

Schemas live in a category whose objects are schema theories and whose morphisms are theory morphisms. A migration from schema to schema is a morphism in this category. The migration engine is split into two functors:

  • Restrict : pulls a -instance back to an -instance along . Used to check existence conditions: which -records does require to be present?
  • Lift : pushes an -instance forward to a -instance along . Used to actually migrate data.

The pair forms an adjunction in the categories of instances: lift is left adjoint to restrict. (This follows the convention in Spivak & Wisnesky (2015), where is the dependent sum over the fibres of and is its right adjoint by base change.) The adjunction is the structure that lets panproto check, before any data moves, whether a migration is well-defined.

A migration may also include a value-level transform: not just where a field comes from, but how its value is computed. These are written in the expression language and applied during lift.

Three classes of migration

ClassDiff classificationExampleLift behaviour
Fully compatibleRefinementAdd an optional field with a default.Total; old records lift unchanged.
Backward compatibleInclusionAdd a required field whose value is computed from existing fields.Total; old records lift via the value-level transform.
BreakingNeitherRemove a required field with no recovery.Partial; some old records cannot be lifted. panproto-check flags this.

panproto-check runs the existence check on a migration without applying it, so CI can gate on the classification before merge. See Breaking-change gate.

What this gives you

  • You write the migration map (what goes where), not the migration script (how to execute it). The script is generated from the map and the schema theories.
  • You get a precise classification of whether the change is breaking, before any data is touched.
  • You get a bidirectional artifact: the lift forward is paired with a put backward, so the migration is also a lens. Round-trip laws apply.
  • Migrations compose. If and , then is a migration whose lift agrees with applying ’s lift then ’s. The migration history of a schema is a chain of these compositions.

See also