flowchart LR
S[Source Instance] -->|get| V[View]
S -->|get| C[Complement]
V -->|put| R[Restored Instance]
C -->|put| R
R ---|should equal| S
10 panproto-lens: Bidirectional Lenses
The Complement struct captures everything that a forward projection discards. When get projects a source instance onto a target view, some data is inevitably lost: nodes without counterparts in the target schema, arcs connecting dropped nodes, fans whose children were pruned, and structural decisions made during ancestor contraction. The Complement records all of it.
/// The complement: data discarded by `get`, needed by `put` to restore the
/// original source instance.
///
/// When `get` projects a source instance to a target view, some nodes, arcs,
/// and structural decisions are lost. The complement records all of this so
/// that `put` can reconstruct the full source.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Complement {
/// Nodes from the source that do not appear in the target view.
pub dropped_nodes: HashMap<u32, Node>,
/// Arcs from the source that do not appear in the target view.
pub dropped_arcs: Vec<(u32, u32, Edge)>,
/// Fans from the source whose parent or children were dropped during `get`.
pub dropped_fans: Vec<Fan>,
/// Resolver decisions made during ancestor contraction.
pub contraction_choices: HashMap<(u32, u32), Edge>,
/// Original parent mapping before contraction.
pub original_parent: HashMap<u32, u32>,
/// Fingerprint of the source schema at `get` time, used by `put` to
/// validate that the complement matches the lens's source schema.
#[serde(default)]
pub source_fingerprint: u64,
/// Pre-transform `extra_fields` for nodes that had `field_transforms` applied.
/// Used by `put` to restore original field values.
#[serde(default, skip_serializing_if = "HashMap::is_empty")]
pub original_extra_fields: HashMap<u32, HashMap<String, panproto_inst::value::Value>>,
/// Exact edge used for every arc in the view, keyed by `(parent_id, child_id)`.
/// This makes `put` deterministic when the source schema has parallel edges
/// between the same vertex pair, ensuring the cartesian lift is unique.
#[serde(default, skip_serializing_if = "HashMap::is_empty")]
Each field records a specific category of lost information:
- `dropped_nodes`: a map from node ID to `Node` for every source node that doesn’t appear in the target view. These are the nodes whose anchor vertex wasn’t in the migration’s surviving set.
- `dropped_arcs`: source arcs where either the parent or child node was dropped. These arcs have no representation in the view.
- `dropped_fans`: source fans (hyper-edge instances) where the parent or any child was dropped during `get`. Fans model multi-arity relationships in protocols like SQL.
- `contraction_choices`: when `get` contracts intermediate vertices (connecting grandparent to grandchild directly), multiple edges may be candidates. This map records which edge was chosen for each contracted pair, so `put` can undo the contraction correctly.
- `original_parent`: the parent mapping for surviving nodes before any contraction occurred. During `put`, this enables restoration of the original tree topology even when contraction changed the parent-child relationships.
A complement is empty when the transformation is lossless (e.g., a pure rename). The is_empty method provides a quick check. When the complement is non-empty, it must be stored alongside the view so that put can later restore the original.
In production systems, the complement is typically stored alongside the view in a metadata sidecar: a separate column in a database, a companion file, or an opaque blob attached to the record. The complement is serializable via serde, so it can be stored in any format that supports Serialize/Deserialize.
10.1 Forward direction: get
Formally, \(\mathrm{get}: S \to V \times C\) where \(S\) is the source instance type, \(V\) is the view type, and \(C\) is the complement type. The function runs the forward (restrict) pipeline, producing both the projected view and its complement.
/// Pre-coercion `node.value` for nodes that had `__value__` field transforms applied.
/// Used by `put()` to restore the original leaf value.
#[serde(default, skip_serializing_if = "HashMap::is_empty")]
pub original_values: HashMap<u32, Option<panproto_inst::value::FieldPresence>>,
}
impl Complement {
/// Create an empty complement (no data discarded).
#[must_use]
pub fn empty() -> Self {
Self {
Internally, `get` delegates to `wtype_restrict` from panproto-inst to compute the view, then calculates the set difference between the source and result to populate each complement field:
- View computation: calls `wtype_restrict` with the lens’s compiled migration. This produces the projected instance under the target schema.
- Dropped nodes: iterates over source nodes, collecting any not present in the view.
- Dropped arcs: iterates over source arcs, collecting any where either endpoint was dropped.
- Contraction choices: for each arc in the view that connects two nodes not directly connected in the source, records the resolver edge that was used.
- Original parents: for each surviving node in the view, records what its parent was in the source instance (before contraction may have changed the parent).
- Dropped fans: iterates over source fans, collecting any where the parent or a child was dropped.
The resulting (WInstance, Complement) pair contains all the information from the original source, just partitioned into “visible” (the view) and “invisible” (the complement).
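The set-difference step can be illustrated on a toy instance. This is a minimal sketch and not panproto's implementation: `ToyInstance`, `ToyComplement`, and `toy_get` are hypothetical names, nodes carry only a label, and the surviving set is passed in directly rather than derived from a compiled migration through `wtype_restrict`. Contraction, fans, and original parents are left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical toy instance: node IDs mapped to labels, plus (parent, child) arcs.
#[derive(Clone, Debug, PartialEq, Eq, Default)]
pub struct ToyInstance {
    pub nodes: BTreeMap<u32, String>,
    pub arcs: Vec<(u32, u32)>,
}

/// Hypothetical toy complement: only the dropped nodes and dropped arcs.
#[derive(Clone, Debug, PartialEq, Eq, Default)]
pub struct ToyComplement {
    pub dropped_nodes: BTreeMap<u32, String>,
    pub dropped_arcs: Vec<(u32, u32)>,
}

/// Project `source` onto the nodes in `surviving`; everything else goes
/// into the complement.
pub fn toy_get(source: &ToyInstance, surviving: &BTreeSet<u32>) -> (ToyInstance, ToyComplement) {
    let mut view = ToyInstance::default();
    let mut complement = ToyComplement::default();

    for (id, label) in &source.nodes {
        if surviving.contains(id) {
            view.nodes.insert(*id, label.clone());
        } else {
            complement.dropped_nodes.insert(*id, label.clone());
        }
    }
    for &(parent, child) in &source.arcs {
        // An arc stays in the view only if both endpoints survive.
        if surviving.contains(&parent) && surviving.contains(&child) {
            view.arcs.push((parent, child));
        } else {
            complement.dropped_arcs.push((parent, child));
        }
    }
    (view, complement)
}
```

Every source node and arc ends up in exactly one of the two outputs, which is the partition into "visible" and "invisible" described above.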
10.2 Backward direction: put
Formally, \(\mathrm{put}: V \times C \to S\). The function restores a source instance from a (possibly modified) view and the stored complement. The reconstruction proceeds in stages:
- Copy view nodes: all nodes from the view are copied, with their anchors un-remapped from target vertex IDs back to source vertex IDs using the inverse of the compiled migration’s vertex remap.
- Re-insert dropped nodes: every node recorded in the complement’s `dropped_nodes` is inserted back into the node set.
- Rebuild arcs from original parents: for each surviving node, the `original_parent` mapping tells us who its parent was in the source. The function finds the appropriate edge (consulting `contraction_choices` for disambiguation) and creates an arc.
- Re-insert dropped arcs: arcs from the complement are added back, provided both endpoints are present in the restored node set.
- Reconstruct fans: view fans are un-remapped; dropped fans whose nodes are all present are re-inserted.
The result is a full source instance that incorporates any edits made to the view while preserving the structural elements that were invisible to the view.
If the view was modified between get and put (e.g., a field value was changed by an API consumer), those modifications are preserved in the restored instance. The complement provides the structure that the view can’t represent; the view provides the content that may have been updated.
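A reduced version of the reconstruction can be sketched as follows. The types and the function `toy_put` are hypothetical; the sketch covers only three of the stages (copy view nodes, re-insert dropped nodes, re-insert dropped arcs whose endpoints exist) and assumes no vertex remapping, contraction, or fans.

```rust
use std::collections::BTreeMap;

/// Hypothetical toy instance: node IDs mapped to labels, plus (parent, child) arcs.
#[derive(Clone, Debug, PartialEq, Eq, Default)]
pub struct ToyInstance {
    pub nodes: BTreeMap<u32, String>,
    pub arcs: Vec<(u32, u32)>,
}

/// Hypothetical toy complement: only the dropped nodes and dropped arcs.
#[derive(Clone, Debug, PartialEq, Eq, Default)]
pub struct ToyComplement {
    pub dropped_nodes: BTreeMap<u32, String>,
    pub dropped_arcs: Vec<(u32, u32)>,
}

/// Rebuild a source instance from a (possibly edited) view and a complement.
pub fn toy_put(view: &ToyInstance, complement: &ToyComplement) -> ToyInstance {
    // Stage 1: copy view nodes, so edits made to the view carry over.
    let mut nodes = view.nodes.clone();

    // Stage 2: re-insert dropped nodes (a view node with the same ID wins).
    for (id, label) in &complement.dropped_nodes {
        nodes.entry(*id).or_insert_with(|| label.clone());
    }

    // Stage 3: keep the view's arcs, then add back dropped arcs whose
    // endpoints are both present in the restored node set.
    let mut arcs = view.arcs.clone();
    for &(parent, child) in &complement.dropped_arcs {
        if nodes.contains_key(&parent) && nodes.contains_key(&child) {
            arcs.push((parent, child));
        }
    }
    // Sort so the result does not depend on insertion order.
    arcs.sort_unstable();

    ToyInstance { nodes, arcs }
}
```

The endpoint check in stage 3 mirrors the "provided both endpoints are present" condition: an arc pointing at a node that no longer exists is silently skipped rather than restored.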
10.3 The get/put round-trip
The complement acts as a side channel, carrying exactly the information that the view can’t represent. Together, the view and complement contain all the information in the source, just partitioned differently.
Consider a concrete example. Suppose a source schema has fields {name, email, age} and the target schema has only {name, email}. The get direction produces a view with {name, email} and a complement containing the age node, its arc from the parent, and any constraints. If an API consumer updates email in the view, calling put with the modified view and the original complement produces a source instance with the updated email, the original name, and the original age. Nothing is lost.
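The {name, email, age} example can be written out with flat records. This is a sketch under simplifying assumptions: `Record`, `get_fields`, `put_fields`, and `example_round_trip` are hypothetical, a source instance is modeled as a map from field name to string value, and arcs and constraints are ignored.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an instance: field name -> value.
pub type Record = BTreeMap<String, String>;

/// Split `source` into the kept fields (view) and everything else (complement).
pub fn get_fields(source: &Record, kept: &[&str]) -> (Record, Record) {
    let mut view = Record::new();
    let mut complement = Record::new();
    for (field, value) in source {
        if kept.contains(&field.as_str()) {
            view.insert(field.clone(), value.clone());
        } else {
            complement.insert(field.clone(), value.clone());
        }
    }
    (view, complement)
}

/// Merge a (possibly edited) view with the stored complement.
pub fn put_fields(view: &Record, complement: &Record) -> Record {
    let mut restored = complement.clone();
    for (field, value) in view {
        // View values win, so consumer edits are preserved.
        restored.insert(field.clone(), value.clone());
    }
    restored
}

/// The scenario from the text: drop `age`, edit `email`, restore.
pub fn example_round_trip() -> Record {
    let mut source = Record::new();
    source.insert("name".to_string(), "Ada".to_string());
    source.insert("email".to_string(), "ada@old.example".to_string());
    source.insert("age".to_string(), "36".to_string());

    let (mut view, complement) = get_fields(&source, &["name", "email"]);
    view.insert("email".to_string(), "ada@new.example".to_string());
    put_fields(&view, &complement)
}
```

The restored record carries the edited `email`, the untouched `name`, and the `age` that only the complement knew about.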
10.4 Lens laws
panproto lenses satisfy two laws. For an introductory explanation, see the tutorial; this section focuses on how the laws are property-tested.
GetPut law: \(\mathrm{put}(\mathrm{get}(s).\mathrm{view},\; \mathrm{get}(s).\mathrm{complement}) = s\)
PutGet law: \(\mathrm{get}(\mathrm{put}(v, c)) = (v, c)\)
10.4.1 Property testing
These laws are property-tested via the check_laws function, which:
- Generates random source schemas using the protocol’s theory.
- Generates random source instances conforming to those schemas.
- Constructs lenses from random combinator chains.
- Verifies both GetPut and PutGet for each generated triple.
The property tests run as part of the standard cargo test suite and also in the CI pipeline with extended iteration counts.
When a law fails, the check_laws function returns a LawViolation struct with the source instance, the view, the complement, and which law failed. Reproduce the failure with the printed instance, then bisect through the get/put stages to find which stage introduced the discrepancy. Most violations trace to a missing contraction_choices entry or an incorrect original_parent mapping.
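The shape of such a check can be sketched generically. The signature of the real `check_laws` is not shown here, so everything below is an assumption: `ToyViolation` and `check_toy_laws` are hypothetical, the lens is passed as a pair of closures, and a single source plus a single edited view stand in for randomly generated inputs.

```rust
/// Which law failed (hypothetical, loosely modeled on the idea of `LawViolation`).
#[derive(Debug, PartialEq, Eq)]
pub enum ToyViolation {
    GetPut,
    PutGet,
}

/// Check both lens laws for one source and one edited view.
pub fn check_toy_laws<S, V, C>(
    get: impl Fn(&S) -> (V, C),
    put: impl Fn(&V, &C) -> S,
    source: &S,
    edited_view: &V,
) -> Result<(), ToyViolation>
where
    S: PartialEq,
    V: PartialEq,
    C: PartialEq,
{
    // GetPut: putting back an unmodified view restores the source exactly.
    let (view, complement) = get(source);
    if put(&view, &complement) != *source {
        return Err(ToyViolation::GetPut);
    }

    // PutGet: after putting an edited view, get returns that same view
    // and the same complement.
    let updated = put(edited_view, &complement);
    let (view_again, complement_again) = get(&updated);
    if view_again != *edited_view || complement_again != complement {
        return Err(ToyViolation::PutGet);
    }
    Ok(())
}
```

PutGet is checked with a view that differs from `get(source)` on purpose: if the check reused the unmodified view, it would pass automatically whenever GetPut passes.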
10.5 Cambria-style combinators
Rather than constructing Migration objects and compiling them by hand, the combinator API lets you describe schema transformations as a chain of atomic operations. Each combinator represents a single, well-understood schema change.
The combinator semantics described below are implemented as elementary protolens constructors in protolens::elementary (e.g., elementary::rename_sort, elementary::drop_sort, elementary::add_sort). See Section 19.7 for the full list.
// Conceptual combinator enum (implemented via protolens::elementary)
pub enum Combinator {
RenameField { old: String, new: String },
AddField { name: String, kind: String, default: Value },
RemoveField { name: String },
WrapInObject { field_name: String },
HoistField { host: String, field: String },
CoerceType { from_kind: String, to_kind: String },
Compose(Box<Combinator>, Box<Combinator>),
}
10.5.1 Combinator semantics
Each combinator has a precise effect on the schema and a defined impact on the complement:
| Combinator | Effect | Lossless? | Complement Impact |
|---|---|---|---|
| `RenameField` | Changes an edge label | Yes | Empty complement |
| `AddField` | Introduces a new vertex with default | Yes | Default recorded |
| `RemoveField` | Drops a vertex and incident edges | No | Dropped nodes/arcs stored |
| `WrapInObject` | Inserts intermediate object vertex | No | Re-parenting tracked |
| `HoistField` | Moves nested field to grandparent | No | Original topology tracked |
| `CoerceType` | Changes kind of matching vertices | Depends | Kind mapping stored if lossy |
| `Compose` | Sequential composition | Depends | Union of sub-complements |
RenameField is the simplest combinator: it changes an edge’s name field from old to new. No data is lost because the edge still connects the same vertices. The complement is empty.
AddField introduces a new vertex (with a specified kind and default value) connected to the root by a new edge. During get, the new field appears in the view with its default value. During put, if the view’s value differs from the default, that value is preserved.
RemoveField drops a vertex and all edges incident on it. The complement stores the dropped node and its arcs. This is the canonical lossy combinator.
WrapInObject creates a new intermediate object vertex and re-parents existing children of the root under it. The complement tracks the original parent-child relationships.
HoistField moves a field from a nested object up to its grandparent. The complement tracks the original nesting so put can restore it.
CoerceType changes the kind of all vertices matching from_kind to to_kind. Whether this is lossless depends on the kinds involved. string to text might be lossless; float to integer is not.
Compose enables sequential composition of two combinators within the enum itself, providing a recursive structure for complex transformations.
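To make the complement impact concrete, here is a sketch that applies a subset of these combinators to a flat record. It is a toy model, not the protolens implementation: `ToyCombinator`, `Record`, and `apply_forward` are hypothetical, values are plain strings, and the graph-level variants (`WrapInObject`, `HoistField`, `CoerceType`) are omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an instance: field name -> value.
pub type Record = BTreeMap<String, String>;

/// A reduced, hypothetical version of the conceptual combinator enum.
#[derive(Clone, Debug)]
pub enum ToyCombinator {
    RenameField { old: String, new: String },
    AddField { name: String, default: String },
    RemoveField { name: String },
    Compose(Box<ToyCombinator>, Box<ToyCombinator>),
}

/// Apply `combinator` in the forward direction. Anything it discards is
/// moved into `dropped`, which plays the role of the complement.
pub fn apply_forward(combinator: &ToyCombinator, record: &mut Record, dropped: &mut Record) {
    match combinator {
        // Lossless: the value moves to a new key, nothing is dropped.
        ToyCombinator::RenameField { old, new } => {
            if let Some(value) = record.remove(old) {
                record.insert(new.clone(), value);
            }
        }
        // The new field appears with its default unless already present.
        ToyCombinator::AddField { name, default } => {
            record.entry(name.clone()).or_insert_with(|| default.clone());
        }
        // Lossy: the removed value is kept in the complement.
        ToyCombinator::RemoveField { name } => {
            if let Some(value) = record.remove(name) {
                dropped.insert(name.clone(), value);
            }
        }
        // Sequential composition; the complement accumulates across steps.
        ToyCombinator::Compose(first, second) => {
            apply_forward(first, record, dropped);
            apply_forward(second, record, dropped);
        }
    }
}
```

Because both halves of a `Compose` write into the same `dropped` map, the composed complement is the union of the sub-complements, matching the last row of the table.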
10.5.2 Naming combinators
In addition to the six atomic combinators above and Compose, seven naming combinators operate on the nine naming sites described in Section 18.2:
| Combinator | Site | Effect |
|---|---|---|
| `RenameVertex` | `VertexId` | Rename vertex ID, cascading to all edges, constraints, variants |
| `RenameKind` | `VertexKind` | Change a single vertex’s kind |
| `RenameEdgeKind` | `EdgeKind` | Rename edge kind across all matching edges |
| `RenameNsid` | `Nsid` | Change the NSID on a specific vertex |
| `RenameConstraintSort` | `ConstraintSort` | Rename constraint sort across all constraints |
| `ApplyTheoryMorphism` | Multiple | Apply theory-level sort/op renames (morphism tower) |
| `Rename` | Any | Unified dispatcher for any NameSite |
All naming combinators are lossless (empty complement). RenameVertex is the most complex: it cascades to edges, constraints, required sets, variants, recursion points, spans, hyper-edges, and nominal markers. ApplyTheoryMorphism uses TheoryMorphism::induce_schema_renames() to cascade sort renames to vertex kinds and op renames to edge kinds.
In build_compiled_migration, RenameVertex and Rename { site: VertexId, .. } produce vertex_remap and edge_remap entries. All other naming combinators produce no structural migration impact; they change schema metadata only.
10.5.3 Building a lens from combinators
The from_combinators function takes a source schema and a slice of combinators, applies each in sequence, and produces a fully compiled Lens:
- For each combinator, derive the next schema by applying the atomic transformation.
- Build a compiled migration for that step.
- Compose the step migration with the accumulated result.
- Return a `Lens` containing the composed compiled migration, the original source schema, and the final target schema.
You never need to manually construct vertex maps, edge maps, or resolvers when using combinators. The target schema and migration are derived automatically from the combinator chain.
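The fold over the combinator slice can be sketched as below. This is an illustration only: `ToyStep`, `ToyLens`, `ToyError`, and `toy_from_combinators` are hypothetical, a schema is reduced to a set of field names, and the "compiled migration" is just the list of applied steps.

```rust
use std::collections::BTreeSet;

/// Hypothetical atomic schema changes on a set of field names.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ToyStep {
    Rename { old: String, new: String },
    Add { name: String },
    Remove { name: String },
}

/// Hypothetical error, loosely modeled on the idea of a missing field.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyError {
    FieldNotFound(String),
}

/// Hypothetical result: source schema, derived target schema, applied steps.
#[derive(Debug, PartialEq, Eq)]
pub struct ToyLens {
    pub source: BTreeSet<String>,
    pub target: BTreeSet<String>,
    pub steps: Vec<ToyStep>,
}

/// Derive the target schema by applying each step to the running schema.
pub fn toy_from_combinators(
    source: &BTreeSet<String>,
    steps: &[ToyStep],
) -> Result<ToyLens, ToyError> {
    let mut current = source.clone();
    for step in steps {
        match step {
            ToyStep::Rename { old, new } => {
                if !current.remove(old) {
                    return Err(ToyError::FieldNotFound(old.clone()));
                }
                current.insert(new.clone());
            }
            ToyStep::Add { name } => {
                current.insert(name.clone());
            }
            ToyStep::Remove { name } => {
                if !current.remove(name) {
                    return Err(ToyError::FieldNotFound(name.clone()));
                }
            }
        }
    }
    Ok(ToyLens { source: source.clone(), target: current, steps: steps.to_vec() })
}
```

Each step is validated against the schema produced by the previous step, not against the original source, so a chain such as rename-then-remove can refer to the new name.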
Combinators are inspired by the Cambria lens language (Clarke 2020). panproto’s combinators operate on the graph-level schema representation rather than on JSON paths, making them protocol-agnostic. The same combinator chain works for JSON Schema, Protobuf, SQL, or any other supported protocol.
10.6 Complement storage strategies
In production deployments, the complement must be persisted alongside the view. Common strategies:
- Database column: store the serialized complement as a JSONB or BLOB column alongside the view data. This is the simplest approach and works well when the complement is small relative to the view.
- Companion file: for file-based storage (e.g., Parquet datasets), store the complement in a sidecar file with a `.complement.json` extension.
- Metadata header: for protocols with metadata support (e.g., HTTP responses), encode the complement in a custom header or trailer.
- Separate table: for relational databases, store complements in a dedicated table keyed by record ID and schema version. This avoids schema modifications to the main table.
The Complement type derives Serialize and Deserialize, so any serde-compatible format works: JSON, MessagePack, CBOR, etc.
10.7 Interaction with breaking change detection
Lenses and breaking change detection (Chapter 11) are complementary:
- A `RemoveField` combinator always produces a non-empty complement. The breaking change detector (panproto-check) will classify the corresponding schema change as breaking (vertex removal).
- A `RenameField` combinator produces an empty complement. The breaking change detector will see the rename as a vertex removal plus vertex addition, classifying the removal as breaking. To avoid this, use the migration engine’s rename support instead.
- A `CoerceType` combinator may or may not be breaking depending on the kinds involved. The breaking change detector classifies all kind changes as breaking, which aligns with the conservative interpretation.
In practice, you can use panproto-check to decide whether a lens is necessary: if the compatibility report says “fully compatible,” a migration suffices. If it says “breaking,” a lens with complement tracking is the appropriate tool.
auto_generate vs. manual combinator chains?
Use auto_generate when the schema diff is simple (renames, drops, additions) and you want a quick lens without manual work. Use manual combinator chains when you need precise control over contraction resolvers, non-obvious coercions, or multi-step restructurings. auto_generate calls diff_to_lens internally, which may not handle complex restructurings the way you want.
10.8 Protolens modules
The panproto-lens crate includes a protolens layer that lifts concrete lenses to schema-parameterized families. The protolens layer operates at the theory level, deriving lens families that can be instantiated against any schema conforming to the theory.
10.8.1 protolens.rs
File: crates/panproto-lens/src/protolens.rs
Exports:
- `Protolens`: a schema-parameterized lens. Given any schema that satisfies a precondition, it produces a concrete lens. Categorically, this is a natural transformation between theory endofunctors.[1]
- `ProtolensChain`: a sequence of protolenses for vertical composition.
- `ComplementConstructor`: a type describing how the complement varies with the schema.
- `vertical_compose` / `horizontal_compose`: composition operations.
- `elementary::*`: nine atomic protolens constructors (`add_sort`, `drop_sort`, `rename_sort`, `add_op`, `drop_op`, `rename_op`, `add_equation`, `drop_equation`, `pullback`).
The Protolens::instantiate method is the complete operation: given a schema and protocol, it applies the source and target transformations to compute source and target schemas, derives a CompiledMigration between them, and returns a concrete Lens.[2]
For full details, see Chapter 19.
10.8.2 Relationship to cambria-style combinators
The conceptual Cambria-style combinators described above (rename, add, remove, etc.) are implemented as elementary protolens constructors. Each combinator maps to a corresponding protolens::elementary function that operates at the theory level, producing a Protolens that can be instantiated against any schema satisfying its precondition. The diff_to_protolens module provides automated derivation of protolens chains from schema diffs.
10.9 Optic classification
File: crates/panproto-lens/src/optic.rs
Not every schema change is a lens. The OpticKind enum classifies protolens chains into five optic types, enabling complement optimization:
| Kind | Complement | When |
|---|---|---|
| `Iso` | Unit (none needed) | All-rename chains, lossless coercions |
| `Lens` | Dropped data | Any sort/op drops, additions with defaults |
| `Prism` | Variant tag | Accessing a field in a union type |
| `Affine` | (variant tag, dropped data) | Lens composed with Prism |
| `Traversal` | Position list | Mapping over array elements |
The OpticKind::compose method implements the standard optics lattice composition rules: Iso is the identity element, Traversal absorbs everything, and mixing Lens with Prism yields Affine.
The classify_transform function maps each TheoryTransform to its optic kind. When a protolens chain is classified as Iso, the engine skips complement computation entirely. This is a significant optimization for the common case where schema changes are renames and lossless coercions.
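The composition rules can be sketched as a small enum. `ToyOpticKind`, `classify_chain`, and `needs_complement` are hypothetical names; the three rules stated above are taken from the text, while the remaining cases (for example Affine composed with Lens) follow the standard optics lattice and are an assumption about what `OpticKind::compose` does.

```rust
/// Hypothetical mirror of the five optic kinds.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ToyOpticKind {
    Iso,
    Lens,
    Prism,
    Affine,
    Traversal,
}

impl ToyOpticKind {
    /// Compose two optic kinds.
    pub fn compose(self, other: Self) -> Self {
        match (self, other) {
            // Iso is the identity element.
            (Self::Iso, kind) | (kind, Self::Iso) => kind,
            // Traversal absorbs everything.
            (Self::Traversal, _) | (_, Self::Traversal) => Self::Traversal,
            // Same-kind composition stays in that kind.
            (Self::Lens, Self::Lens) => Self::Lens,
            (Self::Prism, Self::Prism) => Self::Prism,
            // Any remaining mix of Lens, Prism, and Affine is Affine.
            _ => Self::Affine,
        }
    }

    /// An Iso chain needs no complement; every other kind does.
    pub fn needs_complement(self) -> bool {
        self != Self::Iso
    }
}

/// Classify a whole chain by folding `compose` over its steps,
/// starting from the identity element.
pub fn classify_chain(kinds: &[ToyOpticKind]) -> ToyOpticKind {
    kinds.iter().fold(ToyOpticKind::Iso, |acc, kind| acc.compose(*kind))
}
```

A chain consisting only of `Iso` steps folds to `Iso`, which is the case where complement computation can be skipped.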
10.10 Symbolic simplification
File: crates/panproto-lens/src/symbolic.rs
Protolens chains composed from multiple schema migrations often contain redundant steps. The simplify_steps function applies algebraic rewrite rules in a fixpoint loop:
- Inverse cancellation: `rename(A,B)` followed by `rename(B,A)` is removed.
- Rename fusion: `rename(A,B)` followed by `rename(B,C)` becomes `rename(A,C)`.
- Add-drop cancellation: `add(X)` followed by `drop(X)` is removed.
A chain composed from three separate schema migrations might have 15 steps, many of which cancel or fuse. Symbolic simplification reduces this to the minimal equivalent chain before any data is touched.
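A fixpoint loop over these three rules might look like the sketch below. It is not the code in `symbolic.rs`: `ToyStep`, `rewrite_pair`, and `simplify_sketch` are hypothetical, steps carry only names, and rules are applied to adjacent pairs only.

```rust
/// Hypothetical chain step carrying only the names involved.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ToyStep {
    Rename(String, String),
    Add(String),
    Drop(String),
}

/// Convenience constructor for tests and examples.
pub fn rename(from: &str, to: &str) -> ToyStep {
    ToyStep::Rename(from.to_string(), to.to_string())
}

/// Try to rewrite two adjacent steps. `None` means no rule applies.
fn rewrite_pair(first: &ToyStep, second: &ToyStep) -> Option<Vec<ToyStep>> {
    match (first, second) {
        (ToyStep::Rename(a, b), ToyStep::Rename(b2, c)) if b == b2 => {
            if a == c {
                // Inverse cancellation: rename(A,B); rename(B,A) disappears.
                Some(Vec::new())
            } else {
                // Rename fusion: rename(A,B); rename(B,C) becomes rename(A,C).
                Some(vec![ToyStep::Rename(a.clone(), c.clone())])
            }
        }
        // Add-drop cancellation: add(X); drop(X) disappears.
        (ToyStep::Add(x), ToyStep::Drop(y)) if x == y => Some(Vec::new()),
        _ => None,
    }
}

/// Apply the rewrite rules repeatedly until a full pass changes nothing.
pub fn simplify_sketch(mut steps: Vec<ToyStep>) -> Vec<ToyStep> {
    loop {
        let mut changed = false;
        let mut out = Vec::with_capacity(steps.len());
        let mut i = 0;
        while i < steps.len() {
            if i + 1 < steps.len() {
                if let Some(replacement) = rewrite_pair(&steps[i], &steps[i + 1]) {
                    out.extend(replacement);
                    i += 2;
                    changed = true;
                    continue;
                }
            }
            out.push(steps[i].clone());
            i += 1;
        }
        steps = out;
        if !changed {
            return steps;
        }
    }
}
```

The outer loop matters: a fusion in one pass can create a new adjacent pair that cancels only in the next pass, and every successful rewrite shortens the chain, so the loop terminates.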
10.11 Module map
| Module | Exports | Purpose |
|---|---|---|
| `asymmetric` | `Complement`, `get`, `put` | Core lens primitives with complement tracking |
| `auto_lens` | `auto_generate`, `AutoLensConfig` | Automatic lens generation from schema pairs |
| `compose` | `compose` | Composition of lenses for multi-step transformations |
| `cost` | `complement_cost`, `chain_cost` | Lawvere metric cost computation |
| `diff_to_protolens` | `diff_to_protolens`, `diff_to_lens` | Derive protolens chains from schema diffs |
| `graph` | `LensGraph` | Weighted lens graph with Floyd-Warshall shortest paths |
| `laws` | `check_laws` | Property-based testing of GetPut and PutGet laws |
| `protolens` | `Protolens`, `ProtolensChain`, `ComplementConstructor`, `elementary::*` | Schema-parameterized lens families |
| `symmetric` | `SymmetricLens` | Symmetric lens with bidirectional complements |
| `error` | `LensError` | Error types for lens operations |
The crate root re-exports Lens, Complement, get, put, Protolens, ProtolensChain, ComplementConstructor, auto_generate, and the composition functions.
The cost and graph modules implement a Lawvere metric on protolens chains. Each ComplementConstructor has a numeric cost; LensGraph uses Floyd-Warshall to find the cheapest conversion path between any two schemas. For full details, see Chapter 25.
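For readers unfamiliar with the algorithm, a plain Floyd-Warshall pass over a cost matrix is sketched below. The `LensGraph` API is not shown in this chapter, so `cheapest_paths` is a hypothetical free function, schemas are identified by index, costs are assumed to be `u64`, and the input is assumed to be a square matrix.

```rust
/// All-pairs cheapest conversion cost.
///
/// `cost[i][j]` is the direct cost of a lens from schema `i` to schema `j`,
/// or `None` if no direct lens exists. Assumes a square matrix.
pub fn cheapest_paths(cost: &[Vec<Option<u64>>]) -> Vec<Vec<Option<u64>>> {
    let n = cost.len();
    let mut dist: Vec<Vec<Option<u64>>> = cost.to_vec();

    // Converting a schema to itself costs nothing.
    for i in 0..n {
        dist[i][i] = Some(0);
    }

    // Allow paths through intermediate schema k, for each k in turn.
    for k in 0..n {
        for i in 0..n {
            for j in 0..n {
                if let (Some(a), Some(b)) = (dist[i][k], dist[k][j]) {
                    let through_k = a.saturating_add(b);
                    // Keep the detour only if it beats the best known cost.
                    if dist[i][j].map_or(true, |best| through_k < best) {
                        dist[i][j] = Some(through_k);
                    }
                }
            }
        }
    }
    dist
}
```

The result has zero cost from each schema to itself and satisfies the triangle inequality without requiring symmetry, so a `None` entry in one direction can coexist with a finite cost in the other.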
10.12 When to use lenses vs. migrations
| Scenario | Use Migration | Use Lens |
|---|---|---|
| Rename a field | Yes (bijective) | Yes (but unnecessary overhead) |
| Reorder fields | Yes (bijective) | Yes (but unnecessary overhead) |
| Remove a field | No (not invertible) | Yes (complement stores removed data) |
| Add a required field | Possible (with default) | Yes (complement stores default) |
| Change a field’s type | No (not invertible) | Yes (if coercion is defined) |
| Wrap fields in a new object | No (not invertible) | Yes (complement tracks re-parenting) |
| Multi-step evolution | Compose migrations | Chain combinators |
As a rule of thumb: if panproto-mig’s invert would succeed on your migration, use a migration. If it would fail, use a lens.
10.13 Performance considerations
Lens operations are slightly more expensive than plain migrations because of the complement computation:
- `get` performs one `wtype_restrict` call (same cost as `lift_wtype`) plus a set-difference pass over the source instance to compute the complement. The overhead is \(O(|S|)\), where \(|S|\) is the source node count.
- `put` performs a reconstruction pass that is \(O(|V| + |C|)\), where \(|V|\) is the view node count and \(|C|\) is the complement node count, with hash-map lookups for un-remapping and contraction choice disambiguation.
- Complement serialization depends on the complement size, which is proportional to the amount of data dropped during `get`.
For lossless transformations (empty complement), get and put have essentially the same cost as a migration lift and its inverse.
In batch scenarios where the same lens is applied to many instances, construct the lens once and reuse it. The Lens struct is Clone and can be shared across threads via Arc<Lens>. The compiled migration inside the lens is immutable after construction, so concurrent get calls are safe without synchronization.
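The sharing pattern looks like this in outline. `ToyLens` and `parallel_get` are hypothetical stand-ins: the toy lens keeps the first `keep` elements of a vector as the view and returns the rest as the complement, and one thread is spawned per instance for simplicity (a real batch job would more likely use a thread pool).

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical immutable lens: keeps the first `keep` elements as the view.
pub struct ToyLens {
    pub keep: usize,
}

impl ToyLens {
    /// Split `source` into (view, complement). Takes `&self`, so it can be
    /// called from many threads at once.
    pub fn get(&self, source: &[u32]) -> (Vec<u32>, Vec<u32>) {
        let split = self.keep.min(source.len());
        (source[..split].to_vec(), source[split..].to_vec())
    }
}

/// Apply one shared lens to many instances, one thread per instance.
/// Results are returned in the same order as the inputs.
pub fn parallel_get(
    lens: &Arc<ToyLens>,
    sources: Vec<Vec<u32>>,
) -> Vec<(Vec<u32>, Vec<u32>)> {
    let handles: Vec<thread::JoinHandle<(Vec<u32>, Vec<u32>)>> = sources
        .into_iter()
        .map(|source| {
            // Cloning the Arc copies a pointer, not the lens itself.
            let lens = Arc::clone(lens);
            thread::spawn(move || lens.get(&source))
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

No mutex is involved because `get` only reads from the lens; the same reasoning applies to any lens whose compiled state is immutable after construction.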
The complement size is the primary cost driver. A lens that drops a single leaf field produces a small complement (one node, one arc). A lens that removes an entire subtree produces a complement proportional to the subtree size. When designing lens-based APIs, consider whether the complement can be compressed or whether a more targeted lens (removing only the specific fields needed) would reduce overhead.
10.14 Batch mode and incremental mode
The panproto-lens crate supports two modes of operation. Batch mode (the Lens struct with get/put) operates on whole instances: the forward direction projects an entire source instance to a view and complement, and the backward direction restores the source from a modified view and stored complement. This is the mode used by VCS data migration (schema data migrate), format conversion, and law verification.
Incremental mode (the EditLens struct with get_edit/put_edit) operates on individual edits (TreeEdit values). Each edit flows through the lens, and the complement updates as a state machine. This mode is used by schema data sync --edits for live synchronization scenarios where re-migrating the entire data set on every edit would be impractical.
Both modes share the same compiled migration, vertex/edge remap tables, and complement structure. EditLens wraps a Lens and adds the incremental translation pipeline. The edit lens laws (Consistency and Complement coherence) guarantee that the incremental mode agrees with the batch mode.
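The agreement between the two modes can be illustrated with a toy edit lens. This sketch assumes flat records and two edit kinds; `ToyEdit`, `ToyEditLens`, `batch_get`, and `incremental_matches_batch` are hypothetical and do not reflect the actual `EditLens` or `TreeEdit` definitions.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for an instance: field name -> value.
pub type Record = BTreeMap<String, String>;

/// Hypothetical edit on a flat record.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ToyEdit {
    Set { field: String, value: String },
    Remove { field: String },
}

/// Apply one edit to a record.
pub fn apply_edit(record: &mut Record, edit: &ToyEdit) {
    match edit {
        ToyEdit::Set { field, value } => {
            record.insert(field.clone(), value.clone());
        }
        ToyEdit::Remove { field } => {
            record.remove(field);
        }
    }
}

/// Batch mode: split a whole record into (view, complement).
pub fn batch_get(visible: &BTreeSet<String>, source: &Record) -> (Record, Record) {
    let mut view = Record::new();
    let mut complement = Record::new();
    for (field, value) in source {
        let side = if visible.contains(field) { &mut view } else { &mut complement };
        side.insert(field.clone(), value.clone());
    }
    (view, complement)
}

/// Incremental mode: the complement is state that evolves with each edit.
pub struct ToyEditLens {
    pub visible: BTreeSet<String>,
    pub complement: Record,
}

impl ToyEditLens {
    /// Translate a source edit. Edits to visible fields are forwarded to the
    /// view; edits to hidden fields are absorbed into the complement.
    pub fn get_edit(&mut self, edit: &ToyEdit) -> Option<ToyEdit> {
        let field = match edit {
            ToyEdit::Set { field, .. } | ToyEdit::Remove { field } => field,
        };
        if self.visible.contains(field) {
            Some(edit.clone())
        } else {
            apply_edit(&mut self.complement, edit);
            None
        }
    }
}

/// Consistency check: edit-by-edit translation matches re-running batch get.
pub fn incremental_matches_batch(
    source: &Record,
    visible: &BTreeSet<String>,
    edits: &[ToyEdit],
) -> bool {
    let (mut view, complement) = batch_get(visible, source);
    let mut lens = ToyEditLens { visible: visible.clone(), complement };
    let mut edited_source = source.clone();
    for edit in edits {
        apply_edit(&mut edited_source, edit);
        if let Some(view_edit) = lens.get_edit(edit) {
            apply_edit(&mut view, &view_edit);
        }
    }
    batch_get(visible, &edited_source) == (view, lens.complement)
}
```

The final comparison is the toy analogue of the agreement property: after any edit sequence, the incrementally maintained view and complement equal what batch `get` would produce on the edited source.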
For full details on the EditLens architecture, the five-step EditPipeline, complement state machine transitions, and law verification, see Chapter 26.
10.15 Error handling
The panproto-lens error types cover failures in both the combinator and primitive layers:
- `LensError::Restrict`: the underlying `wtype_restrict` call failed during `get`. This typically indicates a mismatch between the lens’s schemas and the instance.
- `LensError::ComplementMismatch`: during `put`, the complement is inconsistent with the view. This can happen if the complement was generated by a different lens or a different source instance.
- `LensError::FieldNotFound`: a combinator references a field name not present in the schema.
- `LensError::VertexNotFound`: a combinator references a vertex ID not present in the schema.
- `LensError::IncompatibleCoercion`: a `CoerceType` combinator references a kind not present in any vertex.
[1] See the tutorial’s Appendix A.
[2] In dependent type theory, this is \(\Pi\)-elimination.