10  panproto-lens: Bidirectional Lenses

The Complement struct captures everything that a forward projection discards. When get projects a source instance onto a target view, some data is inevitably lost: nodes without counterparts in the target schema, arcs connecting dropped nodes, fans whose children were pruned, and structural decisions made during ancestor contraction. The Complement records all of it.


/// The complement: data discarded by `get`, needed by `put` to restore the
/// original source instance.
///
/// When `get` projects a source instance to a target view, some nodes, arcs,
/// and structural decisions are lost. The complement records all of this so
/// that `put` can reconstruct the full source.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Complement {
    /// Nodes from the source that do not appear in the target view.
    pub dropped_nodes: HashMap<u32, Node>,
    /// Arcs from the source that do not appear in the target view.
    pub dropped_arcs: Vec<(u32, u32, Edge)>,
    /// Fans from the source whose parent or children were dropped during `get`.
    pub dropped_fans: Vec<Fan>,
    /// Resolver decisions made during ancestor contraction.
    pub contraction_choices: HashMap<(u32, u32), Edge>,
    /// Original parent mapping before contraction.
    pub original_parent: HashMap<u32, u32>,
    /// Fingerprint of the source schema at `get` time, used by `put` to
    /// validate that the complement matches the lens's source schema.
    #[serde(default)]
    pub source_fingerprint: u64,
    /// Pre-transform `extra_fields` for nodes that had `field_transforms` applied.
    /// Used by `put` to restore original field values.
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub original_extra_fields: HashMap<u32, HashMap<String, panproto_inst::value::Value>>,
    /// Exact edge used for every arc in the view, keyed by `(parent_id, child_id)`.
    /// This makes `put` deterministic when the source schema has parallel edges
    /// between the same vertex pair, ensuring the cartesian lift is unique.
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub view_edges: HashMap<(u32, u32), Edge>, // NOTE: field name elided in the source text; inferred from the doc comment above
    /// Pre-coercion `node.value` for nodes that had `__value__` field transforms applied.
    /// Used by `put()` to restore the original leaf value.
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub original_values: HashMap<u32, Option<panproto_inst::value::FieldPresence>>,
}

impl Complement {
    /// Create an empty complement (no data discarded).
    #[must_use]
    pub fn empty() -> Self {
        Self {
            dropped_nodes: HashMap::new(),
            dropped_arcs: Vec::new(),
            dropped_fans: Vec::new(),
            contraction_choices: HashMap::new(),
            original_parent: HashMap::new(),
            source_fingerprint: 0, // no schema fingerprinted yet
            original_extra_fields: HashMap::new(),
            view_edges: HashMap::new(),
            original_values: HashMap::new(),
        }
    }
}

Each field records a specific category of lost information, as documented inline above.

A complement is empty when the transformation is lossless (e.g., a pure rename). The is_empty method provides a quick check. When the complement is non-empty, it must be stored alongside the view so that put can later restore the original.

Tip

In production systems, the complement is typically stored alongside the view in a metadata sidecar: a separate column in a database, a companion file, or an opaque blob attached to the record. The complement is serializable via serde, so it can be stored in any format that supports Serialize/Deserialize.

10.1 Forward direction: get

Formally, \(\mathrm{get}: S \to V \times C\) where \(S\) is the source instance type, \(V\) is the view type, and \(C\) is the complement type. The function runs the forward (restrict) pipeline, producing both the projected view and its complement.

Internally, get delegates to wtype_restrict from panproto-inst to compute the view, then calculates the set difference between the source and result to populate each complement field:

  1. View computation: calls wtype_restrict with the lens’s compiled migration. This produces the projected instance under the target schema.
  2. Dropped nodes: iterates over source nodes, collecting any not present in the view.
  3. Dropped arcs: iterates over source arcs, collecting any where either endpoint was dropped.
  4. Contraction choices: for each arc in the view that connects two nodes not directly connected in the source, records the resolver edge that was used.
  5. Original parents: for each surviving node in the view, records what its parent was in the source instance (before contraction may have changed the parent).
  6. Dropped fans: iterates over source fans, collecting any where the parent or a child was dropped.

The resulting (WInstance, Complement) pair contains all the information from the original source, just partitioned into “visible” (the view) and “invisible” (the complement).
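The set-difference pass in steps 2 and 3 above can be sketched on toy types. This is an illustrative sketch, not the crate's API: `NodeId` and string-labeled nodes stand in for the real `Node` and `Edge` types.

```rust
use std::collections::HashMap;

type NodeId = u32;

/// Sketch of the set-difference pass: given source nodes/arcs and the set of
/// node IDs that survived into the view, collect what `get` would record in
/// the complement (dropped nodes and arcs touching them).
fn complement_of(
    source_nodes: &HashMap<NodeId, String>,
    source_arcs: &[(NodeId, NodeId)],
    view_nodes: &[NodeId],
) -> (HashMap<NodeId, String>, Vec<(NodeId, NodeId)>) {
    // Dropped nodes: present in the source but absent from the view.
    let mut dropped_nodes = HashMap::new();
    for (id, node) in source_nodes {
        if !view_nodes.contains(id) {
            dropped_nodes.insert(*id, node.clone());
        }
    }
    // Dropped arcs: either endpoint was dropped.
    let mut dropped_arcs = Vec::new();
    for &(parent, child) in source_arcs {
        if dropped_nodes.contains_key(&parent) || dropped_nodes.contains_key(&child) {
            dropped_arcs.push((parent, child));
        }
    }
    (dropped_nodes, dropped_arcs)
}
```

Together, the surviving view and this difference partition the source with no overlap and no loss.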

10.2 Backward direction: put

Formally, \(\mathrm{put}: V \times C \to S\). The function restores a source instance from a (possibly modified) view and the stored complement. The reconstruction proceeds in stages:

  1. Copy view nodes: all nodes from the view are copied, with their anchors un-remapped from target vertex IDs back to source vertex IDs using the inverse of the compiled migration’s vertex remap.
  2. Re-insert dropped nodes: every node recorded in the complement’s dropped_nodes is inserted back into the node set.
  3. Rebuild arcs from original parents: for each surviving node, the original_parent mapping tells us who its parent was in the source. The function finds the appropriate edge (consulting contraction_choices for disambiguation) and creates an arc.
  4. Re-insert dropped arcs: arcs from the complement are added back, provided both endpoints are present in the restored node set.
  5. Reconstruct fans: view fans are un-remapped; dropped fans whose nodes are all present are re-inserted.

The result is a full source instance that incorporates any edits made to the view while preserving the structural elements that were invisible to the view.
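Stages 1 through 3 of this reconstruction can be sketched on the same kind of toy types. The function below is illustrative only: real `put` also un-remaps anchors and consults contraction_choices, both omitted here.

```rust
use std::collections::HashMap;

type NodeId = u32;

/// Sketch of `put` stages 1-3: merge the (possibly edited) view nodes with
/// the complement's dropped nodes, then rebuild arcs from `original_parent`.
fn restore(
    view_nodes: &HashMap<NodeId, String>,
    dropped_nodes: &HashMap<NodeId, String>,
    original_parent: &HashMap<NodeId, NodeId>,
) -> (HashMap<NodeId, String>, Vec<(NodeId, NodeId)>) {
    // Stages 1-2: start from the complement's dropped nodes, then let the
    // view nodes (which may carry edits) take precedence.
    let mut nodes = dropped_nodes.clone();
    for (id, node) in view_nodes {
        nodes.insert(*id, node.clone());
    }
    // Stage 3: one (parent, child) arc per recorded original parent,
    // provided both endpoints are present in the restored node set.
    let mut arcs = Vec::new();
    for (&child, &parent) in original_parent {
        if nodes.contains_key(&child) && nodes.contains_key(&parent) {
            arcs.push((parent, child));
        }
    }
    arcs.sort_unstable();
    (nodes, arcs)
}
```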

Note

If the view was modified between get and put (e.g., a field value was changed by an API consumer), those modifications are preserved in the restored instance. The complement provides the structure that the view can’t represent; the view provides the content that may have been updated.

10.3 The get/put round-trip

flowchart LR
    S[Source Instance] -->|get| V[View]
    S -->|get| C[Complement]
    V -->|put| R[Restored Instance]
    C -->|put| R
    R ---|should equal| S

The complement acts as a side channel, carrying exactly the information that the view can’t represent. Together, the view and complement contain all the information in the source, just partitioned differently.

Consider a concrete example. Suppose a source schema has fields {name, email, age} and the target schema has only {name, email}. The get direction produces a view with {name, email} and a complement containing the age node, its arc from the parent, and any constraints. If an API consumer updates email in the view, calling put with the modified view and the original complement produces a source instance with the updated email, the original name, and the original age. Nothing is lost.
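This round-trip can be made concrete with a toy lens over plain field maps. The sketch below is not the crate's API: fields stand in for nodes, and the complement is just the set of dropped fields.

```rust
use std::collections::HashMap;

/// Toy `get`: partition the source fields into a view (kept fields) and a
/// complement (everything else).
fn get(
    source: &HashMap<String, String>,
    keep: &[&str],
) -> (HashMap<String, String>, HashMap<String, String>) {
    let mut view = HashMap::new();
    let mut complement = HashMap::new();
    for (field, value) in source {
        if keep.contains(&field.as_str()) {
            view.insert(field.clone(), value.clone());
        } else {
            complement.insert(field.clone(), value.clone());
        }
    }
    (view, complement)
}

/// Toy `put`: reunite the view with the complement. View entries win, so
/// edits made to the view are preserved on restore.
fn put(
    view: &HashMap<String, String>,
    complement: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut source = complement.clone();
    for (field, value) in view {
        source.insert(field.clone(), value.clone());
    }
    source
}
```

With source fields {name, email, age} and a view keeping {name, email}, an unmodified round-trip restores the source exactly, and editing email in the view yields a restored source with the new email and the original age.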

10.4 Lens laws

panproto lenses satisfy two laws. For a tutorial-level explanation, see the tutorial; here we focus on the implementation of the property tests.

GetPut law: \(\mathrm{put}(\mathrm{get}(s).\mathrm{view},\; \mathrm{get}(s).\mathrm{complement}) = s\)

PutGet law: \(\mathrm{get}(\mathrm{put}(v, c)) = (v, c)\)

10.4.1 Property testing

These laws are property-tested via the check_laws function, which:

  1. Generates random source schemas using the protocol’s theory.
  2. Generates random source instances conforming to those schemas.
  3. Constructs lenses from random combinator chains.
  4. Verifies both GetPut and PutGet for each generated triple.

The property tests run as part of the standard cargo test suite and also in the CI pipeline with extended iteration counts.
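The shape of such a property test can be sketched with a toy drop-field lens and a small deterministic generator. Everything here is illustrative: `check_laws` and its schema/instance generators are not shown, and the LCG stands in for a proptest-style source of randomness.

```rust
use std::collections::HashMap;

/// Tiny linear congruential generator used as a stand-in for a real
/// property-testing framework's value generator.
fn lcg(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

/// Toy lens: drop one field; the complement is the removed value (if any).
fn get(src: &HashMap<String, u64>, drop: &str) -> (HashMap<String, u64>, Option<u64>) {
    let mut view = src.clone();
    let complement = view.remove(drop);
    (view, complement)
}

fn put(view: &HashMap<String, u64>, drop: &str, complement: &Option<u64>) -> HashMap<String, u64> {
    let mut src = view.clone();
    if let Some(v) = complement {
        src.insert(drop.to_string(), *v);
    }
    src
}

/// GetPut property: for randomly generated sources, put(get(s)) == s.
fn getput_holds(iterations: u32) -> bool {
    let mut state = 42;
    for _ in 0..iterations {
        let mut src = HashMap::new();
        for key in ["name", "email", "age"] {
            src.insert(key.to_string(), lcg(&mut state));
        }
        let (view, comp) = get(&src, "age");
        if put(&view, "age", &comp) != src {
            return false;
        }
    }
    true
}
```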

Caution: What if a law violation is found?

The check_laws function returns a LawViolation struct with the source instance, the view, the complement, and which law failed. Reproduce the failure with the printed instance, then bisect through the get/put stages to find which stage introduced the discrepancy. Most violations trace to a missing contraction_choices entry or an incorrect original_parent mapping.

10.5 Cambria-style combinators

Rather than constructing Migration objects and compiling them by hand, the combinator API lets you describe schema transformations as a chain of atomic operations. Each combinator represents a single, well-understood schema change.

Warning

The combinator semantics described below are implemented as elementary protolens constructors in protolens::elementary (e.g., elementary::rename_sort, elementary::drop_sort, elementary::add_sort). See Section 19.7 for the full list.

// Conceptual combinator enum (implemented via protolens::elementary)
pub enum Combinator {
    RenameField { old: String, new: String },
    AddField { name: String, kind: String, default: Value },
    RemoveField { name: String },
    WrapInObject { field_name: String },
    HoistField { host: String, field: String },
    CoerceType { from_kind: String, to_kind: String },
    Compose(Box<Combinator>, Box<Combinator>),
}

10.5.1 Combinator semantics

Each combinator has a precise effect on the schema and a defined impact on the complement:

Combinator    Effect                                Lossless?  Complement Impact
RenameField   Changes an edge label                 Yes        Empty complement
AddField      Introduces a new vertex with default  Yes        Default recorded
RemoveField   Drops a vertex and incident edges     No         Dropped nodes/arcs stored
WrapInObject  Inserts intermediate object vertex    No         Re-parenting tracked
HoistField    Moves nested field to grandparent     No         Original topology tracked
CoerceType    Changes kind of matching vertices     Depends    Kind mapping stored if lossy
Compose       Sequential composition                Depends    Union of sub-complements

RenameField is the simplest combinator: it changes an edge’s name field from old to new. No data is lost because the edge still connects the same vertices. The complement is empty.

AddField introduces a new vertex (with a specified kind and default value) connected to the root by a new edge. During get, the new field appears in the view with its default value. During put, if the view’s value differs from the default, that value is preserved.

RemoveField drops a vertex and all edges incident on it. The complement stores the dropped node and its arcs. This is the canonical lossy combinator.

WrapInObject creates a new intermediate object vertex and re-parents existing children of the root under it. The complement tracks the original parent-child relationships.

HoistField moves a field from a nested object up to its grandparent. The complement tracks the original nesting so put can restore it.

CoerceType changes the kind of all vertices matching from_kind to to_kind. Whether this is lossless depends on the kinds involved. string to text might be lossless; float to integer is not.

Compose enables sequential composition of two combinators within the enum itself, providing a recursive structure for complex transformations.

10.5.2 Naming combinators

In addition to the six original combinators, seven naming combinators operate on the nine naming sites described in Section 18.2:

Combinator            Site            Effect
RenameVertex          VertexId        Rename vertex ID, cascading to all edges, constraints, variants
RenameKind            VertexKind      Change a single vertex’s kind
RenameEdgeKind        EdgeKind        Rename edge kind across all matching edges
RenameNsid            Nsid            Change the NSID on a specific vertex
RenameConstraintSort  ConstraintSort  Rename constraint sort across all constraints
ApplyTheoryMorphism   Multiple        Apply theory-level sort/op renames (morphism tower)
Rename                Any             Unified dispatcher for any NameSite

All naming combinators are lossless (empty complement). RenameVertex is the most complex: it cascades to edges, constraints, required sets, variants, recursion points, spans, hyper-edges, and nominal markers. ApplyTheoryMorphism uses TheoryMorphism::induce_schema_renames() to cascade sort renames to vertex kinds and op renames to edge kinds.

In build_compiled_migration, RenameVertex and Rename { site: VertexId, .. } produce vertex_remap and edge_remap entries. All other naming combinators produce no structural migration impact; they change schema metadata only.

10.5.3 Building a lens from combinators

The from_combinators function takes a source schema and a slice of combinators, applies each in sequence, and produces a fully compiled Lens:

  1. For each combinator, derive the next schema by applying the atomic transformation.
  2. Build a compiled migration for that step.
  3. Compose the step migration with the accumulated result.
  4. Return a Lens containing the composed compiled migration, the original source schema, and the final target schema.

You never need to manually construct vertex maps, edge maps, or resolvers when using combinators. The target schema and migration are derived automatically from the combinator chain.
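Step 1 of this loop, deriving each successive schema, can be sketched on a toy schema represented as a map from field name to kind. The `Comb` enum below is illustrative; migration compilation and composition (steps 2 and 3) are omitted.

```rust
use std::collections::HashMap;

/// Illustrative stand-in for a few combinators (not the crate's enum).
enum Comb {
    Rename { old: String, new: String },
    Add { name: String, kind: String },
    Remove { name: String },
}

/// Apply one combinator's atomic schema transformation.
fn apply(schema: &mut HashMap<String, String>, c: &Comb) {
    match c {
        Comb::Rename { old, new } => {
            // Re-key the field, preserving its kind.
            if let Some(kind) = schema.remove(old) {
                schema.insert(new.clone(), kind);
            }
        }
        Comb::Add { name, kind } => {
            schema.insert(name.clone(), kind.clone());
        }
        Comb::Remove { name } => {
            schema.remove(name);
        }
    }
}

/// The step-1 fold: derive the target schema by applying each combinator in
/// sequence to the source schema.
fn derive_target(
    source: &HashMap<String, String>,
    chain: &[Comb],
) -> HashMap<String, String> {
    let mut schema = source.clone();
    for c in chain {
        apply(&mut schema, c);
    }
    schema
}
```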

Tip

Combinators are inspired by the Cambria lens language (Clarke 2020). panproto’s combinators operate on the graph-level schema representation rather than on JSON paths, making them protocol-agnostic. The same combinator chain works for JSON Schema, Protobuf, SQL, or any other supported protocol.

10.6 Complement storage strategies

In production deployments, the complement must be persisted alongside the view. Common strategies:

  • Database column: store the serialized complement as a JSONB or BLOB column alongside the view data. This is the simplest approach and works well when the complement is small relative to the view.
  • Companion file: for file-based storage (e.g., Parquet datasets), store the complement in a sidecar file with a .complement.json extension.
  • Metadata header: for protocols with metadata support (e.g., HTTP responses), encode the complement in a custom header or trailer.
  • Separate table: for relational databases, store complements in a dedicated table keyed by record ID and schema version. This avoids schema modifications to the main table.

The Complement type derives Serialize and Deserialize, so any serde-compatible format works: JSON, MessagePack, CBOR, etc.

10.7 Interaction with breaking change detection

Lenses and breaking change detection (Chapter 11) are complementary:

  • A RemoveField combinator always produces a non-empty complement. The breaking change detector (panproto-check) will classify the corresponding schema change as breaking (vertex removal).
  • A RenameField combinator produces an empty complement. The breaking change detector will see the rename as a vertex removal plus vertex addition, classifying the removal as breaking. To avoid this, use the migration engine’s rename support instead.
  • A CoerceType combinator may or may not be breaking depending on the kinds involved. The breaking change detector classifies all kind changes as breaking, which aligns with the conservative interpretation.

In practice, you can use panproto-check to decide whether a lens is necessary: if the compatibility report says “fully compatible,” a migration suffices. If it says “breaking,” a lens with complement tracking is the appropriate tool.

Caution: When should you use auto_generate vs. manual combinator chains?

Use auto_generate when the schema diff is simple (renames, drops, additions) and you want a quick lens without manual work. Use manual combinator chains when you need precise control over contraction resolvers, non-obvious coercions, or multi-step restructurings. auto_generate calls diff_to_lens internally, which may not handle complex restructurings the way you want.

10.8 Protolens modules

The panproto-lens crate includes a protolens layer that lifts concrete lenses to schema-parameterized families. Operating at the theory level, it derives lens families that can be instantiated against any schema conforming to the theory.

10.8.1 protolens.rs

File: crates/panproto-lens/src/protolens.rs

Exports:

  • Protolens: a schema-parameterized lens. Given any schema that satisfies a precondition, it produces a concrete lens. Categorically, this is a natural transformation between theory endofunctors.1
  • ProtolensChain: a sequence of protolenses for vertical composition.
  • ComplementConstructor: a type describing how the complement varies with the schema.
  • vertical_compose / horizontal_compose: composition operations.
  • elementary::*: nine atomic protolens constructors (add_sort, drop_sort, rename_sort, add_op, drop_op, rename_op, add_equation, drop_equation, pullback).

The Protolens::instantiate method is the complete operation: given a schema and protocol, it applies the source and target transformations to compute source and target schemas, derives a CompiledMigration between them, and returns a concrete Lens.2

For full details, see Chapter 19.

10.8.2 Relationship to cambria-style combinators

The conceptual Cambria-style combinators described above (rename, add, remove, etc.) are implemented as elementary protolens constructors. Each combinator maps to a corresponding protolens::elementary function that operates at the theory level, producing a Protolens that can be instantiated against any schema satisfying its precondition. The diff_to_protolens module provides automated derivation of protolens chains from schema diffs.

10.9 Optic classification

File: crates/panproto-lens/src/optic.rs

Not every schema change is a lens. The OpticKind enum classifies protolens chains into five optic types, enabling complement optimization:

Kind       Complement                   When
Iso        Unit (none needed)           All-rename chains, lossless coercions
Lens       Dropped data                 Any sort/op drops, additions with defaults
Prism      Variant tag                  Accessing a field in a union type
Affine     (variant tag, dropped data)  Lens composed with Prism
Traversal  Position list                Mapping over array elements
The OpticKind::compose method implements the standard optics lattice composition rules: Iso is the identity element, Traversal absorbs everything, and mixing Lens with Prism yields Affine.
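A sketch of that lattice, assuming exactly the composition rules stated above (the real OpticKind::compose may differ in details):

```rust
/// Illustrative optic classification lattice.
#[derive(Clone, Copy, PartialEq, Debug)]
enum OpticKind {
    Iso,
    Lens,
    Prism,
    Affine,
    Traversal,
}

/// Compose two optic kinds per the lattice rules: Iso is the identity,
/// Traversal absorbs everything, and mixing Lens with Prism yields Affine.
fn compose(a: OpticKind, b: OpticKind) -> OpticKind {
    use OpticKind::*;
    match (a, b) {
        (Iso, x) | (x, Iso) => x,                     // identity element
        (Traversal, _) | (_, Traversal) => Traversal, // absorbing element
        (Lens, Lens) => Lens,
        (Prism, Prism) => Prism,
        _ => Affine, // any Lens/Prism/Affine mixture
    }
}
```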

The classify_transform function maps each TheoryTransform to its optic kind. When a protolens chain is classified as Iso, the engine skips complement computation entirely. This is a significant optimization for the common case where schema changes are renames and lossless coercions.

10.10 Symbolic simplification

File: crates/panproto-lens/src/symbolic.rs

Protolens chains composed from multiple schema migrations often contain redundant steps. The simplify_steps function applies algebraic rewrite rules in a fixpoint loop:

  1. Inverse cancellation: rename(A,B) followed by rename(B,A) is removed.
  2. Rename fusion: rename(A,B) followed by rename(B,C) becomes rename(A,C).
  3. Add-drop cancellation: add(X) followed by drop(X) is removed.

A chain composed from three separate schema migrations might have 15 steps, many of which cancel or fuse. Symbolic simplification reduces this to the minimal equivalent chain before any data is touched.
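The three rewrite rules and the fixpoint loop can be sketched on toy steps. The `Step` enum and the function below are illustrative; the real simplify_steps operates on protolens steps.

```rust
/// Illustrative chain steps (stand-ins for real protolens steps).
#[derive(Clone, PartialEq, Debug)]
enum Step {
    Rename(String, String),
    Add(String),
    Drop(String),
}

/// Apply the three rewrite rules in a fixpoint loop: keep rewriting until a
/// full pass makes no change.
fn simplify(mut steps: Vec<Step>) -> Vec<Step> {
    loop {
        let mut changed = false;
        let mut out: Vec<Step> = Vec::new();
        for step in steps {
            match (out.last().cloned(), step) {
                // Rule 1 (inverse cancellation): rename(A,B); rename(B,A) => nothing
                (Some(Step::Rename(a, b)), Step::Rename(b2, a2)) if b == b2 && a == a2 => {
                    out.pop();
                    changed = true;
                }
                // Rule 2 (rename fusion): rename(A,B); rename(B,C) => rename(A,C)
                (Some(Step::Rename(a, b)), Step::Rename(b2, c)) if b == b2 => {
                    out.pop();
                    out.push(Step::Rename(a, c));
                    changed = true;
                }
                // Rule 3 (add-drop cancellation): add(X); drop(X) => nothing
                (Some(Step::Add(x)), Step::Drop(x2)) if x == x2 => {
                    out.pop();
                    changed = true;
                }
                (_, s) => out.push(s),
            }
        }
        steps = out;
        if !changed {
            return steps;
        }
    }
}
```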

10.11 Module map

Module             Exports                                                          Purpose
asymmetric         Complement, get, put                                             Core lens primitives with complement tracking
auto_lens          auto_generate, AutoLensConfig                                    Automatic lens generation from schema pairs
compose            compose                                                          Composition of lenses for multi-step transformations
cost               complement_cost, chain_cost                                      Lawvere metric cost computation
diff_to_protolens  diff_to_protolens, diff_to_lens                                  Derive protolens chains from schema diffs
graph              LensGraph                                                        Weighted lens graph with Floyd-Warshall shortest paths
laws               check_laws                                                       Property-based testing of GetPut and PutGet laws
protolens          Protolens, ProtolensChain, ComplementConstructor, elementary::*  Schema-parameterized lens families
symmetric          SymmetricLens                                                    Symmetric lens with bidirectional complements
error              LensError                                                        Error types for lens operations

The crate root re-exports Lens, Complement, get, put, Protolens, ProtolensChain, ComplementConstructor, auto_generate, and the composition functions.

The cost and graph modules implement a Lawvere metric on protolens chains. Each ComplementConstructor has a numeric cost; LensGraph uses Floyd-Warshall to find the cheapest conversion path between any two schemas. For full details, see Chapter 25.

10.12 When to use lenses vs. migrations

Scenario                     Use Migration            Use Lens
Rename a field               Yes (bijective)          Yes (but unnecessary overhead)
Reorder fields               Yes (bijective)          Yes (but unnecessary overhead)
Remove a field               No (not invertible)      Yes (complement stores removed data)
Add a required field         Possible (with default)  Yes (complement stores default)
Change a field’s type        No (not invertible)      Yes (if coercion is defined)
Wrap fields in a new object  No (not invertible)      Yes (complement tracks re-parenting)
Multi-step evolution         Compose migrations       Chain combinators

As a rule of thumb: if panproto-mig’s invert would succeed on your migration, use a migration. If it would fail, use a lens.

10.13 Performance considerations

Lens operations are slightly more expensive than plain migrations because of the complement computation:

  • get performs one wtype_restrict call (same cost as lift_wtype) plus a set-difference pass over the source instance to compute the complement. The overhead is \(O(|S|)\) where \(S\) is the source node count.
  • put performs a reconstruction pass that is \(O(|V| + |C|)\) where \(V\) is view node count and \(C\) is complement node count, with hash-map lookups for un-remapping and contraction choice disambiguation.
  • Complement serialization depends on the complement size, which is proportional to the amount of data dropped during get.

For lossless transformations (empty complement), get and put have essentially the same cost as a migration lift and its inverse.

In batch scenarios where the same lens is applied to many instances, construct the lens once and reuse it. The Lens struct is Clone and can be shared across threads via Arc<Lens>. The compiled migration inside the lens is immutable after construction, so concurrent get calls are safe without synchronization.
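The sharing pattern can be sketched as follows, with a `ToyLens` standing in for the real Lens struct (which the text above describes as Clone and safe to share behind Arc):

```rust
use std::sync::Arc;
use std::thread;

/// Illustrative stand-in for an immutable, compiled lens.
#[derive(Clone)]
struct ToyLens {
    dropped_field: String,
}

impl ToyLens {
    /// Toy forward direction: project away the dropped field.
    fn get(&self, fields: &[String]) -> Vec<String> {
        fields
            .iter()
            .filter(|f| **f != self.dropped_field)
            .cloned()
            .collect()
    }
}

/// Apply one shared lens to many batches concurrently. Each worker gets a
/// cheap Arc clone; no synchronization is needed because the lens is
/// immutable after construction.
fn parallel_get(lens: Arc<ToyLens>, batches: Vec<Vec<String>>) -> Vec<Vec<String>> {
    let handles: Vec<_> = batches
        .into_iter()
        .map(|batch| {
            let lens = Arc::clone(&lens);
            thread::spawn(move || lens.get(&batch))
        })
        .collect();
    // Join in spawn order, so results line up with the input batches.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```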

The complement size is the primary cost driver. A lens that drops a single leaf field produces a small complement (one node, one arc). A lens that removes an entire subtree produces a complement proportional to the subtree size. When designing lens-based APIs, consider whether the complement can be compressed or whether a more targeted lens (removing only the specific fields needed) would reduce overhead.

10.14 Batch mode and incremental mode

The panproto-lens crate supports two modes of operation. Batch mode (the Lens struct with get/put) operates on whole instances: the forward direction projects an entire source instance to a view and complement, and the backward direction restores the source from a modified view and stored complement. This is the mode used by VCS data migration (schema data migrate), format conversion, and law verification.

Incremental mode (the EditLens struct with get_edit/put_edit) operates on individual edits (TreeEdit values). Each edit flows through the lens, and the complement updates as a state machine. This mode is used by schema data sync --edits for live synchronization scenarios where re-migrating the entire data set on every edit would be impractical.

Both modes share the same compiled migration, vertex/edge remap tables, and complement structure. EditLens wraps a Lens and adds the incremental translation pipeline. The edit lens laws (Consistency and Complement coherence) guarantee that the incremental mode agrees with the batch mode.

For full details on the EditLens architecture, the five-step EditPipeline, complement state machine transitions, and law verification, see Chapter 26.

10.15 Error handling

The panproto-lens error types cover failures in both the combinator and primitive layers:

  • LensError::Restrict: the underlying wtype_restrict call failed during get. This typically indicates a mismatch between the lens’s schemas and the instance.
  • LensError::ComplementMismatch: during put, the complement is inconsistent with the view. This can happen if the complement was generated by a different lens or a different source instance.
  • LensError::FieldNotFound: a combinator references a field name not present in the schema.
  • LensError::VertexNotFound: a combinator references a vertex ID not present in the schema.
  • LensError::IncompatibleCoercion: a CoerceType combinator references a kind not present in any vertex.

Clarke, Bryce. 2020. “Internal Lenses as Functors and Cofunctors.” Applied Category Theory 2019, Electronic Proceedings in Theoretical Computer Science, vol. 323: 183–95. https://doi.org/10.4204/EPTCS.323.13.

  1. See the tutorial’s Appendix A.↩︎

  2. In dependent type theory, this is \(\Pi\)-elimination.↩︎