10  panproto-lens: Bidirectional Lenses

The Complement struct captures everything that a forward projection discards. When get projects a source instance onto a target view, some data is inevitably lost: nodes without counterparts in the target schema, arcs connecting dropped nodes, fans whose children were pruned, and structural decisions made during ancestor contraction. The Complement records all of it.


/// The complement: data discarded by `get`, needed by `put` to restore the
/// original source instance.
///
/// When `get` projects a source instance to a target view, some nodes, arcs,
/// and structural decisions are lost. The complement records all of this so
/// that `put` can reconstruct the full source.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Complement {
    /// Nodes from the source that do not appear in the target view.
    pub dropped_nodes: HashMap<u32, Node>,
    /// Arcs from the source that do not appear in the target view.
    pub dropped_arcs: Vec<(u32, u32, Edge)>,
    /// Fans from the source whose parent or children were dropped during `get`.
    pub dropped_fans: Vec<Fan>,
    /// Resolver decisions made during ancestor contraction.
    pub contraction_choices: HashMap<(u32, u32), Edge>,
    /// Original parent mapping before contraction.
    pub original_parent: HashMap<u32, u32>,
    /// Fingerprint of the source schema at `get` time, used by `put` to
    /// validate that the complement matches the lens's source schema.
    #[serde(default)]
    pub source_fingerprint: u64,
    /// Pre-transform `extra_fields` for nodes that had `field_transforms` applied.
    /// Used by `put` to restore original field values.
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub original_extra_fields: HashMap<u32, HashMap<String, panproto_inst::value::Value>>,
    /// Exact edge used for every arc in the view, keyed by `(parent_id, child_id)`.
    /// This makes `put` deterministic when the source schema has parallel edges
    /// between the same vertex pair, ensuring the cartesian lift is unique.
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub view_edges: HashMap<(u32, u32), Edge>, // NOTE: field name elided in the source text; inferred from the doc comment above
    /// Pre-coercion `node.value` for nodes that had `__value__` field transforms applied.
    /// Used by `put()` to restore the original leaf value.
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub original_values: HashMap<u32, Option<panproto_inst::value::FieldPresence>>,
}

impl Complement {
    /// Create an empty complement (no data discarded).
    #[must_use]
    pub fn empty() -> Self {
        Self {
            dropped_nodes: HashMap::new(),
            dropped_arcs: Vec::new(),
            dropped_fans: Vec::new(),
            contraction_choices: HashMap::new(),
            original_parent: HashMap::new(),
            source_fingerprint: 0, // no schema fingerprinted yet
            original_extra_fields: HashMap::new(),
            view_edges: HashMap::new(),
            original_values: HashMap::new(),
        }
    }
}

Each field records a specific category of lost information, as documented inline above.

A complement is empty when the transformation is lossless (e.g., a pure rename). The is_empty method provides a quick check. When the complement is non-empty, it must be stored alongside the view so that put can later restore the original.

Tip

In production systems, the complement is typically stored alongside the view in a metadata sidecar: a separate column in a database, a companion file, or an opaque blob attached to the record. The complement is serializable via serde, so it can be stored in any format that supports Serialize/Deserialize.

10.1 Forward direction: get

Formally, \(\mathrm{get}: S \to V \times C\) where \(S\) is the source instance type, \(V\) is the view type, and \(C\) is the complement type. The function runs the forward (restrict) pipeline, producing both the projected view and its complement.

Internally, get delegates to wtype_restrict from panproto-inst to compute the view, then calculates the set difference between the source and result to populate each complement field:

  1. View computation: calls wtype_restrict with the lens’s compiled migration. This produces the projected instance under the target schema.
  2. Dropped nodes: iterates over source nodes, collecting any not present in the view.
  3. Dropped arcs: iterates over source arcs, collecting any where either endpoint was dropped.
  4. Contraction choices: for each arc in the view that connects two nodes not directly connected in the source, records the resolver edge that was used.
  5. Original parents: for each surviving node in the view, records what its parent was in the source instance (before contraction may have changed the parent).
  6. Dropped fans: iterates over source fans, collecting any where the parent or a child was dropped.

The resulting (WInstance, Complement) pair contains all the information from the original source, just partitioned into “visible” (the view) and “invisible” (the complement).
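The set-difference pass in steps 2 and 3 above can be sketched on toy types. This is an illustrative sketch, not the crate's API: `NodeId` and string-labeled nodes stand in for the real `Node` and `Edge` types.

```rust
use std::collections::HashMap;

type NodeId = u32;

/// Sketch of the set-difference pass: given source nodes/arcs and the set of
/// node IDs that survived into the view, collect what `get` would record in
/// the complement (dropped nodes and arcs touching them).
fn complement_of(
    source_nodes: &HashMap<NodeId, String>,
    source_arcs: &[(NodeId, NodeId)],
    view_nodes: &[NodeId],
) -> (HashMap<NodeId, String>, Vec<(NodeId, NodeId)>) {
    // Dropped nodes: present in the source but absent from the view.
    let mut dropped_nodes = HashMap::new();
    for (id, node) in source_nodes {
        if !view_nodes.contains(id) {
            dropped_nodes.insert(*id, node.clone());
        }
    }
    // Dropped arcs: either endpoint was dropped.
    let mut dropped_arcs = Vec::new();
    for &(parent, child) in source_arcs {
        if dropped_nodes.contains_key(&parent) || dropped_nodes.contains_key(&child) {
            dropped_arcs.push((parent, child));
        }
    }
    (dropped_nodes, dropped_arcs)
}
```

Together, the surviving view and this difference partition the source with no overlap and no loss.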

10.2 Backward direction: put

Formally, \(\mathrm{put}: V \times C \to S\). The function restores a source instance from a (possibly modified) view and the stored complement. The reconstruction proceeds in stages:

  1. Copy view nodes: all nodes from the view are copied, with their anchors un-remapped from target vertex IDs back to source vertex IDs using the inverse of the compiled migration’s vertex remap.
  2. Re-insert dropped nodes: every node recorded in the complement’s dropped_nodes is inserted back into the node set.
  3. Rebuild arcs from original parents: for each surviving node, the original_parent mapping tells us who its parent was in the source. The function finds the appropriate edge (consulting contraction_choices for disambiguation) and creates an arc.
  4. Re-insert dropped arcs: arcs from the complement are added back, provided both endpoints are present in the restored node set.
  5. Reconstruct fans: view fans are un-remapped; dropped fans whose nodes are all present are re-inserted.

The result is a full source instance that incorporates any edits made to the view while preserving the structural elements that were invisible to the view.
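Stages 1 through 3 of this reconstruction can be sketched on the same kind of toy types. The function below is illustrative only: real `put` also un-remaps anchors and consults contraction_choices, both omitted here.

```rust
use std::collections::HashMap;

type NodeId = u32;

/// Sketch of `put` stages 1-3: merge the (possibly edited) view nodes with
/// the complement's dropped nodes, then rebuild arcs from `original_parent`.
fn restore(
    view_nodes: &HashMap<NodeId, String>,
    dropped_nodes: &HashMap<NodeId, String>,
    original_parent: &HashMap<NodeId, NodeId>,
) -> (HashMap<NodeId, String>, Vec<(NodeId, NodeId)>) {
    // Stages 1-2: start from the complement's dropped nodes, then let the
    // view nodes (which may carry edits) take precedence.
    let mut nodes = dropped_nodes.clone();
    for (id, node) in view_nodes {
        nodes.insert(*id, node.clone());
    }
    // Stage 3: one (parent, child) arc per recorded original parent,
    // provided both endpoints are present in the restored node set.
    let mut arcs = Vec::new();
    for (&child, &parent) in original_parent {
        if nodes.contains_key(&child) && nodes.contains_key(&parent) {
            arcs.push((parent, child));
        }
    }
    arcs.sort_unstable();
    (nodes, arcs)
}
```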

Note

If the view was modified between get and put (e.g., a field value was changed by an API consumer), those modifications are preserved in the restored instance. The complement provides the structure that the view can’t represent; the view provides the content that may have been updated.

10.3 The get/put round-trip

flowchart LR
    S[Source Instance] -->|get| V[View]
    S -->|get| C[Complement]
    V -->|put| R[Restored Instance]
    C -->|put| R
    R ---|should equal| S

The complement acts as a side channel, carrying exactly the information that the view can’t represent. Together, the view and complement contain all the information in the source, just partitioned differently.

Consider a concrete example. Suppose a source schema has fields {name, email, age} and the target schema has only {name, email}. The get direction produces a view with {name, email} and a complement containing the age node, its arc from the parent, and any constraints. If an API consumer updates email in the view, calling put with the modified view and the original complement produces a source instance with the updated email, the original name, and the original age. Nothing is lost.
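This round-trip can be made concrete with a toy lens over plain field maps. The sketch below is not the crate's API: fields stand in for nodes, and the complement is just the set of dropped fields.

```rust
use std::collections::HashMap;

/// Toy `get`: partition the source fields into a view (kept fields) and a
/// complement (everything else).
fn get(
    source: &HashMap<String, String>,
    keep: &[&str],
) -> (HashMap<String, String>, HashMap<String, String>) {
    let mut view = HashMap::new();
    let mut complement = HashMap::new();
    for (field, value) in source {
        if keep.contains(&field.as_str()) {
            view.insert(field.clone(), value.clone());
        } else {
            complement.insert(field.clone(), value.clone());
        }
    }
    (view, complement)
}

/// Toy `put`: reunite the view with the complement. View entries win, so
/// edits made to the view are preserved on restore.
fn put(
    view: &HashMap<String, String>,
    complement: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut source = complement.clone();
    for (field, value) in view {
        source.insert(field.clone(), value.clone());
    }
    source
}
```

With source fields {name, email, age} and a view keeping {name, email}, an unmodified round-trip restores the source exactly, and editing email in the view yields a restored source with the new email and the original age.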

10.4 Lens laws

panproto lenses satisfy two laws. For a tutorial-level explanation, see the tutorial; here we focus on the implementation of the property tests.

GetPut law: \(\mathrm{put}(\mathrm{get}(s).\mathrm{view},\; \mathrm{get}(s).\mathrm{complement}) = s\)

PutGet law: \(\mathrm{get}(\mathrm{put}(v, c)) = (v, c)\)

10.4.1 Property testing

These laws are property-tested via the check_laws function, which:

  1. Generates random source schemas using the protocol’s theory.
  2. Generates random source instances conforming to those schemas.
  3. Constructs lenses from random combinator chains.
  4. Verifies both GetPut and PutGet for each generated triple.

The property tests run as part of the standard cargo test suite and also in the CI pipeline with extended iteration counts.
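The shape of such a property test can be sketched with a toy drop-field lens and a small deterministic generator. Everything here is illustrative: `check_laws` and its schema/instance generators are not shown, and the LCG stands in for a proptest-style source of randomness.

```rust
use std::collections::HashMap;

/// Tiny linear congruential generator used as a stand-in for a real
/// property-testing framework's value generator.
fn lcg(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

/// Toy lens: drop one field; the complement is the removed value (if any).
fn get(src: &HashMap<String, u64>, drop: &str) -> (HashMap<String, u64>, Option<u64>) {
    let mut view = src.clone();
    let complement = view.remove(drop);
    (view, complement)
}

fn put(view: &HashMap<String, u64>, drop: &str, complement: &Option<u64>) -> HashMap<String, u64> {
    let mut src = view.clone();
    if let Some(v) = complement {
        src.insert(drop.to_string(), *v);
    }
    src
}

/// GetPut property: for randomly generated sources, put(get(s)) == s.
fn getput_holds(iterations: u32) -> bool {
    let mut state = 42;
    for _ in 0..iterations {
        let mut src = HashMap::new();
        for key in ["name", "email", "age"] {
            src.insert(key.to_string(), lcg(&mut state));
        }
        let (view, comp) = get(&src, "age");
        if put(&view, "age", &comp) != src {
            return false;
        }
    }
    true
}
```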

Caution: What if a law violation is found?

The check_laws function returns a LawViolation struct with the source instance, the view, the complement, and which law failed. Reproduce the failure with the printed instance, then bisect through the get/put stages to find which stage introduced the discrepancy. Most violations trace to a missing contraction_choices entry or an incorrect original_parent mapping.

10.5 Cambria-style combinators

Rather than constructing Migration objects and compiling them by hand, the combinator API lets you describe schema transformations as a chain of atomic operations. Each combinator represents a single, well-understood schema change.

Warning

The combinator semantics described below are implemented as elementary protolens constructors in protolens::elementary (e.g., elementary::rename_sort, elementary::drop_sort, elementary::add_sort). See Section 19.7 for the full list.

// Conceptual combinator enum (implemented via protolens::elementary)
pub enum Combinator {
    RenameField { old: String, new: String },
    AddField { name: String, kind: String, default: Value },
    RemoveField { name: String },
    WrapInObject { field_name: String },
    HoistField { host: String, field: String },
    CoerceType { from_kind: String, to_kind: String },
    Compose(Box<Combinator>, Box<Combinator>),
}

10.5.1 Combinator semantics

Each combinator has a precise effect on the schema and a defined impact on the complement:

Combinator    Effect                                Lossless?  Complement Impact
RenameField   Changes an edge label                 Yes        Empty complement
AddField      Introduces a new vertex with default  Yes        Default recorded
RemoveField   Drops a vertex and incident edges     No         Dropped nodes/arcs stored
WrapInObject  Inserts intermediate object vertex    No         Re-parenting tracked
HoistField    Moves nested field to grandparent     No         Original topology tracked
CoerceType    Changes kind of matching vertices     Depends    Kind mapping stored if lossy
Compose       Sequential composition                Depends    Union of sub-complements

RenameField is the simplest combinator: it changes an edge’s name field from old to new. No data is lost because the edge still connects the same vertices. The complement is empty.

AddField introduces a new vertex (with a specified kind and default value) connected to the root by a new edge. During get, the new field appears in the view with its default value. During put, if the view’s value differs from the default, that value is preserved.

RemoveField drops a vertex and all edges incident on it. The complement stores the dropped node and its arcs. This is the canonical lossy combinator.

WrapInObject creates a new intermediate object vertex and re-parents existing children of the root under it. The complement tracks the original parent-child relationships.

HoistField moves a field from a nested object up to its grandparent. The complement tracks the original nesting so put can restore it.

CoerceType changes the kind of all vertices matching from_kind to to_kind. Whether this is lossless depends on the kinds involved. string to text might be lossless; float to integer is not.

Compose enables sequential composition of two combinators within the enum itself, providing a recursive structure for complex transformations.

10.5.2 Naming combinators

In addition to the six original combinators, seven naming combinators operate on the nine naming sites described in Section 18.2:

Combinator            Site            Effect
RenameVertex          VertexId        Rename vertex ID, cascading to all edges, constraints, variants
RenameKind            VertexKind      Change a single vertex’s kind
RenameEdgeKind        EdgeKind        Rename edge kind across all matching edges
RenameNsid            Nsid            Change the NSID on a specific vertex
RenameConstraintSort  ConstraintSort  Rename constraint sort across all constraints
ApplyTheoryMorphism   Multiple        Apply theory-level sort/op renames (morphism tower)
Rename                Any             Unified dispatcher for any NameSite

All naming combinators are lossless (empty complement). RenameVertex is the most complex: it cascades to edges, constraints, required sets, variants, recursion points, spans, hyper-edges, and nominal markers. ApplyTheoryMorphism uses TheoryMorphism::induce_schema_renames() to cascade sort renames to vertex kinds and op renames to edge kinds.

In build_compiled_migration, RenameVertex and Rename { site: VertexId, .. } produce vertex_remap and edge_remap entries. All other naming combinators produce no structural migration impact; they change schema metadata only.

10.5.3 Building a lens from combinators

The from_combinators function takes a source schema and a slice of combinators, applies each in sequence, and produces a fully compiled Lens:

  1. For each combinator, derive the next schema by applying the atomic transformation.
  2. Build a compiled migration for that step.
  3. Compose the step migration with the accumulated result.
  4. Return a Lens containing the composed compiled migration, the original source schema, and the final target schema.

You never need to manually construct vertex maps, edge maps, or resolvers when using combinators. The target schema and migration are derived automatically from the combinator chain.
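Step 1 of this loop, deriving each successive schema, can be sketched on a toy schema represented as a map from field name to kind. The `Comb` enum below is illustrative; migration compilation and composition (steps 2 and 3) are omitted.

```rust
use std::collections::HashMap;

/// Illustrative stand-in for a few combinators (not the crate's enum).
enum Comb {
    Rename { old: String, new: String },
    Add { name: String, kind: String },
    Remove { name: String },
}

/// Apply one combinator's atomic schema transformation.
fn apply(schema: &mut HashMap<String, String>, c: &Comb) {
    match c {
        Comb::Rename { old, new } => {
            // Re-key the field, preserving its kind.
            if let Some(kind) = schema.remove(old) {
                schema.insert(new.clone(), kind);
            }
        }
        Comb::Add { name, kind } => {
            schema.insert(name.clone(), kind.clone());
        }
        Comb::Remove { name } => {
            schema.remove(name);
        }
    }
}

/// The step-1 fold: derive the target schema by applying each combinator in
/// sequence to the source schema.
fn derive_target(
    source: &HashMap<String, String>,
    chain: &[Comb],
) -> HashMap<String, String> {
    let mut schema = source.clone();
    for c in chain {
        apply(&mut schema, c);
    }
    schema
}
```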

Tip

Combinators are inspired by the Cambria lens language (Clarke 2020). panproto’s combinators operate on the graph-level schema representation rather than on JSON paths, making them protocol-agnostic. The same combinator chain works for JSON Schema, Protobuf, SQL, or any other supported protocol.

10.6 Complement storage strategies

In production deployments, the complement must be persisted alongside the view. Common strategies:

  • Database column: store the serialized complement as a JSONB or BLOB column alongside the view data. This is the simplest approach and works well when the complement is small relative to the view.
  • Companion file: for file-based storage (e.g., Parquet datasets), store the complement in a sidecar file with a .complement.json extension.
  • Metadata header: for protocols with metadata support (e.g., HTTP responses), encode the complement in a custom header or trailer.
  • Separate table: for relational databases, store complements in a dedicated table keyed by record ID and schema version. This avoids schema modifications to the main table.

The Complement type derives Serialize and Deserialize, so any serde-compatible format works: JSON, MessagePack, CBOR, etc.

10.7 Interaction with breaking change detection

Lenses and breaking change detection (Chapter 11) are complementary:

  • A RemoveField combinator always produces a non-empty complement. The breaking change detector (panproto-check) will classify the corresponding schema change as breaking (vertex removal).
  • A RenameField combinator produces an empty complement. The breaking change detector will see the rename as a vertex removal plus vertex addition, classifying the removal as breaking. To avoid this, use the migration engine’s rename support instead.
  • A CoerceType combinator may or may not be breaking depending on the kinds involved. The breaking change detector classifies all kind changes as breaking, which aligns with the conservative interpretation.

In practice, you can use panproto-check to decide whether a lens is necessary: if the compatibility report says “fully compatible,” a migration suffices. If it says “breaking,” a lens with complement tracking is the appropriate tool.

Caution: When should you use auto_generate vs. manual combinator chains?

Use auto_generate when the schema diff is simple (renames, drops, additions) and you want a quick lens without manual work. Use manual combinator chains when you need precise control over contraction resolvers, non-obvious coercions, or multi-step restructurings. auto_generate calls diff_to_lens internally, which may not handle complex restructurings the way you want.

10.8 Protolens modules

The panproto-lens crate includes a protolens layer that lifts concrete lenses to schema-parameterized families. Operating at the theory level, it derives lens families that can be instantiated against any schema conforming to the theory.

10.8.1 protolens.rs

File: crates/panproto-lens/src/protolens.rs

Exports:

  • Protolens: a schema-parameterized lens. Given any schema that satisfies a precondition, it produces a concrete lens. Categorically, this is a natural transformation between theory endofunctors.1
  • ProtolensChain: a sequence of protolenses for vertical composition.
  • ComplementConstructor: a type describing how the complement varies with the schema.
  • vertical_compose / horizontal_compose: composition operations.
  • elementary::*: nine atomic protolens constructors (add_sort, drop_sort, rename_sort, add_op, drop_op, rename_op, add_equation, drop_equation, pullback).

The Protolens::instantiate method is the complete operation: given a schema and protocol, it applies the source and target transformations to compute source and target schemas, derives a CompiledMigration between them, and returns a concrete Lens.2

For full details, see Chapter 19.

10.8.2 Relationship to cambria-style combinators

The conceptual Cambria-style combinators described above (rename, add, remove, etc.) are implemented as elementary protolens constructors. Each combinator maps to a corresponding protolens::elementary function that operates at the theory level, producing a Protolens that can be instantiated against any schema satisfying its precondition. The diff_to_protolens module provides automated derivation of protolens chains from schema diffs.

10.9 Optic classification

File: crates/panproto-lens/src/optic.rs

Not every schema change is a lens. The OpticKind enum classifies protolens chains into five optic types, enabling complement optimization:

Kind       Complement                   When
Iso        Unit (none needed)           All-rename chains, lossless coercions
Lens       Dropped data                 Any sort/op drops, additions with defaults
Prism      Variant tag                  Accessing a field in a union type
Affine     (variant tag, dropped data)  Lens composed with Prism
Traversal  Position list                Mapping over array elements
The OpticKind::compose method implements the standard optics lattice composition rules: Iso is the identity element, Traversal absorbs everything, and mixing Lens with Prism yields Affine.
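A sketch of that lattice, assuming exactly the composition rules stated above (the real OpticKind::compose may differ in details):

```rust
/// Illustrative optic classification lattice.
#[derive(Clone, Copy, PartialEq, Debug)]
enum OpticKind {
    Iso,
    Lens,
    Prism,
    Affine,
    Traversal,
}

/// Compose two optic kinds per the lattice rules: Iso is the identity,
/// Traversal absorbs everything, and mixing Lens with Prism yields Affine.
fn compose(a: OpticKind, b: OpticKind) -> OpticKind {
    use OpticKind::*;
    match (a, b) {
        (Iso, x) | (x, Iso) => x,                     // identity element
        (Traversal, _) | (_, Traversal) => Traversal, // absorbing element
        (Lens, Lens) => Lens,
        (Prism, Prism) => Prism,
        _ => Affine, // any Lens/Prism/Affine mixture
    }
}
```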

The classify_transform function maps each TheoryTransform to its optic kind. When a protolens chain is classified as Iso, the engine skips complement computation entirely. This is a significant optimization for the common case where schema changes are renames and lossless coercions.

10.10 Symbolic simplification

File: crates/panproto-lens/src/symbolic.rs

Protolens chains composed from multiple schema migrations often contain redundant steps. The simplify_steps function applies algebraic rewrite rules in a fixpoint loop:

  1. Inverse cancellation: rename(A,B) followed by rename(B,A) is removed.
  2. Rename fusion: rename(A,B) followed by rename(B,C) becomes rename(A,C).
  3. Add-drop cancellation: add(X) followed by drop(X) is removed.

A chain composed from three separate schema migrations might have 15 steps, many of which cancel or fuse. Symbolic simplification reduces this to the minimal equivalent chain before any data is touched.
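The three rewrite rules and the fixpoint loop can be sketched on toy steps. The `Step` enum and the function below are illustrative; the real simplify_steps operates on protolens steps.

```rust
/// Illustrative chain steps (stand-ins for real protolens steps).
#[derive(Clone, PartialEq, Debug)]
enum Step {
    Rename(String, String),
    Add(String),
    Drop(String),
}

/// Apply the three rewrite rules in a fixpoint loop: keep rewriting until a
/// full pass makes no change.
fn simplify(mut steps: Vec<Step>) -> Vec<Step> {
    loop {
        let mut changed = false;
        let mut out: Vec<Step> = Vec::new();
        for step in steps {
            match (out.last().cloned(), step) {
                // Rule 1 (inverse cancellation): rename(A,B); rename(B,A) => nothing
                (Some(Step::Rename(a, b)), Step::Rename(b2, a2)) if b == b2 && a == a2 => {
                    out.pop();
                    changed = true;
                }
                // Rule 2 (rename fusion): rename(A,B); rename(B,C) => rename(A,C)
                (Some(Step::Rename(a, b)), Step::Rename(b2, c)) if b == b2 => {
                    out.pop();
                    out.push(Step::Rename(a, c));
                    changed = true;
                }
                // Rule 3 (add-drop cancellation): add(X); drop(X) => nothing
                (Some(Step::Add(x)), Step::Drop(x2)) if x == x2 => {
                    out.pop();
                    changed = true;
                }
                (_, s) => out.push(s),
            }
        }
        steps = out;
        if !changed {
            return steps;
        }
    }
}
```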

10.11 Module map

Module             Exports                                                          Purpose
asymmetric         Complement, get, put                                             Core lens primitives with complement tracking
auto_lens          auto_generate, AutoLensConfig                                    Automatic lens generation from schema pairs
compose            compose                                                          Composition of lenses for multi-step transformations
cost               complement_cost, chain_cost                                      Lawvere metric cost computation
diff_to_protolens  diff_to_protolens, diff_to_lens                                  Derive protolens chains from schema diffs
graph              LensGraph                                                        Weighted lens graph with Floyd-Warshall shortest paths
laws               check_laws                                                       Property-based testing of GetPut and PutGet laws
protolens          Protolens, ProtolensChain, ComplementConstructor, elementary::*  Schema-parameterized lens families
symmetric          SymmetricLens                                                    Symmetric lens with bidirectional complements
error              LensError                                                        Error types for lens operations

The crate root re-exports Lens, Complement, get, put, Protolens, ProtolensChain, ComplementConstructor, auto_generate, and the composition functions.

The cost and graph modules implement a Lawvere metric on protolens chains. Each ComplementConstructor has a numeric cost; LensGraph uses Floyd-Warshall to find the cheapest conversion path between any two schemas. For full details, see Chapter 25.

10.12 When to use lenses vs. migrations

Scenario                     Use Migration            Use Lens
Rename a field               Yes (bijective)          Yes (but unnecessary overhead)
Reorder fields               Yes (bijective)          Yes (but unnecessary overhead)
Remove a field               No (not invertible)      Yes (complement stores removed data)
Add a required field         Possible (with default)  Yes (complement stores default)
Change a field’s type        No (not invertible)      Yes (if coercion is defined)
Wrap fields in a new object  No (not invertible)      Yes (complement tracks re-parenting)
Multi-step evolution         Compose migrations       Chain combinators

As a rule of thumb: if panproto-mig’s invert would succeed on your migration, use a migration. If it would fail, use a lens.

10.13 Performance considerations

Lens operations are slightly more expensive than plain migrations because of the complement computation:

  • get performs one wtype_restrict call (same cost as lift_wtype) plus a set-difference pass over the source instance to compute the complement. The overhead is \(O(|S|)\) where \(S\) is the source node count.
  • put performs a reconstruction pass that is \(O(|V| + |C|)\) where \(V\) is view node count and \(C\) is complement node count, with hash-map lookups for un-remapping and contraction choice disambiguation.
  • Complement serialization depends on the complement size, which is proportional to the amount of data dropped during get.

For lossless transformations (empty complement), get and put have essentially the same cost as a migration lift and its inverse.

In batch scenarios where the same lens is applied to many instances, construct the lens once and reuse it. The Lens struct is Clone and can be shared across threads via Arc<Lens>. The compiled migration inside the lens is immutable after construction, so concurrent get calls are safe without synchronization.
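The sharing pattern can be sketched as follows, with a `ToyLens` standing in for the real Lens struct (which the text above describes as Clone and safe to share behind Arc):

```rust
use std::sync::Arc;
use std::thread;

/// Illustrative stand-in for an immutable, compiled lens.
#[derive(Clone)]
struct ToyLens {
    dropped_field: String,
}

impl ToyLens {
    /// Toy forward direction: project away the dropped field.
    fn get(&self, fields: &[String]) -> Vec<String> {
        fields
            .iter()
            .filter(|f| **f != self.dropped_field)
            .cloned()
            .collect()
    }
}

/// Apply one shared lens to many batches concurrently. Each worker gets a
/// cheap Arc clone; no synchronization is needed because the lens is
/// immutable after construction.
fn parallel_get(lens: Arc<ToyLens>, batches: Vec<Vec<String>>) -> Vec<Vec<String>> {
    let handles: Vec<_> = batches
        .into_iter()
        .map(|batch| {
            let lens = Arc::clone(&lens);
            thread::spawn(move || lens.get(&batch))
        })
        .collect();
    // Join in spawn order, so results line up with the input batches.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```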

The complement size is the primary cost driver. A lens that drops a single leaf field produces a small complement (one node, one arc). A lens that removes an entire subtree produces a complement proportional to the subtree size. When designing lens-based APIs, consider whether the complement can be compressed or whether a more targeted lens (removing only the specific fields needed) would reduce overhead.

10.14 Batch mode and incremental mode

The panproto-lens crate supports two modes of operation. Batch mode (the Lens struct with get/put) operates on whole instances: the forward direction projects an entire source instance to a view and complement, and the backward direction restores the source from a modified view and stored complement. This is the mode used by VCS data migration (schema data migrate), format conversion, and law verification.

Incremental mode (the EditLens struct with get_edit/put_edit) operates on individual edits (TreeEdit values). Each edit flows through the lens, and the complement updates as a state machine. This mode is used by schema data sync --edits for live synchronization scenarios where re-migrating the entire data set on every edit would be impractical.

Both modes share the same compiled migration, vertex/edge remap tables, and complement structure. EditLens wraps a Lens and adds the incremental translation pipeline. The edit lens laws (Consistency and Complement coherence) guarantee that the incremental mode agrees with the batch mode.

For full details on the EditLens architecture, the five-step EditPipeline, complement state machine transitions, and law verification, see Chapter 26.

10.15 Error handling

The panproto-lens error types cover failures in both the combinator and primitive layers:

  • LensError::Restrict: the underlying wtype_restrict call failed during get. This typically indicates a mismatch between the lens’s schemas and the instance.
  • LensError::ComplementMismatch: during put, the complement is inconsistent with the view. This can happen if the complement was generated by a different lens or a different source instance.
  • LensError::FieldNotFound: a combinator references a field name not present in the schema.
  • LensError::VertexNotFound: a combinator references a vertex ID not present in the schema.
  • LensError::IncompatibleCoercion: a CoerceType combinator references a kind not present in any vertex.

Clarke, Bryce. 2020. “Internal Lenses as Functors and Cofunctors.” Applied Category Theory 2019, Electronic Proceedings in Theoretical Computer Science, vol. 323: 183–95. https://doi.org/10.4204/EPTCS.323.13.

  1. See the tutorial’s Appendix A.↩︎

  2. In dependent type theory, this is \(\Pi\)-elimination.↩︎