18 Symmetric Lenses and Schema Merging
Sometimes neither schema is a “view” of the other. An ATProto post and a Mastodon status represent the same concept through different lenses. Two teams fork a schema and evolve independently—both versions are equally valid. A CRM and an ERP system both have “customer” records, but neither is canonical. In these cases, asymmetric lenses from Chapter 7 don’t apply. You need a bidirectional transformation where both sides are peers.
The solution uses an overlap: the largest structure both schemas share. You build a symmetric lens as a span of asymmetric lenses: \(S_1 \xleftarrow{\ell_1} O \xrightarrow{\ell_2} S_2\), where \(O\) is the overlap and \(\ell_1\), \(\ell_2\) are the two legs. Data translates through the overlap, with each side’s unique fields preserved in complements.
18.1 The overlap construction
The overlap schema \(O\) captures the largest common structure between \(S_1\) and \(S_2\). The left leg \(\ell_1\) is an asymmetric lens from \(O\) to \(S_1\). The right leg \(\ell_2\) is an asymmetric lens from \(O\) to \(S_2\).
To translate from \(S_1\) to \(S_2\):
getalong \(\ell_1\): project \(S_1\) data down to the overlap, capturing complement \(c_1\) (everything \(S_1\) has that the overlap does not).putalong \(\ell_2\): lift overlap data up to \(S_2\), using complement \(c_2\) (everything \(S_2\) needs that the overlap does not provide).
To translate back from \(S_2\) to \(S_1\):
getalong \(\ell_2\): project \(S_2\) data down to the overlap, capturing complement \(c_2'\).putalong \(\ell_1\): lift overlap data up to \(S_1\), using complement \(c_1\).
The complements \(c_1\) and \(c_2\) are the “private state” of each side. They persist across sync rounds, accumulating the side-specific data that cannot be translated.
18.2 Building symmetric lenses from protolenses
panproto combines protolenses (Chapter 16) with the span construction, making cross-schema synchronization as automatic as within-schema migration.
The algorithm:
- Discover overlap. Find the largest shared sub-schema \(O\).
- Generate left protolens chain. Auto-generate a protolens chain from \(O\) to \(S_1\) (the additions and restructurings that \(S_1\) adds beyond the overlap).
- Generate right protolens chain. Auto-generate a protolens chain from \(O\) to \(S_2\).
- Instantiate both legs. Produce concrete asymmetric lenses \(\ell_1\) and \(\ell_2\).
- Combine into symmetric lens. Package the two legs and the overlap into a
SymmetricLensHandle.
import { SymmetricLensHandle } from "@panproto/core";
// One call does steps 1-5
const symLens = SymmetricLensHandle.fromSchemas(schemaA, schemaB);
// Translate A -> B
const { data: bData, complement: cA } = symLens.forward(aData);
// Translate B -> A
const { data: aData, complement: cB } = symLens.backward(bData);
// Round-trip: A -> B -> A (using stored complement)
const { data: bModified } = symLens.forward(aData);
// ... modify bModified ...
const { data: aRestored } = symLens.backward(bModified, { complement: cA });In Python:
from panproto import SymmetricLensHandle
# One call does steps 1-5
sym_lens = SymmetricLensHandle.from_schemas(schema_a, schema_b)
# Translate A -> B
b_data, c_a = sym_lens.forward(a_data)
# Translate B -> A
a_data, c_b = sym_lens.backward(b_data)
# Round-trip with complement
b_modified, _ = sym_lens.forward(a_data)
# ... modify b_modified ...
a_restored, _ = sym_lens.backward(b_modified, complement=c_a)The quality of the symmetric lens depends entirely on the quality of the overlap. A small overlap means large complements and more data that cannot be translated. A large overlap means small complements and better interoperability. Inspect the overlap before committing:
const overlap = symLens.overlap;
console.log(`Shared: ${overlap.vertices.length} vertices`);
console.log(`A-only: ${symLens.leftComplement.length} vertices`);
console.log(`B-only: ${symLens.rightComplement.length} vertices`);If two schemas share only a single vertex, the resulting lens is technically valid but nearly useless: almost all data lives in the complements. What’s the minimum overlap size for a meaningful lens?
Technically, any non-empty overlap produces a valid symmetric lens. But a single shared vertex means almost all data lives in the complements, and translation amounts to discarding most of one side and fabricating most of the other from defaults. In practice, the overlap should cover the fields both sides actually exchange. A useful heuristic: if the overlap contains less than a third of either schema’s vertices, the lens will produce mostly default-filled translations, and a manual mapping may be more appropriate.
18.3 Worked example: synchronizing diverged schemas
Two teams start from the same user schema and evolve independently for three months.
18.3.1 Team A’s schema
{
"vertices": ["user", "handle", "email", "createdAt", "role"],
"edges": [
{ "from": "user", "to": "handle", "kind": "prop" },
{ "from": "user", "to": "email", "kind": "prop" },
{ "from": "user", "to": "createdAt", "kind": "prop" },
{ "from": "user", "to": "role", "kind": "prop" }
]
}Team A added role (admin, user, moderator).
18.3.2 Team B’s schema
{
"vertices": ["user", "handle", "emailAddress", "created_at", "avatarUrl"],
"edges": [
{ "from": "user", "to": "handle", "kind": "prop" },
{ "from": "user", "to": "emailAddress", "kind": "prop" },
{ "from": "user", "to": "created_at", "kind": "prop" },
{ "from": "user", "to": "avatarUrl", "kind": "prop" }
]
}Team B renamed email to emailAddress, createdAt to created_at, and added avatarUrl.
18.3.3 Building the symmetric lens
const symLens = SymmetricLensHandle.fromSchemas(teamASchema, teamBSchema);
console.log(symLens.overlap);
// Overlap: { vertices: ["user", "handle"], edges: [user->handle] }
// Renames detected: email<->emailAddress, createdAt<->created_at
console.log(symLens.leftOnly); // ["role"]
console.log(symLens.rightOnly); // ["avatarUrl"]panproto detects the renames (email / emailAddress and createdAt / created_at) via morphism discovery scoring. These are included in the overlap with rename protolenses in each leg.
18.3.4 Synchronizing data
const teamAUser = {
handle: "alice",
email: "alice@example.com",
createdAt: "2024-06-01",
role: "admin",
};
// Translate to Team B's schema
const { data: teamBUser, complement } = symLens.forward(teamAUser);
// teamBUser = {
// handle: "alice",
// emailAddress: "alice@example.com",
// created_at: "2024-06-01",
// avatarUrl: null, // default: Team A doesn't have this
// }
// complement stores: { role: "admin" }
// Team B modifies the data
teamBUser.avatarUrl = "https://example.com/alice.png";
teamBUser.emailAddress = "alice@newdomain.com";
// Translate back to Team A's schema
const { data: syncedA } = symLens.backward(teamBUser, { complement });
// syncedA = {
// handle: "alice",
// email: "alice@newdomain.com", // updated!
// createdAt: "2024-06-01",
// role: "admin", // restored from complement
// }The email change propagated through the symmetric lens (via the overlap). The role field survived the round-trip via the complement. The avatarUrl was lost going back to Team A’s schema (it lives in Team B’s complement, not Team A’s).
18.3.5 Merging the schemas
If the teams decide to reunify, the merged schema gives them the combined result.1
const mergedSchema = symLens.mergedSchema;
// vertices: ["user", "handle", "email", "createdAt", "role", "avatarUrl"]
// (renames resolved: Team A's names used as canonical, with aliases)The merged schema is the smallest schema containing everything from both sides, with the overlap identified. Any schema that both \(S_1\) and \(S_2\) map into factors through the merged schema.2
18.4 Multi-schema documents
Some applications need a document to satisfy multiple schemas simultaneously. A social media post might need to be valid ATProto and valid ActivityPub. A data record might conform to both the CRM schema and the ERP schema.
Symmetric lenses enable multi-schema documents through the overlap:
// Create a document that satisfies both schemas
const symLens = SymmetricLensHandle.fromSchemas(atprotoSchema, activityPubSchema);
// Start from ATProto data
const atprotoPost = { text: "Hello", createdAt: "2025-01-01T00:00:00Z", likeCount: 42 };
// Get the overlap view (what both schemas agree on)
const { data: overlapData } = symLens.toOverlap(atprotoPost, "left");
// Build both representations from the overlap
const { data: atprotoView } = symLens.fromOverlap(overlapData, "left", {
defaults: { likeCount: 0 },
});
const { data: activityPubView } = symLens.fromOverlap(overlapData, "right", {
defaults: { published: new Date().toISOString() },
});The overlap acts as a lingua franca: a minimal representation that both schemas can project to and reconstruct from.
18.5 Conflict detection and three-way merge
When both sides of a symmetric lens are modified independently, conflicts arise. Two users edit the same post (one via ATProto, one via ActivityPub) and both change the text field.
panproto’s symmetric lenses handle conflicts through a three-way merge rooted at the overlap:
- Project both sides to the overlap. Get \(O_1\) from modified \(S_1\) data and \(O_2\) from modified \(S_2\) data.
- Diff against the base. Compare \(O_1\) and \(O_2\) against the last-synced overlap state \(O_{\text{base}}\).
- Merge. For each field in the overlap:
- If only one side changed it, take that change.
- If both sides changed it to the same value, take either (they agree).
- If both sides changed it to different values, report a conflict.
const result = symLens.sync(modifiedA, modifiedB, { base: lastSyncedOverlap });
if (result.conflicts.length > 0) {
console.log("Conflicts detected:");
for (const c of result.conflicts) {
console.log(` ${c.field}: "${c.leftValue}" vs "${c.rightValue}"`);
}
// Resolve manually
result.resolve(c.field, c.leftValue); // or c.rightValue
}
const { left: syncedA, right: syncedB } = result.apply();Fields outside the overlap are never in conflict: they belong exclusively to one side and are preserved in that side’s complement.
panproto detects conflicts at the schema level: two different values for the same overlapping field. It doesn’t resolve semantic conflicts (e.g., “is this rename intentional?”). Semantic resolution is an application-level concern. panproto gives you the machinery to detect and surface conflicts; your application decides what to do.
If Team A removes a vertex that Team B renamed, neither change touches the overlap directly, yet the resulting schemas are incompatible. How does panproto detect these “structural divergence” conflicts?
panproto detects structural divergence by comparing the two schemas against their last common ancestor (the merge base in the VCS DAG). If Team A removed a vertex that Team B renamed, the three-way diff from the merge base shows both a deletion and a rename on the same element. This is a DeleteModifyVertex conflict, reported via the same typed conflict system used by schema merge. These conflicts live in the schema structure itself, not the overlap, and are caught before symmetric lens construction begins.
18.6 Lifting across protocols
A protolens defined for one protocol can be lifted to another protocol via a theory morphism. If you have a renameVertex("author", "creator") protolens for ATProto schemas and a theory morphism from ATProto to SQL, liftProtolens produces an equivalent protolens that operates on SQL schemas.
import { liftProtolens, liftChain } from "@panproto/core";
// A protolens that renames Vertex -> Node in ATProto
const atprotoRename = Protolens.renameVertex("Vertex", "Node");
// A theory morphism from ATProto to SQL
const morphism = getTheoryMorphism("atproto", "sql");
// Lift the protolens to SQL
const sqlRename = liftProtolens(atprotoRename, morphism);
// sqlRename renames the SQL-side image of "Vertex" to the SQL-side image of "Node"Lifting works on entire chains:
const atprotoChain = ProtolensChainHandle.autoGenerate(oldAtproto, newAtproto);
const sqlChain = liftChain(atprotoChain, morphism);
const sqlLens = sqlChain.instantiate(sqlSchema);from panproto import lift_protolens, lift_chain
atproto_chain = ProtolensChainHandle.auto_generate(old_atproto, new_atproto)
sql_chain = lift_chain(atproto_chain, morphism)
sql_lens = sql_chain.instantiate(sql_schema)The lift composes the protolens’s endofunctor transforms with a Pullback(morphism) step: sort and operation references in preconditions are renamed through the morphism’s sort and op maps. Complement constructors are preserved; a DataCaptured complement remains DataCaptured after lifting.
From the CLI:
schema lens-lift chain.json atproto-to-sql.json --jsonWhen you lift a protolens chain from ATProto to SQL, the complements are “preserved.” But the SQL-side schema may have different cardinalities for the affected sorts. Can the lifted chain produce larger (or smaller) complements than the original?
The structural transformation is preserved, but the cardinalities can differ. If an ATProto schema has 3 edges on the dropped sort and the corresponding SQL schema (after theory morphism application) has 7 columns mapping to that sort, the lifted chain captures 7 columns instead of 3 edges. The complement structure (a DataCaptured entry) is the same, but the complement size scales with the target schema’s cardinality. Lifting preserves the transformation’s semantics, not its byte count.
The merged schema is the pushout \(S_1 +_O S_2\) in the categorical sense: the smallest schema containing both \(S_1\) and \(S_2\), with the overlap identified. See Appendix A.↩︎
This universal property makes the merged schema canonical rather than ad hoc.↩︎