22 Value-Dependent Migration

You encounter a heading element: it carries the level as a vertex name (h1, h2, h3). Later, you want a single heading vertex with a level attribute. The structural transform to rename is trivial. But where does the level attribute come from? It comes from what the data says—the old vertex name itself. None of panproto’s structural schema operations can express that relationship.

FieldTransform and conditional_survival form the value-level layer. They operate on the node’s field environment at runtime, reading what the data contains and transforming fields accordingly.

22.1 Why structural transforms are not enough

Consider the heading problem in detail. Your source schema has separate vertices for each level: h1, h2, h3. You want to consolidate them into a single heading vertex with an integer level attribute.

The structural part is easy—rename all three to heading. But now every node reads as heading with no level. The missing piece is value-dependent: you need the original vertex name to compute the level value.

Here is a second pattern. Document formats sometimes encode semantic roles in CSS classes. An a element with class: "u-url" is a microformat URL link; one with class: "u-email" is for email. To normalize to a single link vertex type, you must keep only elements whose class contains a recognized token. Whether a node survives depends on what its class field actually holds.

FieldTransform handles the first case (computed fields based on data values). conditional_survival handles the second (dropping nodes that don’t match a value-based predicate).

22.2 PathTransform: navigating nested structures

A node’s extra_fields is a flat map. But the values themselves can be maps (nested objects), forming a tree. Some formats store attributes in nested objects: { "attrs": { "level": 2, "style": "bold" } }.

PathTransform applies any field transform at a specified path within this tree:

// rename "old_attr" -> "new_attr" inside the "attrs" nested object
migration.add_path_transform(
    "heading",
    &["attrs"],
    FieldTransform::RenameField {
        old_key: "old_attr".to_owned(),
        new_key: "new_attr".to_owned(),
    },
);

The path is a sequence of keys to navigate. &["attrs"] means: find the attrs field, treat it as a map, and apply the inner transform to that map’s fields. Deeper paths are supported: &["attrs", "styles"] navigates two levels.

An empty path applies the inner transform directly to the node’s own extra_fields.

PathTransform composes with all other variants. You can nest a Case inside a PathTransform for conditional logic at nested levels, or nest a ComputeField inside PathTransform to compute values using both top-level fields and the nested subtree.
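The navigation rule can be modelled in plain Rust. The sketch below is not panproto's implementation: the `Field` enum, `apply_at_path`, and `rename_key` are hypothetical names introduced only to show how a path of keys selects the nested map that an inner transform then edits, and how an empty path falls back to the top-level map.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a field value; not panproto's `Value` type.
#[derive(Debug, Clone, PartialEq)]
enum Field {
    Int(i64),
    Map(BTreeMap<String, Field>),
}

/// Walk `path` through nested maps and apply `f` to the map found at the end.
/// Returns false (leaving the data untouched) if a key is missing or not a map.
fn apply_at_path<F>(fields: &mut BTreeMap<String, Field>, path: &[&str], f: F) -> bool
where
    F: FnOnce(&mut BTreeMap<String, Field>),
{
    match path.split_first() {
        // Empty path: the transform applies to the current map itself.
        None => {
            f(fields);
            true
        }
        Some((head, rest)) => match fields.get_mut(*head) {
            Some(Field::Map(inner)) => apply_at_path(inner, rest, f),
            _ => false,
        },
    }
}

/// Rename a key inside one map, if the old key is present.
fn rename_key(map: &mut BTreeMap<String, Field>, old_key: &str, new_key: &str) {
    if let Some(value) = map.remove(old_key) {
        map.insert(new_key.to_owned(), value);
    }
}

/// Build `{ "attrs": { "old_attr": 2 } }` and rename the nested key.
fn demo_nested_rename() -> BTreeMap<String, Field> {
    let mut attrs = BTreeMap::new();
    attrs.insert("old_attr".to_owned(), Field::Int(2));
    let mut fields = BTreeMap::new();
    fields.insert("attrs".to_owned(), Field::Map(attrs));
    apply_at_path(&mut fields, &["attrs"], |m| rename_key(m, "old_attr", "new_attr"));
    fields
}
```

In this model a path that cannot be followed is a no-op. Whether panproto reports such a case or silently skips it is not stated here, so treat that choice as an assumption of the sketch.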

22.3 MapReferences: updating vertex-referencing strings

When you rename a vertex, any fields that carry that vertex name as a string go stale. Parent reference arrays in tree-structured formats ("parents": ["root", "body"]) are the canonical example: rename "body" to "main", and those parent reference strings need to follow.

MapReferences applies a rename map to a string field:

use std::collections::HashMap;

let mut rename_map = HashMap::new();
rename_map.insert("body".to_owned(), Some("main".to_owned()));
rename_map.insert("sidebar".to_owned(), None); // drop references to removed vertex

migration.add_map_references("paragraph", "parents", rename_map);

For each paragraph node, the parents field is updated:

- Value::Str("body") becomes Value::Str("main").
- Value::Str("sidebar") is removed (mapped to None).
- String values not in the rename map pass through unchanged.

The transform handles both flat strings and encoded arrays. panproto encodes arrays in extra_fields as Value::Unknown maps with an __array_len sentinel. MapReferences detects this and iterates over array elements, renaming or dropping each one. References mapped to None are removed; the array is compacted and its length updated.

This is the functorial action of vertex rename on the name-reference algebra: every structural rename should be accompanied by MapReferences on any field that carries those names.
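The rename, drop, and pass-through rules can be sketched over an ordinary list of strings. This is a standalone model, not panproto's code: `map_references` here is a hypothetical free function, and a `Vec<String>` stands in for the encoded array.

```rust
use std::collections::HashMap;

/// Apply a rename map to a list of vertex-name references.
/// `Some(new)` renames, `None` drops the reference, and names absent
/// from the map pass through unchanged.
fn map_references(
    refs: &[String],
    rename_map: &HashMap<String, Option<String>>,
) -> Vec<String> {
    refs.iter()
        .filter_map(|name| match rename_map.get(name) {
            Some(Some(new_name)) => Some(new_name.clone()),
            Some(None) => None,
            None => Some(name.clone()),
        })
        .collect()
}

/// Rename "body" to "main" and drop "sidebar" in a parents list.
fn demo_parents() -> Vec<String> {
    let mut rename_map = HashMap::new();
    rename_map.insert("body".to_owned(), Some("main".to_owned()));
    rename_map.insert("sidebar".to_owned(), None);
    let parents = vec![
        "root".to_owned(),
        "body".to_owned(),
        "sidebar".to_owned(),
    ];
    map_references(&parents, &rename_map)
}
```

Because dropped entries are filtered out rather than replaced by a placeholder, the resulting list is already compacted, which mirrors the length update described above.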

22.4 ComputeField: expression-based field computation

ComputeField evaluates an expression against a node’s full field environment and stores the result in a target field. Unlike ApplyExpr (which binds a single field’s value), ComputeField makes all extra_fields available as variables.

The heading level example:

use panproto_expr::{Expr, Literal, BuiltinOp};
use std::sync::Arc;

// compute: (concat "h" (int_to_str level))
let expr = Expr::Builtin(
    BuiltinOp::Concat,
    vec![
        Expr::Lit(Literal::Str("h".to_owned())),
        Expr::Builtin(
            BuiltinOp::IntToStr,
            vec![Expr::Var(Arc::from("level"))],
        ),
    ],
);

migration.add_computed_field("heading", "name", expr);

When a heading node with level: 2 is processed, this expression evaluates to "h2" and stores it in the name field. The variable level is bound from extra_fields["level"].

If level lives in a nested attrs object (extra_fields["attrs"]["level"]), the variable attrs.level is also bound automatically. The same expression then works for the nested case once the variable is written as Expr::Var(Arc::from("attrs.level")).

The evaluator runs the expression with panproto_expr::EvalConfig::default(). If evaluation fails (for instance, because a referenced field is missing), the target field is left unchanged.
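The variable binding and the failure behaviour can be illustrated with a small model. Everything in it is an assumption made for illustration: the `Field` enum, `bind_env`, and `compute_name` are hypothetical, and the dotted-name flattening is one plausible reading of how nested entries such as attrs.level become variables.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a field value; not panproto's `Value` type.
#[derive(Debug, Clone, PartialEq)]
enum Field {
    Int(i64),
    Str(String),
    Map(BTreeMap<String, Field>),
}

/// Bind scalar fields as variables; nested maps contribute dotted names
/// such as "attrs.level".
fn bind_env(fields: &BTreeMap<String, Field>, prefix: &str, env: &mut BTreeMap<String, Field>) {
    for (key, value) in fields {
        let name = if prefix.is_empty() {
            key.clone()
        } else {
            format!("{}.{}", prefix, key)
        };
        match value {
            Field::Map(inner) => bind_env(inner, &name, env),
            other => {
                env.insert(name, other.clone());
            }
        }
    }
}

/// Model of `(concat "h" (int_to_str <level_var>))` stored into "name".
/// Returns false and leaves the fields untouched if the variable is
/// missing or is not an integer.
fn compute_name(fields: &mut BTreeMap<String, Field>, level_var: &str) -> bool {
    let mut env = BTreeMap::new();
    bind_env(fields, "", &mut env);
    match env.get(level_var) {
        Some(Field::Int(level)) => {
            fields.insert("name".to_owned(), Field::Str(format!("h{}", level)));
            true
        }
        _ => false,
    }
}
```

Calling `compute_name(&mut fields, "level")` on a node with a flat level of 2 stores "h2" in name, while asking for a variable that is not bound returns false and changes nothing.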

22.5 Case: conditional transforms based on runtime values

Case is the dependent function space for field transforms. It applies different transforms depending on what the data says. Each branch is a (predicate, transforms) pair. The first branch whose predicate evaluates to true fires; the rest are skipped.

Given branches $(p_1, \vec{f}_1), \ldots, (p_k, \vec{f}_k)$ and environment $e$:

\[\text{Case}(e) = \vec{f}_i(e) \quad \text{where } i = \min\{j : p_j(e) = \text{true}\}\]

If no predicate matches, $e$ passes through unchanged.

The canonical use case is matching against attribute values:

use panproto_expr::{Expr, Literal, BuiltinOp, CaseBranch};
use std::sync::Arc;

let case = FieldTransform::Case {
    branches: vec![
        CaseBranch {
            predicate: Expr::Builtin(
                BuiltinOp::Eq,
                vec![
                    Expr::Var(Arc::from("level")),
                    Expr::Lit(Literal::Int(1)),
                ],
            ),
            transforms: vec![FieldTransform::ComputeField {
                target_key: "name".to_owned(),
                expr: Expr::Lit(Literal::Str("h1".to_owned())),
            }],
        },
        CaseBranch {
            predicate: Expr::Builtin(
                BuiltinOp::Eq,
                vec![
                    Expr::Var(Arc::from("level")),
                    Expr::Lit(Literal::Int(2)),
                ],
            ),
            transforms: vec![FieldTransform::ComputeField {
                target_key: "name".to_owned(),
                expr: Expr::Lit(Literal::Str("h2".to_owned())),
            }],
        },
        // ... and so on for h3
    ],
};

// register the same branches through the builder method
migration.add_case_transform("heading", vec![/* branches above */]);

A node with level: 1 gets name set to "h1". A node with level: 2 gets "h2". A node whose level matches none of the predicates passes through unchanged.

Predicates are evaluated with the same variable environment as ComputeField: all extra_fields and attrs.* entries are bound. The comparison builtins (BuiltinOp::Eq, BuiltinOp::Lt, BuiltinOp::Contains, etc.) are all available.

Case branches can contain any sequence of FieldTransform variants, including nested Case blocks and PathTransform wrappers. The transforms in the matching branch execute in order, and no later branch is consulted.

Exercise: Is Case order-sensitive?

Since Case fires the first matching branch, the order of branches matters. If two predicates overlap (e.g., level < 3 and level == 2), the branch listed first wins. How should you order branches to avoid accidentally shadowing a more specific predicate with a more general one?

Answer

Order branches from most specific to most general: put level == 2 before level < 3. A catch-all branch (one whose predicate always evaluates to true), if present, should always be last, since it matches everything.
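The effect of branch order can be checked with a few lines of plain Rust. This is only a model of first-match selection: `Branch` and `first_match` are hypothetical names, and function pointers stand in for predicate expressions.

```rust
/// One branch: a predicate over the level and the label it would assign.
struct Branch {
    predicate: fn(i64) -> bool,
    label: &'static str,
}

/// Return the label of the first branch whose predicate holds, if any.
/// This is the `min { j : p_j(e) = true }` rule from the definition of Case.
fn first_match(branches: &[Branch], level: i64) -> Option<&'static str> {
    branches
        .iter()
        .find(|branch| (branch.predicate)(level))
        .map(|branch| branch.label)
}

/// General predicate listed first: it shadows the specific one.
fn general_first() -> Vec<Branch> {
    vec![
        Branch { predicate: |level: i64| level < 3, label: "small" },
        Branch { predicate: |level: i64| level == 2, label: "exactly-two" },
    ]
}

/// Specific predicate listed first: both branches remain reachable.
fn specific_first() -> Vec<Branch> {
    vec![
        Branch { predicate: |level: i64| level == 2, label: "exactly-two" },
        Branch { predicate: |level: i64| level < 3, label: "small" },
    ]
}
```

With `general_first()`, a level of 2 selects "small" and the second branch can never fire. With `specific_first()`, level 2 selects "exactly-two", level 1 selects "small", and level 5 matches nothing.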

22.6 ConditionalSurvival: value-dependent vertex survival

The FieldTransform variants operate on nodes already anchored to surviving vertices. conditional_survival adds a second, value-dependent gate before field transforms run.

If a vertex has a conditional survival predicate, panproto evaluates it against the node’s extra_fields. If it returns false, the node is dropped, treated exactly as if its anchor were not in surviving_verts. Its descendants undergo ancestor contraction just as they would for a structurally pruned node.

// keep only "item" nodes where level == 2
let predicate = Expr::Builtin(
    BuiltinOp::Eq,
    vec![
        Expr::Var(Arc::from("level")),
        Expr::Lit(Literal::Int(2)),
    ],
);
migration.add_conditional_survival("item", predicate);

This is the matching pattern at the survival level rather than the transform level. Use add_conditional_survival when you want to keep only nodes of a given type that match a condition. Use add_case_transform when you want all nodes to survive but behave differently depending on their values.

The two interact cleanly: conditional_survival runs first. Nodes that pass then undergo field_transforms in the usual order.

Exercise: What happens to dropped nodes’ descendants?

When conditional_survival drops a node, its children undergo ancestor contraction. If a dropped heading node has inline children (strong, em), do those children get re-parented to the dropped node’s parent, or are they dropped recursively?

Answer

They are re-parented to the dropped node’s nearest surviving ancestor. The conditional survival predicate removes only the node it targets; surviving descendants are preserved by the standard ancestor contraction mechanism.
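Re-parenting to the nearest surviving ancestor can be sketched over a flat node arena. The model below is an assumption-laden illustration, not panproto's contraction code: `Node`, `contract`, and the index-based parent links are all hypothetical.

```rust
/// A node in a flat arena: vertex name, optional level, and parent index.
struct Node {
    anchor: &'static str,
    level: Option<i64>,
    parent: Option<usize>,
}

/// Return each surviving node's index paired with its new parent:
/// the nearest ancestor that also survives, or None at the root.
fn contract(nodes: &[Node], survives: impl Fn(&Node) -> bool) -> Vec<(usize, Option<usize>)> {
    let mut kept = Vec::new();
    for (index, node) in nodes.iter().enumerate() {
        if !survives(node) {
            continue;
        }
        // Climb past dropped ancestors until a survivor or the root is reached.
        let mut parent = node.parent;
        while let Some(p) = parent {
            if survives(&nodes[p]) {
                break;
            }
            parent = nodes[p].parent;
        }
        kept.push((index, parent));
    }
    kept
}

/// doc > heading(level 3) > strong, plus doc > heading(level 2).
/// Headings survive only when level == 2; other vertices always survive.
fn demo_contraction() -> Vec<(usize, Option<usize>)> {
    let nodes = [
        Node { anchor: "doc", level: None, parent: None },
        Node { anchor: "heading", level: Some(3), parent: Some(0) },
        Node { anchor: "strong", level: None, parent: Some(1) },
        Node { anchor: "heading", level: Some(2), parent: Some(0) },
    ];
    contract(&nodes, |n| n.anchor != "heading" || n.level == Some(2))
}
```

In the demo, the level-3 heading at index 1 is dropped and its strong child at index 2 is attached to the doc node at index 0.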

22.7 Worked example: heading levels

Normalizing heading-level encodings is the most common use for value-dependent migration. ProseMirror and Pandoc encode heading level as an integer attribute on a single heading vertex. Older HTML-derived formats use separate h1, h2, h3 vertex types.

Migrating from separate vertices to the unified form requires two steps:

1. Rename h1, h2, h3 all to heading (three RenameVertex transforms at the schema level).
2. Add a level attribute to each node with the correct integer value.

Step 2 is value-dependent: the source vertex name determines the value.

use panproto_expr::{Expr, Literal, BuiltinOp, CaseBranch};
use std::sync::Arc;

// the schema migration renames h1/h2/h3 to heading.
// field transforms add the level attribute before the anchor is remapped,
// so each default is registered against the source vertex name.

// for nodes originally anchored to h1:
migration.add_field_default("h1", "level", Value::Int(1));

// for nodes originally anchored to h2:
migration.add_field_default("h2", "level", Value::Int(2));

// for nodes originally anchored to h3:
migration.add_field_default("h3", "level", Value::Int(3));

Note

add_field_default uses the source vertex name as the key. The transform is registered against "h1", which is the anchor name at the time the node is processed. The anchor is remapped to "heading" after field transforms run, so the lookup is against the original name.
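The ordering described in the note can be made concrete with a toy model. Nothing here is panproto API: `Node`, `apply_default`, and `remap_anchor` are hypothetical, and the sketch only shows why a default keyed by "h2" has to be looked up before the anchor becomes "heading".

```rust
use std::collections::HashMap;

/// Hypothetical node model: an anchor (vertex name) and an optional level.
struct Node {
    anchor: String,
    level: Option<i64>,
}

/// Fill in `level` from a default registered against the current anchor.
fn apply_default(node: &mut Node, defaults: &HashMap<String, i64>) {
    if node.level.is_none() {
        node.level = defaults.get(&node.anchor).copied();
    }
}

/// Remap the anchor if a rename is registered for it.
fn remap_anchor(node: &mut Node, renames: &HashMap<String, String>) {
    if let Some(target) = renames.get(&node.anchor).cloned() {
        node.anchor = target;
    }
}

/// Run both steps on a fresh node, in either order, and return the result.
fn demo_heading(anchor: &str, rename_first: bool) -> (String, Option<i64>) {
    let defaults: HashMap<String, i64> = [("h1", 1), ("h2", 2), ("h3", 3)]
        .iter()
        .map(|(name, level)| (name.to_string(), *level))
        .collect();
    let renames: HashMap<String, String> = ["h1", "h2", "h3"]
        .iter()
        .map(|name| (name.to_string(), "heading".to_owned()))
        .collect();
    let mut node = Node { anchor: anchor.to_owned(), level: None };
    if rename_first {
        // Wrong order for this model: the lookup key is already "heading".
        remap_anchor(&mut node, &renames);
        apply_default(&mut node, &defaults);
    } else {
        // Order described in the note: defaults first, then the rename.
        apply_default(&mut node, &defaults);
        remap_anchor(&mut node, &renames);
    }
    (node.anchor, node.level)
}
```

Running the defaults first turns an h2 node into a heading with level 2. Renaming first leaves the level unset, because no default is registered under "heading".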

In the reverse direction (migrating from the unified form back to separate vertices), the structural transforms map heading back to h1, h2, h3. But how does panproto know which heading node becomes h1 versus h2? It doesn’t, unless you add a ConditionalSurvival predicate for each target vertex. The full bidirectional version requires each target vertex to survive only for the matching level value:

// reverse migration: heading -> h1/h2/h3
// structural: RenameVertex("heading", "h1"), RenameVertex("heading", "h2"), ...
// (this is actually a split, handled by the schema-level SplitVertex transform.)

// value-dependent survival gates each rename:
migration.add_conditional_survival(
    "heading",
    Expr::Builtin(
        BuiltinOp::Eq,
        vec![
            Expr::Var(Arc::from("level")),
            Expr::Lit(Literal::Int(1)),
        ],
    ),
);

Exercise: Can ConditionalSurvival express disjunctions?

The heading example uses one predicate per target vertex. If you needed a single vertex to survive when level == 1 or level == 2 (collapsing h1 and h2 into one target), can you express that with a single ConditionalSurvival predicate, or do you need a different mechanism?

Answer

Yes. The predicate is an arbitrary Expr, so you can use BuiltinOp::Or to combine conditions: Expr::Builtin(BuiltinOp::Or, vec![level_eq_1, level_eq_2]). A single ConditionalSurvival predicate with a disjunction is equivalent to (and simpler than) registering multiple predicates or using a Case transform to achieve the same effect.

22.8 Worked example: CSS class-based matching

The pattern of checking whether a list-valued attribute includes a specific token uses BuiltinOp::Contains. panproto encodes JSON arrays in extra_fields as Value::Unknown maps with an __array_len sentinel. When this map is used as an expression variable, it is serialized as a comma-separated string, making Contains work as a membership test.

Consider a microformat migration: keep only a elements whose class array contains "u-url".

// keep only anchor nodes where class contains "u-url"
let predicate = Expr::Builtin(
    BuiltinOp::Contains,
    vec![
        Expr::Var(Arc::from("class")),
        Expr::Lit(Literal::Str("u-url".to_owned())),
    ],
);
migration.add_conditional_survival("a", predicate);

If a node has class: ["u-url", "external"], the variable class is bound to the string "u-url,external". Contains("u-url,external", "u-url") returns true, and the node survives. A node with class: ["u-email"] fails the predicate and is dropped.

Note

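The comma-separated serialization comes from how value_to_expr_literal converts encoded arrays. Because Contains on the serialized string is a substring test, it behaves as a membership test only when no element contains a comma and no element contains the token as part of a longer name (u-url also matches inside u-url-legacy). For data where either can happen, use a more precise predicate (checking for "u-url," or ",u-url"), or restructure the data before migration.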
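The difference between a substring test and a delimiter-aware test is easy to see in plain Rust. The two helpers below are hypothetical and work on an ordinary slice of strings joined with commas. They model the serialized form only and say nothing about which expression builtins panproto offers for building the stricter check.

```rust
/// Substring test on the comma-joined form, as a model of Contains.
fn contains_naive(classes: &[&str], token: &str) -> bool {
    classes.join(",").contains(token)
}

/// Delimiter-aware variant: wrap both the haystack and the token in commas
/// so that only whole elements can match.
fn contains_delimited(classes: &[&str], token: &str) -> bool {
    let haystack = format!(",{},", classes.join(","));
    let needle = format!(",{},", token);
    haystack.contains(&needle)
}
```

For `["u-url", "external"]` both helpers report a match. For `["u-url-legacy"]` the naive test still matches "u-url", while the delimited test does not. Wrapping both sides in commas is a slightly stricter variant of checking for a leading or trailing comma, chosen here so that first, middle, and last elements are handled uniformly.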

The same Contains predicate can drive Case branches instead of survival. If you want all a elements to survive but behave differently based on which microformat token they carry:

let case = FieldTransform::Case {
    branches: vec![
        CaseBranch {
            predicate: Expr::Builtin(
                BuiltinOp::Contains,
                vec![
                    Expr::Var(Arc::from("class")),
                    Expr::Lit(Literal::Str("u-url".to_owned())),
                ],
            ),
            transforms: vec![FieldTransform::AddField {
                key: "link_type".to_owned(),
                value: Value::Str("url".to_owned()),
            }],
        },
        CaseBranch {
            predicate: Expr::Builtin(
                BuiltinOp::Contains,
                vec![
                    Expr::Var(Arc::from("class")),
                    Expr::Lit(Literal::Str("u-email".to_owned())),
                ],
            ),
            transforms: vec![FieldTransform::AddField {
                key: "link_type".to_owned(),
                value: Value::Str("email".to_owned()),
            }],
        },
    ],
};

migration.add_case_transform("a", vec![/* branches above */]);

Nodes with u-url in their class get link_type: "url". Nodes with u-email get link_type: "email". Nodes matching neither branch are unchanged.
