23 Querying Instances
Every schema instance is a graph: vertices carry data, edges encode relationships. Previous chapters showed how to build instances, version them, and migrate their data. But now you want to ask questions about the data: which posts got the most engagement? Which annotations belong to a particular layer? Which nodes satisfy some computed condition? The query engine answers these questions.
panproto provides two pieces that work together. The first is a Haskell-style expression language for writing predicates, projections, and computed values. The second is a declarative query engine (InstanceQuery) that uses those expressions to select, filter, and reshape nodes from an instance. This chapter covers both.
23.1 The expression language
The expression language is a small functional language designed to feel natural if you’ve written Haskell, ML, or even spreadsheet formulas. It appears everywhere panproto needs to express computation: query predicates, field transforms (as seen in Chapter 22), computed fields, and conditional survival.
23.1.1 Literals
The basics: integers, floats, strings, booleans, and the absent value.
42
3.125
"hello"
True
False
Nothing
Nothing represents a missing or absent value. It is distinct from an empty string or zero.
23.1.2 Variables and field access
Variables are lowercase identifiers. When an expression is evaluated against a node, the node’s fields are bound as variables in the evaluation environment.
x
name
age
Dotted field access reaches into nested structures:
node.name
doc.attrs.level
23.1.3 Arithmetic and comparison
Standard arithmetic operators work on integers and floats:
x + 1
2 * y
a - b
total / count
n mod 3
Comparisons return booleans:
x == 1
age > 18
score <= 100
name /= "admin"
Note that inequality is /= (the Haskell convention), not !=.
23.1.4 Boolean logic
a && b
x || y
not flag
not is a keyword, not a function. && and || are short-circuiting.
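To make the short-circuiting rule concrete, here is a minimal Rust sketch of how an evaluator could treat && and ||. The BoolExpr type, its Probe variant, and the eval function are hypothetical names invented for this illustration; they are not panproto's evaluator.

```rust
use std::cell::Cell;

/// Hypothetical boolean expression tree (NOT panproto's `Expr`).
enum BoolExpr<'a> {
    Const(bool),
    /// A leaf that counts how often it is evaluated, so that
    /// short-circuiting can be observed from the outside.
    Probe(&'a Cell<u32>, bool),
    And(Box<BoolExpr<'a>>, Box<BoolExpr<'a>>),
    Or(Box<BoolExpr<'a>>, Box<BoolExpr<'a>>),
    Not(Box<BoolExpr<'a>>),
}

fn eval(expr: &BoolExpr<'_>) -> bool {
    match expr {
        BoolExpr::Const(value) => *value,
        BoolExpr::Probe(counter, value) => {
            counter.set(counter.get() + 1);
            *value
        }
        // Rust's && and || short-circuit as well, so the right operand is
        // evaluated only when the left operand does not decide the result.
        BoolExpr::And(lhs, rhs) => eval(lhs) && eval(rhs),
        BoolExpr::Or(lhs, rhs) => eval(lhs) || eval(rhs),
        BoolExpr::Not(inner) => !eval(inner),
    }
}
```

Evaluating And(Const(false), Probe(..)) under this sketch returns false without ever touching the probe, which is the behavior the section describes for a && b when a is False.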
23.1.5 String concatenation
The ++ operator concatenates strings (and lists):
"hello" ++ " " ++ "world"
23.1.6 Lambda expressions
Anonymous functions use backslash syntax:
\x -> x + 1
\x y -> x * y + 1
Lambdas are first-class values. You can pass them to map, filter, and other higher-order builtins.
23.1.7 Let bindings
Local bindings with let ... in:
let x = 1 in x + 2
let
  a = 10
  b = 20
in a + b
Indentation-based layout works the same way as in Haskell: the let keyword opens a layout block, and bindings at the same indentation level are grouped together.
23.1.8 Conditionals
if age > 18 then "adult" else "minor"
Both branches are required. The types of the two branches should agree.
23.1.9 Case expressions
Pattern matching on values:
case x of
  True -> 1
  False -> 0
Case expressions also support guards and otherwise:
case level of
  1 -> "heading"
  2 -> "subheading"
  otherwise -> "paragraph"
23.1.10 Records
Record literals use curly braces with = for field bindings:
{ name = "alice", age = 30 }
Record punning lets you omit the value when the field name matches a variable in scope:
let name = "alice"
    age = 30
in { name, age }
This produces the same record as the explicit version.
23.1.11 Lists and comprehensions
List literals:
[1, 2, 3]
["a", "b", "c"]
List comprehensions follow Haskell syntax, with generators (<-) and guards:
[ x + 1 | x <- xs, x > 0 ]
This reads: for each x drawn from xs, if x > 0, yield x + 1. You can combine multiple generators and guards:
[ x ++ y | x <- prefixes, y <- suffixes, length (x ++ y) < 10 ]
23.1.12 Edge traversal
The -> operator, when used between identifiers in a graph context, navigates edges:
doc -> layers -> annotations
This follows the layers edge from doc, then the annotations edge from the result. Edge traversal is the expression-level equivalent of the path field in InstanceQuery (covered below).
23.1.13 Builtin functions
A set of builtin functions is always available:
| Function | Description |
|---|---|
| map f xs | Apply f to every element of xs |
| filter p xs | Keep elements where p returns True |
| fold f z xs | Left fold over xs with initial value z |
| head xs | First element of xs |
| tail xs | All elements of xs except the first |
| length xs | Number of elements in xs |
| concat xss | Flatten a list of lists |
| reverse xs | Reverse a list |
| take n xs | First n elements of xs |
| drop n xs | All but the first n elements |
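For readers who think in Rust, most of the builtins in the table have a close counterpart among the standard iterator adapters. The sketch below is only an analogy: the function names are hypothetical, the element type is fixed to i64 for brevity, and panproto's builtins may behave differently in edge cases (for example, this head returns None on an empty list, which the chapter does not specify).

```rust
/// head xs: first element, or None when the list is empty.
fn head(xs: &[i64]) -> Option<i64> {
    xs.first().copied()
}

/// tail xs: everything except the first element.
fn tail(xs: &[i64]) -> Vec<i64> {
    xs.iter().skip(1).copied().collect()
}

/// take n xs: the first n elements.
fn take(n: usize, xs: &[i64]) -> Vec<i64> {
    xs.iter().take(n).copied().collect()
}

/// drop n xs: all but the first n elements.
fn drop_n(n: usize, xs: &[i64]) -> Vec<i64> {
    xs.iter().skip(n).copied().collect()
}

/// concat xss: flatten a list of lists.
fn concat(xss: &[Vec<i64>]) -> Vec<i64> {
    xss.iter().flatten().copied().collect()
}

/// reverse xs
fn reverse(xs: &[i64]) -> Vec<i64> {
    xs.iter().rev().copied().collect()
}

/// fold f z xs: left fold, combining the accumulator with each element in order.
fn fold_left(f: impl Fn(i64, i64) -> i64, z: i64, xs: &[i64]) -> i64 {
    xs.iter().fold(z, |acc, x| f(acc, *x))
}
```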
These compose naturally with lambdas:
filter (\x -> x > 0) [1, -2, 3, -4, 5]
map (\x -> x * 2) (filter (\x -> x > 0) xs)
fold (\acc x -> acc + x) 0 scores
23.2 The declarative query engine
The query engine operates on WInstance values (W-type instances, the tree-shaped data that conforms to a schema). An InstanceQuery is a declarative description of what you want: which vertex type to start from, which edges to follow, what conditions to check, and which fields to return. The engine executes a fixed pipeline.
23.2.1 Query structure
An InstanceQuery has six fields, all optional except anchor:
| Field | Type | Purpose |
|---|---|---|
| anchor | Name | Which vertex type to select (required) |
| path | [Name] | Edge kinds to traverse before matching |
| predicate | Expr | Boolean expression evaluated per node |
| group_by | String | Field to partition results by |
| project | [String] | Fields to include in each result |
| limit | usize | Maximum number of results |
The execution pipeline runs in this order:
- Anchor selection: find all nodes whose vertex type matches anchor.
- Path navigation: if path is specified, follow edges from the anchored nodes. Each element in path names an edge kind; the engine collects all children reachable via that edge, then continues from those children for the next path element.
- Predicate filtering: evaluate the predicate expression for each candidate node. The node’s extra_fields are bound as variables, plus _anchor (the vertex type) and _id (the node identifier). Only nodes where the predicate evaluates to True survive.
- Limit: truncate to at most limit results.
- Projection: if project is specified, include only those fields in the output. Otherwise, all fields are returned.
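The steps above can be sketched in a few lines of Rust. This is a simplified model under stated assumptions, not panproto's implementation: Node, Query, node, and run are hypothetical names, field values are plain integers, the predicate is an ordinary function pointer rather than an Expr, and group_by is left out because the pipeline description does not say where it applies.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified node (NOT panproto's WInstance representation).
#[derive(Clone, Debug, PartialEq)]
struct Node {
    id: u32,
    anchor: String,
    fields: BTreeMap<String, i64>,
    /// Outgoing edges as (edge kind, child node id).
    edges: Vec<(String, u32)>,
}

/// Hypothetical stand-in for InstanceQuery.
struct Query<'a> {
    anchor: &'a str,
    path: Vec<&'a str>,
    predicate: Option<fn(&Node) -> bool>,
    project: Option<Vec<&'a str>>,
    limit: Option<usize>,
}

/// Convenience constructor for building small test graphs.
fn node(id: u32, anchor: &str, fields: &[(&str, i64)], edges: &[(&str, u32)]) -> Node {
    Node {
        id,
        anchor: anchor.to_string(),
        fields: fields.iter().map(|(k, v)| (k.to_string(), *v)).collect(),
        edges: edges.iter().map(|(k, to)| (k.to_string(), *to)).collect(),
    }
}

fn run(query: &Query<'_>, nodes: &[Node]) -> Vec<Node> {
    // 1. Anchor selection.
    let mut current: Vec<&Node> = nodes.iter().filter(|n| n.anchor == query.anchor).collect();
    // 2. Path navigation: one hop per path element.
    for kind in &query.path {
        let mut next = Vec::new();
        for parent in &current {
            for (edge_kind, child_id) in &parent.edges {
                if edge_kind.as_str() == *kind {
                    next.extend(nodes.iter().filter(|n| n.id == *child_id));
                }
            }
        }
        current = next;
    }
    // 3. Predicate filtering, 4. limit, 5. projection, in that order.
    current
        .into_iter()
        .filter(|n| query.predicate.map_or(true, |p| p(n)))
        .take(query.limit.unwrap_or(usize::MAX))
        .map(|n| {
            let mut out = n.clone();
            if let Some(keep) = &query.project {
                out.fields.retain(|name, _| keep.contains(&name.as_str()));
            }
            out
        })
        .collect()
}
```

Because projection runs last in this sketch, the predicate can read fields that the projection later drops, and the limit counts nodes that passed the predicate rather than all candidates.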
23.2.2 Anchor selection
The simplest possible query selects all nodes of a given vertex type:
use panproto_inst::query::{InstanceQuery, execute};
let query = InstanceQuery {
    anchor: "post".into(),
    ..Default::default()
};
let results = execute(&query, &instance);
// results contains every node anchored to "post"
This is the starting point for every query. If your schema has 200 post nodes and 50 comment nodes, anchoring on "post" gives you the 200 posts.
23.2.3 Predicate filtering
Add a predicate to keep only the nodes that match a condition. The expression is evaluated once per candidate node, with the node’s fields bound as variables.
Find all posts with more than 10 likes:
use panproto_expr::{Expr, Literal, BuiltinOp};
let query = InstanceQuery {
    anchor: "post".into(),
    predicate: Some(Expr::Builtin(
        BuiltinOp::Gt,
        vec![
            Expr::Var("likes".into()),
            Expr::Lit(Literal::Int(10)),
        ],
    )),
    ..Default::default()
};
let results = execute(&query, &instance);
In the surface syntax, the predicate is simply likes > 10. The Rust API requires building the AST explicitly, but the intent is the same: for each post node, check whether its likes field exceeds 10.
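To illustrate what "evaluated once per candidate node, with the node's fields bound as variables" means, the following sketch evaluates the same AST shape against a map of fields. The Expr, BuiltinOp, and Literal types here are local stand-ins that only imitate the constructor names used above; they are not the panproto_expr definitions, and the real evaluator's error handling is not described in this chapter.

```rust
use std::collections::HashMap;

// Local stand-ins that imitate the constructor names used in this chapter.
// They are NOT the panproto_expr definitions.
#[derive(Clone, Debug, PartialEq)]
enum Literal {
    Int(i64),
    Bool(bool),
    Str(String),
}

#[derive(Clone, Copy, Debug)]
enum BuiltinOp {
    Gt,
    Eq,
}

#[derive(Clone, Debug)]
enum Expr {
    Var(String),
    Lit(Literal),
    Builtin(BuiltinOp, Vec<Expr>),
}

/// Evaluate `expr` with a node's fields bound as variables.
/// Returns None for unbound variables, wrong arity, or mismatched types.
fn eval(expr: &Expr, fields: &HashMap<String, Literal>) -> Option<Literal> {
    match expr {
        Expr::Var(name) => fields.get(name).cloned(),
        Expr::Lit(lit) => Some(lit.clone()),
        Expr::Builtin(op, args) => {
            if args.len() != 2 {
                return None;
            }
            let lhs = eval(&args[0], fields)?;
            let rhs = eval(&args[1], fields)?;
            match (*op, lhs, rhs) {
                (BuiltinOp::Gt, Literal::Int(a), Literal::Int(b)) => Some(Literal::Bool(a > b)),
                (BuiltinOp::Eq, a, b) => Some(Literal::Bool(a == b)),
                _ => None,
            }
        }
    }
}
```

With likes bound to 25, evaluating the likes > 10 AST yields Bool(true); a node without a likes field yields None in this sketch, which a caller would treat as "does not match".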
23.2.5 Projection
By default, every field on a matched node is included in the result. Projection lets you ask for only the fields you care about.
Get just the titles of all documents:
let query = InstanceQuery {
    anchor: "document".into(),
    project: Some(vec!["title".into()]),
    ..Default::default()
};
let results = execute(&query, &instance);
// each result has only the "title" field
This is useful when nodes carry many fields but you only need one or two. Projection does not affect filtering; the predicate still has access to all fields when it evaluates.
23.2.6 Combining everything
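A realistic query combines several of these pieces. Find the first 20 annotations with confidence above 0.8 that are reachable from layer nodes, returning only the label and confidence fields. Note that the query below anchors on every layer node: the predicate is evaluated against the annotation nodes reached via path, so restricting the search to a single layer such as "markup" would need an additional condition.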
let query = InstanceQuery {
    anchor: "layer".into(),
    path: vec!["annotations".into()],
    predicate: Some(Expr::Builtin(
        BuiltinOp::Gt,
        vec![
            Expr::Var("confidence".into()),
            Expr::Lit(Literal::Float(0.8)),
        ],
    )),
    project: Some(vec!["label".into(), "confidence".into()]),
    limit: Some(20),
    ..Default::default()
};
let results = execute(&query, &instance);
The pipeline executes left to right: anchor on layer, follow annotations edges, keep nodes where confidence > 0.8, take at most 20, and project to label and confidence.
23.3 Practical examples
23.3.1 Collecting field values across nodes
A common pattern is collecting all values of a particular field across matching nodes. Suppose you want all distinct tags used in your posts:
let query = InstanceQuery {
    anchor: "post".into(),
    project: Some(vec!["tags".into()]),
    ..Default::default()
};
let results = execute(&query, &instance);
// collect the tags value of every post (not yet deduplicated)
let all_tags: Vec<&Value> = results
    .iter()
    .filter_map(|r| r.fields.get("tags"))
    .collect();
The query itself returns every post with only its tags field. The aggregation (collecting unique values) happens in your application code. The query engine handles selection and projection; your code handles aggregation.
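The deduplication step that is left to application code might look like the sketch below. It assumes, purely for illustration, that a tags field holds either a single string or a list of strings; the Value and QueryMatch types are hypothetical stand-ins, and the real result types in panproto may be shaped differently.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for a field value (NOT panproto's `Value`).
enum Value {
    Str(String),
    List(Vec<Value>),
}

/// Hypothetical stand-in for one query result.
struct QueryMatch {
    fields: HashMap<String, Value>,
}

/// Collect the distinct tag strings across all results, in sorted order.
/// Results without a `tags` field, and non-string list items, are skipped.
fn unique_tags(results: &[QueryMatch]) -> BTreeSet<&str> {
    let mut tags = BTreeSet::new();
    for result in results {
        match result.fields.get("tags") {
            Some(Value::Str(tag)) => {
                tags.insert(tag.as_str());
            }
            Some(Value::List(items)) => {
                for item in items {
                    if let Value::Str(tag) = item {
                        tags.insert(tag.as_str());
                    }
                }
            }
            None => {}
        }
    }
    tags
}
```

A BTreeSet is used here so that duplicates collapse on insertion and the result comes back in a stable, sorted order; a HashSet would work equally well when order does not matter.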
23.3.2 Filtering with compound predicates
Predicates can combine multiple conditions. Find posts that have more than 10 likes and were written by a specific author:
let query = InstanceQuery {
    anchor: "post".into(),
    predicate: Some(Expr::Builtin(
        BuiltinOp::And,
        vec![
            Expr::Builtin(
                BuiltinOp::Gt,
                vec![
                    Expr::Var("likes".into()),
                    Expr::Lit(Literal::Int(10)),
                ],
            ),
            Expr::Builtin(
                BuiltinOp::Eq,
                vec![
                    Expr::Var("author".into()),
                    Expr::Lit(Literal::Str("alice".into())),
                ],
            ),
        ],
    )),
    ..Default::default()
};
In the surface syntax this would be likes > 10 && author == "alice".
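When many predicates are built by hand, a handful of small constructor functions can bring the Rust code closer to the surface syntax. The helpers below (var, int, text, gt, eq, and) are hypothetical and are not claimed to be part of panproto_expr; the enum definitions are local stand-ins that imitate the constructor names used above so that the sketch compiles on its own.

```rust
// Local stand-ins that imitate the constructor names used in this chapter.
// They are NOT the panproto_expr definitions.
#[derive(Clone, Debug, PartialEq)]
enum Literal {
    Int(i64),
    Str(String),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum BuiltinOp {
    Gt,
    Eq,
    And,
}

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Var(String),
    Lit(Literal),
    Builtin(BuiltinOp, Vec<Expr>),
}

// Hypothetical helper constructors for leaves.
fn var(name: &str) -> Expr {
    Expr::Var(name.into())
}
fn int(n: i64) -> Expr {
    Expr::Lit(Literal::Int(n))
}
fn text(s: &str) -> Expr {
    Expr::Lit(Literal::Str(s.into()))
}

// Hypothetical helper constructors for binary builtins.
fn binary(op: BuiltinOp, lhs: Expr, rhs: Expr) -> Expr {
    Expr::Builtin(op, vec![lhs, rhs])
}
fn gt(lhs: Expr, rhs: Expr) -> Expr {
    binary(BuiltinOp::Gt, lhs, rhs)
}
fn eq(lhs: Expr, rhs: Expr) -> Expr {
    binary(BuiltinOp::Eq, lhs, rhs)
}
fn and(lhs: Expr, rhs: Expr) -> Expr {
    binary(BuiltinOp::And, lhs, rhs)
}

/// Builds the AST for: likes > 10 && author == <author>
fn popular_posts_by(author: &str) -> Expr {
    and(gt(var("likes"), int(10)), eq(var("author"), text(author)))
}
```

Calling popular_posts_by("alice") produces the same tree as the hand-written predicate in the query above, in one line instead of twenty.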
23.3.3 Using comprehensions to reshape results
List comprehensions in the expression language are useful for computed predicates and transforms. You might use a comprehension inside a ComputeField to derive a summary value:
let active = [ u | u <- users, u.lastLogin > cutoff ]
in length active
This counts users whose lastLogin exceeds a cutoff. As a ComputeField expression, it would compute a derived metric on a summary node.
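If the comprehension syntax is unfamiliar, it may help to see the same computation as a Rust iterator chain: the generator becomes iter(), the guard becomes filter, and the yielded expression becomes map. The User struct and both function names are hypothetical, chosen only to mirror the expression above.

```rust
/// Hypothetical user record; `last_login` is a timestamp in seconds.
struct User {
    last_login: u64,
}

/// Rust analogue of:
///   let active = [ u | u <- users, u.lastLogin > cutoff ] in length active
fn count_active(users: &[User], cutoff: u64) -> usize {
    // Generator `u <- users` becomes `iter()`; the guard becomes `filter`.
    let active: Vec<&User> = users.iter().filter(|u| u.last_login > cutoff).collect();
    active.len()
}

/// Rust analogue of: [ x + 1 | x <- xs, x > 0 ]
fn increment_positives(xs: &[i64]) -> Vec<i64> {
    // The guard runs before the yielded expression, as in the comprehension.
    xs.iter().filter(|x| **x > 0).map(|x| x + 1).collect()
}
```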
23.4 CLI usage
The schema expr subcommands let you work with expressions from the command line. These are useful for quick experiments, debugging predicates, and validating syntax before embedding expressions in migration definitions.
23.4.1 Parsing
Parse an expression and see its AST:
schema expr parse "x + 1"
Output (abbreviated):
Builtin(
    Add,
    [
        Var("x"),
        Lit(Int(1)),
    ],
)
This shows you exactly how the parser interprets the surface syntax, which is useful when you are not sure whether an expression parses the way you expect.
23.4.2 Evaluation
Evaluate a closed expression (one with no free variables) and see the result:
schema expr eval "2 + 3"
Output:
5
schema expr eval "if True then 42 else 0"
Output:
42
Evaluation only works for expressions with no free variables, since the CLI does not provide an environment. For expressions that reference fields, use the REPL (schema expr repl) or embed them in a query.
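Whether an expression is closed can be decided by computing its free variables: a variable is free unless an enclosing lambda or let binds it. The sketch below shows the idea on a tiny hypothetical expression type. It is not panproto's Expr, and it assumes a non-recursive let, where the bound name is visible in the body but not in its own definition; the chapter does not say which rule panproto follows.

```rust
use std::collections::BTreeSet;

/// Tiny stand-in expression type for illustration only.
enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    /// \param -> body
    Lam(String, Box<Expr>),
    /// let name = value in body
    Let(String, Box<Expr>, Box<Expr>),
}

/// Names that occur in `expr` without being bound by a lambda or let.
fn free_vars(expr: &Expr) -> BTreeSet<String> {
    match expr {
        Expr::Int(_) => BTreeSet::new(),
        Expr::Var(name) => BTreeSet::from([name.clone()]),
        Expr::Add(lhs, rhs) => {
            let mut vars = free_vars(lhs);
            vars.extend(free_vars(rhs));
            vars
        }
        Expr::Lam(param, body) => {
            let mut vars = free_vars(body);
            vars.remove(param);
            vars
        }
        Expr::Let(name, value, body) => {
            // Assumption: `name` is bound in `body` only (non-recursive let).
            let mut vars = free_vars(body);
            vars.remove(name);
            vars.extend(free_vars(value));
            vars
        }
    }
}

/// A closed expression has no free variables and needs no environment.
fn is_closed(expr: &Expr) -> bool {
    free_vars(expr).is_empty()
}
```

Under this definition 2 + 3 and let x = 1 in x + 2 are closed, while x + 1 is not, which matches the split between expressions the CLI can evaluate directly and those that need an environment.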
23.4.3 Formatting
Canonicalize expression formatting:
schema expr fmt "\x->x+ 1"
Output:
\x -> x + 1
This is the expression equivalent of rustfmt or prettier. It parses the expression and pretty-prints it in canonical form, normalizing whitespace and parenthesization.
23.4.4 Syntax checking
Validate that an expression parses without evaluating it:
schema expr check "let x = 1 in x + 2"
Output:
OK
schema expr check "let x = in"
Output:
parse error: unexpected token 'in' at byte 10
This is useful in CI pipelines or editor integrations where you want to catch syntax errors early.
23.5 TypeScript SDK usage
The TypeScript SDK wraps the WASM-compiled expression parser and query engine. The API mirrors the Rust types but uses TypeScript conventions.
23.5.1 Parsing expressions
import { parseExpr } from '@panproto/core';
const expr = parseExpr('\\x -> x + 1', panproto._wasm);
// => { type: 'lam', param: 'x', body: { type: 'builtin', op: 'Add', ... } }
The returned Expr object is a tagged union with a type discriminant. You can inspect it, serialize it, or pass it to evalExpr.
23.5.2 Evaluating expressions
import { evalExpr, parseExpr } from '@panproto/core';
const expr = parseExpr('x + 1', panproto._wasm);
const result = evalExpr(
  expr,
  { x: { type: 'int', value: 41 } },
  panproto._wasm,
);
// => { type: 'int', value: 42 }
The second argument is the environment: a record mapping variable names to Literal values. Each literal is a tagged object with type and value fields.
23.5.3 Executing queries
import { executeQuery, parseExpr } from '@panproto/core';
const predicate = parseExpr('likes > 10', panproto._wasm);
const matches = executeQuery(
  {
    anchor: 'post',
    predicate,
    projection: ['title', 'likes'],
    limit: 50,
  },
  instance,
  panproto._wasm,
);
for (const m of matches) {
  console.log(m.fields.title, m.fields.likes);
}
executeQuery serializes the query and instance to MessagePack, sends them to the WASM query engine, and deserializes the results. The returned QueryMatch objects have nodeId, anchor, value, and fields properties.
The parseExpr function accepts arbitrary expression source text. This means you can let users write query predicates as strings ("likes > 10 && author == \"alice\"") and parse them at runtime, rather than constructing the AST by hand. Validate with a try/catch around parseExpr to handle syntax errors gracefully.
23.5.4 Formatting expressions
import { formatExpr } from '@panproto/core';
const canonical = formatExpr('\\x->x + 1', panproto._wasm);
// => '\\x -> x + 1'
This is useful for normalizing user-entered expressions before storing or displaying them.
23.6 Summary
The expression language gives you a concise, functional notation for writing predicates and computed values. The query engine gives you a declarative way to select, filter, navigate, and project over instance graphs. Together they turn a schema instance from a static data structure into something you can interrogate.
The key ideas:
- Anchor selects which vertex type to query.
- Path navigates edges before matching, scoping a query to a subgraph.
- Predicate filters nodes using an expression evaluated against each node’s fields.
- Projection controls which fields appear in the results.
- Limit caps the number of results.
Expressions and queries compose with everything else in panproto. A predicate used in a query is the same Expr type used in ConditionalSurvival, Case branches, and ComputeField transforms. Learn the expression language once, use it everywhere.