29 LLVM Integration and JIT Compilation
panproto’s expression language (panproto-expr) is interpreted by default: each Expr node is pattern-matched at runtime and evaluated recursively. For migrations that touch millions of records, that per-node interpretation overhead adds up. The panproto-jit crate removes it by compiling expressions to native code via LLVM. Separately, panproto-llvm treats LLVM IR itself as a panproto protocol, so you can model compiler lowering as a theory morphism and apply the full migration/lens toolkit to IR transformations.
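To make the interpretation cost concrete, here is a minimal sketch of a tree-walking evaluator. The `Expr` enum below is a hypothetical four-variant stand-in, not the real panproto-expr type, and the wrapping overflow behaviour is an assumption made only so the sketch is total.

```rust
/// Hypothetical, heavily simplified expression type (illustration only;
/// the real panproto-expr `Expr` is not reproduced here).
#[derive(Debug, Clone)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

/// Tree-walking evaluation: every node costs one `match` plus a recursive
/// call per child, and that cost is paid again for every record processed.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Int(n) => *n,
        // Wrapping arithmetic is an assumption of this sketch.
        Expr::Add(a, b) => eval(a).wrapping_add(eval(b)),
        Expr::Mul(a, b) => eval(a).wrapping_mul(eval(b)),
        Expr::Neg(a) => eval(a).wrapping_neg(),
    }
}

/// Builds the expression (2 + 3) * -(4).
fn sample() -> Expr {
    Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Int(2)), Box::new(Expr::Int(3)))),
        Box::new(Expr::Neg(Box::new(Expr::Int(4)))),
    )
}
```

Evaluating `sample()` visits six nodes to do three arithmetic operations. A compiled version of the same expression would reduce to straight-line machine arithmetic, which is the gap the JIT is meant to close.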
29.1 panproto-llvm
29.1.1 Module map
| Module | Purpose |
|---|---|
| protocol | LLVM IR protocol definition (31 vertex kinds, 13 edge rules, 56 opcodes) |
| lowering | Theory morphisms: language AST to LLVM IR |
| parse_ir | inkwell-based .ll text parser (feature-gated) |
| error | LlvmError |
29.1.2 Protocol definition
The LLVM IR protocol is composed from colimit(ThGraph, ThConstraint, ThOrder) with vertex kinds for:
- Module level: module, function, global-variable, alias
- Function level: basic-block, parameter
- Instructions: instruction with opcode constraint sort
- Types: void-type, integer-type, float-type, pointer-type, array-type, vector-type, struct-type, function-type
- Values: constant, undef, poison, null, zero-initializer
29.1.3 Lowering morphisms
Three lowering morphisms are defined:
- lower_typescript(): ThTypeScriptFullAST to ThLLVMIRSchema
- lower_python(): ThPythonFullAST to ThLLVMIRSchema
- lower_rust(): ThRustFullAST to ThLLVMIRSchema
Each maps AST sorts to LLVM IR sorts (e.g., function_declaration to function, binary_expression to instruction) and edge kinds (e.g., body to entry-block, condition to operand).
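As a rough illustration of what such a mapping carries, the sketch below models a lowering as two name-to-name tables, one for sorts and one for edge kinds. The `Lowering` struct, `sketch_lowering`, and `map_edge` are hypothetical names invented for this example, and only the four pairs mentioned above are filled in. The real morphism type in panproto-llvm is not shown here and presumably carries more structure.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a lowering morphism: one table for sorts
/// (vertex kinds) and one for edge kinds. Not the panproto-llvm type.
struct Lowering {
    sorts: HashMap<&'static str, &'static str>,
    edges: HashMap<&'static str, &'static str>,
}

/// Populates the tables with the example pairs from the text only.
fn sketch_lowering() -> Lowering {
    Lowering {
        sorts: HashMap::from([
            ("function_declaration", "function"),
            ("binary_expression", "instruction"),
        ]),
        edges: HashMap::from([
            ("body", "entry-block"),
            ("condition", "operand"),
        ]),
    }
}

impl Lowering {
    /// Translates one AST edge, given as (source sort, edge kind, target sort),
    /// into its LLVM IR image. Returns `None` if any component has no image.
    fn map_edge(
        &self,
        src: &str,
        kind: &str,
        tgt: &str,
    ) -> Option<(&'static str, &'static str, &'static str)> {
        Some((
            *self.sorts.get(src)?,
            *self.edges.get(kind)?,
            *self.sorts.get(tgt)?,
        ))
    }
}
```

Mapping the endpoints and the edge kind together reflects the idea that a morphism has to send each edge to an edge between the images of its endpoints. A lookup that fails on any of the three components signals a part of the AST theory the sketch does not cover.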
29.1.4 inkwell IR parser
parse_llvm_ir creates an inkwell Context, parses the IR text via MemoryBuffer::create_from_memory_range_copy, and walks the module:
- Root module vertex with target-triple constraint
- Functions with linkage constraint
- Parameters with type-of constraint
- Basic blocks with block-label constraint and entry-block edge
- Instructions with opcode and ssa-name constraints
29.2 panproto-jit
29.2.1 Module map
| Module | Purpose |
|---|---|
| codegen | JitCompiler and CompiledExpr (feature-gated) |
| mapping | classify_expr and ExprMapping classification |
| error | JitError |
29.2.2 JIT compiler architecture
JitCompiler::new leaks an LLVM Context. The leak is intentional: construct one compiler and reuse it across expressions rather than creating a new one per expression. compile(&expr) creates a module, builds a function __panproto_eval() -> i64, compiles the expression body, and JIT-executes it via ORC.
Key codegen methods:
- compile_expr: dispatches on Expr variant
- compile_builtin: dispatches on BuiltinOp, delegates to compile_int_binop (shared binary arithmetic), compile_int_cmp (shared comparison), compile_round (shared floor/ceil)
- compile_match: cascading br with phi node for pattern matching
- compile_literal: i64/f64/i1 constants (i64 uses from_ne_bytes for sign-safe casting)
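The from_ne_bytes note on compile_literal can be illustrated in isolation. Integer-constant constructors in LLVM bindings generally take an unsigned 64-bit payload (inkwell's `IntType::const_int` takes a `u64`), so a signed literal has to be reinterpreted bit for bit. The helper names below are hypothetical, and this is a sketch of the technique rather than the crate's actual code.

```rust
/// Reinterprets an `i64` as the `u64` with the same bit pattern.
/// Going through native-endian bytes makes the "same bits" intent explicit
/// and avoids a numeric `as` cast, which stricter clippy lints flag as a
/// possible sign-loss bug.
fn i64_bits_as_u64(value: i64) -> u64 {
    u64::from_ne_bytes(value.to_ne_bytes())
}

/// Inverse direction, for reading a 64-bit result back as signed.
fn u64_bits_as_i64(bits: u64) -> i64 {
    i64::from_ne_bytes(bits.to_ne_bytes())
}
```

Because both directions use the same native byte order, the round trip is the identity for every input, including negative values and `i64::MIN`.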
29.2.3 Compilation mapping
classify_expr statically classifies each expression node into an ExprMapping:
- ArithmeticOp: maps to a single LLVM instruction (add, sub, icmp, etc.)
- ArrayLoop: compiles to a loop (map, filter, fold, flat_map)
- RuntimeCall: requires a runtime support function (string ops, list ops, etc.)
- Closure: lambda with captured variables
- LetBinding, PatternMatch, Constant, EnvLoad: structural codegen
classify_jittable_builtin (const fn) handles the 22 builtins compilable to direct LLVM instructions. classify_runtime_builtin handles the remaining 33 that need runtime functions.
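The two-tier split can be sketched as follows. The `BuiltinOp` variants, the three-variant `ExprMapping`, and the function names are hypothetical simplifications; the real enums cover 55 builtins and more mapping kinds. What the sketch does show is the shape: a `const fn` that recognises the directly compilable operations, with a fallback for everything else.

```rust
/// Hypothetical subset of builtin operations (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BuiltinOp {
    Add,
    Sub,
    Lt,
    Map,
    Fold,
    Concat,
}

/// Simplified stand-in for the mapping categories described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExprMapping {
    ArithmeticOp,
    ArrayLoop,
    RuntimeCall,
}

/// First tier: operations that lower to a single LLVM instruction.
/// Being a `const fn`, it can also be evaluated at compile time.
const fn classify_jittable(op: BuiltinOp) -> Option<ExprMapping> {
    match op {
        BuiltinOp::Add | BuiltinOp::Sub | BuiltinOp::Lt => Some(ExprMapping::ArithmeticOp),
        _ => None,
    }
}

/// Second tier: everything the first tier declined.
fn classify(op: BuiltinOp) -> ExprMapping {
    if let Some(mapping) = classify_jittable(op) {
        return mapping;
    }
    match op {
        BuiltinOp::Map | BuiltinOp::Fold => ExprMapping::ArrayLoop,
        _ => ExprMapping::RuntimeCall,
    }
}

/// Compile-time use of the const classifier.
const ADD_MAPPING: Option<ExprMapping> = classify_jittable(BuiltinOp::Add);
```

Keeping the first tier `const` means a table of directly compilable builtins can be checked or precomputed at compile time, while the fallback stays free to grow as runtime support functions are added.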
29.2.4 Lint configuration
panproto-jit uses per-crate lints instead of workspace lints because LLVM FFI requires unsafe_code = "allow". All other lint levels match the workspace (pedantic, nursery, unwrap_used = "deny").