Stack Machine & Compiler Correctness
Compiler Verification SystemsEnd-to-end verification of a compiler from arithmetic expressions to stack machine instructions — proving that compiled programs always produce correct results.
Overview
Compiler correctness is one of the most important problems in formal verification. An incorrect compiler can silently transform correct source code into buggy executables, undermining all source-level reasoning.
This case study presents two verified systems:
- A verified compiler — compiling arithmetic expressions to stack machine instructions, with a proof that compiled code always evaluates to the correct result
- A verified stack machine program — proving that a specific loop program correctly computes the sum
1 + 2 + ... + n
Part 1: Verified Compiler
The complete development is self-contained IML you can paste into ImandraX.
Source Language
type expr =
| Int of int
| Plus of expr * expr
| Times of expr * expr
| Sub of expr * expr
| Div of expr * expr
let rec eval_expr e =
match e with
| Int i -> i
| Plus (x,y) ->
let e1, e2 = eval_expr x, eval_expr y in e1 + e2
| Times (x,y) ->
let e1, e2 = eval_expr x, eval_expr y in e1 * e2
| Sub (x,y) ->
let e1, e2 = eval_expr x, eval_expr y in e1 - e2
| Div (x,y) ->
let e1, e2 = eval_expr x, eval_expr y in e1 / e2Target: Stack Machine
type instr = Add | Mult | Minus | Quot | Push of int
type state = { stack : int list }
let apply (s : state) (f : int -> int -> int) =
match s.stack with
| a :: b :: rst -> { stack = (f a b) :: rst }
| _ -> s
let step s inst =
match inst with
| Add -> apply s (+)
| Mult -> apply s ( * )
| Minus -> apply s (fun x y -> y - x)
| Quot -> apply s (fun x y -> y / x)
| Push i -> { stack = i :: s.stack }
let rec run s program =
match program with
| [] -> s
| i :: i' -> run (step s i) i'
[@@adm program]The Compiler
let rec compile (e : expr) : instr list =
match e with
| Int i -> [Push i]
| Plus (x,y) -> compile x @ compile y @ [Add]
| Times (x,y) -> compile x @ compile y @ [Mult]
| Sub (x,y) -> compile x @ compile y @ [Minus]
| Div (x,y) -> compile x @ compile y @ [Quot]Correctness Theorem
The main theorem: running the compiled code on any initial stack always pushes the correct value:
theorem compiler_correct s e =
run s (compile e) = { stack = (eval_expr e) :: s.stack }
[@@by induction ~id:[%id compiler_induct] () @>>| auto]The proof uses a custom induction scheme compiler_induct that captures the recursive structure of compilation — crucially, the inductive hypothesis for the second operand uses a stack already containing the first operand's result:
let rec compiler_induct s e =
let ih s x y =
compiler_induct s x
&& compiler_induct { stack = (eval_expr x) :: s.stack } y
in
match e with
| Plus (x,y) | Times (x,y) | Sub (x,y) | Div (x,y) -> ih s x y
| _ -> trueWhy Custom Induction Is Needed
Standard structural induction on expressions doesn't work for the compiler correctness theorem. The problem is that after compiling the first operand of Plus(x,y), the machine state has changed — eval_expr x is now on the stack. The inductive hypothesis for y must account for this modified state.
The compiler_induct scheme captures this by threading the evaluation result through:
- For
Plus(x,y): prove correctness forxwith current stacks, then prove correctness forywith stackeval_expr(x) :: s.stack - This ensures the IH for the second operand matches the actual machine state after the first operand's code has run
This is a classic challenge in compiler verification — the "accumulator threading" problem.
A key enabling lemma establishes that running concatenated programs is equivalent to running them sequentially:
lemma run_assoc s program_1 program_2 =
run s (program_1 @ program_2) = run (run s program_1) program_2
[@@by auto] [@@rw]Part 2: Stack Machine Program Verification
A more detailed verification of a specific stack machine program that computes sum(n) = 1 + 2 + ... + n:
let ex : prog = [
Const 0; (* 0: initialize accumulator *)
Store RA; (* 1: RA = 0 *)
Load RN; (* 2: push n *)
If_leq 10; (* 3: if n <= 0, jump to halt *)
Load RN; (* 4: push n *)
Load RA; (* 5: push accumulator *)
Add; (* 6: n + acc *)
Store RA; (* 7: RA = n + acc *)
Load RN; (* 8: push n *)
Const 1; (* 9: push 1 *)
Sub; (* 10: n - 1 *)
Store RN; (* 11: RN = n - 1 *)
Goto (-10); (* 12: jump back to loop head *)
Halt; (* 13: done *)
]The Correctness Theorem
theorem main stack ra rn s steps =
steps >= 11*rn + 5
&& rn >= 0
&& s = { halted=false; prog=ex; pc=0; stack; ra; rn }
==>
match run s steps with
{ halted; ra; _ } ->
halted=true && ra = sum rnThe proof proceeds by:
- Initialization — proving lines 0–1 set up the accumulator
- Loop body — proving each iteration correctly adds
rnto the accumulator and decrementsrn - Loop invariant — by induction with a custom scheme, proving the loop correctly computes
sum(rn) - Composition — combining initialization, loop, and termination
Verification Methodology
The stack machine proof proceeds in layers:
- Instruction table lemma — establishes what instruction lives at each program counter address
- Initialization (lines 0–1) — proves
RAis set to 0 in 2 steps - Base case (lines 2–3) — when
rn ≤ 0, the loop exits toHaltin 2 more steps - Loop body (lines 2–12) — one iteration adds
rntoraand decrementsrn, taking exactly 11 steps - Loop invariant — by induction with a custom measure
Ordinal.of_int rn, proves the loop correctly computessum(rn) - Final assembly — combines initialization, loop, and termination
Each layer uses unroll to verify fixed-length instruction sequences and %use to compose the layers.
End-to-End Correctness
The compiler is proved correct for all expressions and all initial machine states — not just tested on examples.
Custom Induction Schemes
Both proofs use carefully crafted induction schemes that capture the recursive structure of the problem.
Two Levels of Verification
From high-level compiler correctness to low-level instruction-by-instruction loop reasoning.
References
- Source:
compiler.iml,stack-machine.iml