Structurizing gotos
Problem
WebAssembly’s control flow is structured. Branches (br, br_if, br_table) only target enclosing block or loop constructs. There is no general indirect jump.
C’s goto is not structured. A goto can target any label in the same function, in any direction, across any control-flow boundary. break and continue are structured at the source level, but they have to be re-encoded along with goto once the function is rewritten as a state machine.
compiler.js handles this in two passes:
GOTO_NORMALIZER— an IIFE incompiler.jsexposing{ optimize, collectTraces, … }. Source-level AST rewrite that tries to make every(goto, label)pair resolvable by structured wasm scopes — hoisting labels and inserting fall-through plugs as needed. Best-effort.IRREDUCIBLE_LOWERING— the next IIFE in the file. Whole-function CFG rewrite. Invoked by the codegen as a per-function retry when structured emit hits an unresolvable goto. Total: any function it touches emerges as plain structured C the existing codegen handles unchanged.
This post is about the second pass.
When pass 2 runs
Not as a discrete pass over the unit. The codegen tries structured emit on every function and falls back to IRREDUCIBLE_LOWERING only on functions whose structured emit failed.
The structured codegen accumulates a gotoErrors array as it walks each function — every goto whose target it cannot resolve to a structured wasm scope produces an entry. Right before each function’s emit, the driver snapshots the lengths of the wasm body, locals, source-map entries, local-name entries, and gotoErrors. After the emit, if gotoErrors grew, the driver:
- Rolls back the appended bytes, locals, source-map entries, and local-name entries to the snapshot.
- Clears the new
gotoErrors. - Calls
IRREDUCIBLE_LOWERING.lower(fdef)to rewrite the function body. - Re-emits. The lowered body is plain structured C, so the second emit cannot produce goto errors.
Consequences:
- Functions whose structured emit succeeds pay no extra cost beyond five length-snapshots and one comparison. The fast path is the default.
- The classifier is the codegen. Whatever the structured emit cannot resolve, the retry rewrites — no separate analysis is needed, and there is no opportunity to misclassify a function.
--no-irreducible-loweringskips the retry. On affected functions, thegotoErrorsare surfaced as diagnostics and the compile fails.
An earlier version of the IIFE exposed a pre-classifier interface (needsLowering(funcDef) and a per-unit optimize(unit, options) driver) that walked the unit deciding which functions to rewrite. That interface was strictly worse than the codegen retry — it paid analysis cost on every function and risked misclassification — and was removed once the retry path was in place. The current public surface is just { lower }.
The lowering, end to end
IRREDUCIBLE_LOWERING.lower(funcDef) runs three sub-passes:
hoistDeclarations(funcDef)— move every local declaration to a flat list at function entry. Replace each in-place declaration with an assignment at the original site.buildSegments(rewrittenBody)— walk the AST and split it at every control-flow boundary into a flat array of segments (the AST equivalent of basic blocks), each ending in one of six terminators.synthesizeWrapper(funcDef, decls, segments)— build the new function body: hoisted decls, a state variable, and awhile (1) switch (__state)whose cases are the segments.
The output of pass 2 is plain structured C and feeds back through the existing codegen unchanged. No new codegen path is added; the lowering is purely AST→AST.
Sub-pass 1: hoist declarations
The reason hoisting is necessary: the state machine breaks every scope. Once a declaration’s lifetime is split across multiple case bodies of a switch, the original block-scoped storage no longer matches the visibility the program needs. The fix is to give every local declaration function-wide scope and keep its name unique.
hoistDeclarations produces { decls, rewrittenBody }:
- Walk the body. For each
SDeclcontainingDVars (skip static and extern locals), strip theinitExprand push theDVaronto the hoisted list. - If the original declaration had an initializer, emit an
SExprofname = initExprat the original site. The DVar is now plain, but the initialization still runs in source order. - Aggregate, array, and
EInitListinitializers are an exception — keep the originalinitExpron the hoistedDVar; the codegen runs them at the function-entry declaration point. - Maintain a
Set<string>of names seen so far. When a nested-scope DVar shadows an outer one, α-rename it (x→x__2) by mutatingDVar.name.EIdentnodes hold theDVarby reference, so every use site updates automatically.
Output invariant: every DVar in the function has a unique name within the function, lives in decls, and is initialized (if at all) by an explicit assignment at its original source position.
Sub-pass 2: build segments (the partial CFG)
This is the heart of the lowering. The output is a flat array of segments, where each segment is a maximal run of straight-line statements bounded by one terminator. Together they form a graph: terminators reference segments by integer id.
It is a “partial” CFG in the sense that it carries only the structure the rewrite needs — no dominance, no SSA, no def–use chains. Each segment knows what to execute and what to do next; nothing more.
Data structures
// Segment ≈ basic block. Sorted by id; id becomes the switch-case value.
function newSegment(id) {
return { id, stmts: [], term: null };
}
// One of six terminator shapes.
const Term = {
fallthrough: (nextId) => ({ kind: "fallthrough", nextId }),
goto: (targetId) => ({ kind: "goto", targetId }),
branch: (cond, thenId, elseId)=> ({ kind: "branch", cond, thenId, elseId }),
switchDispatch: (scrutinee, cases, defaultTarget)
=> ({ kind: "switch", scrutinee, cases, defaultTarget }),
return: (expr) => ({ kind: "return", expr }),
halt: () => ({ kind: "halt" }),
};
Six terminator kinds, all the function-local control-flow shapes that survive past the AST level: an unconditional jump (fallthrough/goto — same encoding, separate kinds for diagnostics), a two-way conditional (branch), an N-way dispatch (switch), a function exit (return), and a “this segment is unreachable / end of function” marker (halt).
The visitor
buildSegments opens the entry segment and recursively visits the rewritten body. State carried during the walk:
curSeg— the segment currently being filled, ornullif a terminator was just emitted.nextId— counter for fresh segment ids.labelToId—Map<SLabel, int>, so a forwardgoto Lcan request the id before the label is visited.loopStack— stack of{ breakId, continueId }records. The top entry resolvesSBreakandSContinue.caseIdsStack— stack ofMap<SCase, int>. EachSSwitchpre-allocates one id perSCasein its body and pushes the map; theSCasehandler consults the top map to find its own id. This works regardless of how deeply thecaselabel is nested inside compound statements.
The visitor’s job per node:
| AST node | What the visitor does |
|---|---|
SEmpty, SCompound | Nothing / recurse. |
SLabel(L) | Close current with fallthrough(L_id). Open L_id. |
SGoto(L) | Close current with goto(L_id). If unresolved, close with halt. |
SIf(cond, then, else?) | Allocate thenId, elseId (if else exists), joinId. Close with branch(cond, thenId, elseId ?? joinId). Visit each branch, falling through to joinId at the end. Open joinId. |
SWhile(cond, body) | Allocate hdrId, bodyId, exitId. Close current with fallthrough(hdrId). Open header, close with branch(cond, bodyId, exitId). Push {breakId: exitId, continueId: hdrId}, visit body, fall through to hdrId. Pop, open exitId. |
SDoWhile(body, cond) | Allocate bodyId, testId, exitId. Fall into body. Push {breakId: exitId, continueId: testId}, visit. Fall to testId. Test closes with branch(cond, bodyId, exitId). |
SFor(init, cond, inc, body) | Lower init into the current segment. Allocate hdrId, bodyId, contId, exitId. contId is the increment segment, not the header — that’s why for has a distinct continue target. continue jumps to contId, which contains inc and falls through to hdrId. |
SSwitch(expr, body) | Pre-allocate one id per SCase in body.caseBag; build dispatchCases and defaultTarget. Close current with switchDispatch(expr, cases, defaultTarget). Push {breakId: postId, continueId: null} — continueId: null is how the loopStack signals “switches don’t catch continue.” |
SCase(v) | Close with fallthrough(id_from_top_of_caseIdsStack). Open that id. |
SBreak | Close with goto(topBreakTarget()). |
SContinue | Close with goto(topLoop().continueId), where topLoop() walks the loopStack from the top, skipping entries with continueId === null (those are switches). |
SReturn(e) | Close with return(e). |
Other (SExpr, surviving SDecl, …) | Append to curSeg.stmts. |
After the walk, any still-open segment is closed with halt (function fall-through).
The loopStack is doing all the work for break/continue. A for loop pushes a different continueId than its breakId because they target different segments (the increment block vs. the exit). A switch pushes continueId: null so a continue inside a switch inside a loop walks past the switch entry and finds the enclosing loop’s continue target — matching C semantics exactly.
Shape of the output
For a tiny function with a goto, a while, an if, and a return, the segment graph might look like this:
The same six terminator kinds — and only those — flow into the wrapper synthesis. Any C control-flow construct that survives parsing has been reduced to a combination of straight-line statements plus one of these six edges.
Sub-pass 3: synthesize the wrapper
The output shape:
{
/* hoisted DVar declarations */
int __irreducible_state;
__irreducible_state = 0;
while (1) {
switch (__irreducible_state) {
case 0: /* seg 0 stmts */ /* terminator */
case 1: /* seg 1 stmts */ /* terminator */
/* … */
default: break; /* exit while on unexpected state */
}
}
}
Each segment becomes one case. Inside the case body, the segment’s stmts come first; then a translation of its terminator:
| Terminator | Statements emitted |
|---|---|
fallthrough(n) | __state = n; break; |
goto(n) | __state = n; break; |
branch(cond, t, e) | __state = (cond) ? t : e; break; |
switch(scrut, cases, def) | inner switch(scrut) whose body assigns __state for each case and default, then break; |
return(e) | return e; |
halt | return <zero of return type>; for non-void; return; for void |
The break in fallthrough/goto/branch/switch is the inner break — it exits the switch (__state) block. Control flows back to the while (1), which iterates and re-enters the switch on the new state.
There’s a small bit of footwork around the break keyword serving two roles. The wrapper uses an SBreak AST node (rendered as break; in C) inside each case body to exit the inner switch. That same SBreak would, if found outside the wrapper’s synthesized switch, target the enclosing while (1). The wrapper ensures the inner switch is always the immediately enclosing breakable, so every emitted break targets the switch as intended.
The halt terminator is the unreachable / fall-off-the-end case. It fires when:
- The function source body ends without an explicit
return(well-defined forvoid, undefined behavior for non-void in C — but the lowering still has to emit valid C). Returning a zero of the function’s return type is safe and predictable. - A goto target is unresolved, or some code path exits a segment without ever transferring control elsewhere. These cases shouldn’t reach runtime; emitting a return is a conservative placeholder.
For pointer returns the wrapper fabricates an integer zero and relies on the codegen’s existing implicit-cast machinery. For aggregate / struct returns it emits a bare return; with no value, accepting the UB — the codegen may diagnose.
Cost
Static:
- One segment per basic block boundary in the source body. Branching and loops fan out: an
if/elsecreates 3 segments (then,else,join), aforcreates 4 (hdr,body,cont,exit), ann-wayswitchcreatesn + 1. - One
caseper segment, onebr_tablearm per case, one nesting level ofblockper case (wasm’s standardswitchencoding).
Runtime:
- Every former jump is now a state-variable assignment plus an inner-switch break plus a
br_tabledispatch on the next iteration. Where the structured codegen would emit a singlebrof constant depth, the lowered version emits three operations plus a memory-resident state read. - Branch prediction degrades because every jump in the function now flows through the same
br_table.
Functions that hit pass 2 in practice: Duff’s device, hand-written state machines, irreducible control-flow graphs from generated code, computed gotos (not supported here, but would land in the same place). Idiomatic hand-written C using the common goto patterns (goto cleanup, goto retry, goto error) almost always stays on pass 1.
Limitations (from the source comment)
- VLAs are rejected at parse time; their hoisting would need dynamic alloca.
setjmp/longjmpinteraction inside an irreducible function is not supported. Structured-mode functions still work.- Computed
goto(GCC extension) is not supported generally.
References
- Implementation: the
IRREDUCIBLE_LOWERINGIIFE incompiler.js, sitting between theGOTO_NORMALIZERIIFE (the pass-1 normalizer) and the LEB128 encoding utilities used by the wasm emitter. Public surface:{ lower }. Invoked from the per-function retry block inside the codegen’semitFunctionBodydriver. Sub-passes are nested functions inside the IIFE:hoistDeclarations,buildSegments(with theTermfactory andnewSegmenthelper),synthesizeWrapper. - Pre-pass: the
GOTO_NORMALIZERIIFE in the same file. Article touches it only briefly; the hoist-and-plug algorithm there is its own subject. - Disable flag:
--no-irreducible-lowering, parsed in the compiler’s command-line option handler. - Prior art: Emscripten’s
relooper(replaced bystackifierin modern Emscripten). LLVM’sWebAssemblyCFGStackifypass. Cheerp’s structurizer. - Reducibility analysis: Hecht & Ullman, “Flow Graph Reducibility,” 1972.