How Fict's compiler works

2026-04-18

A source-accurate walkthrough of the compiler pipeline currently implemented in this repository.

What This Article Is and Is Not

This article describes the implementation under packages/compiler/src as it exists today. It is not a cleaned-up idealized pipeline, and it intentionally distinguishes:

  • top-level stages that always run
  • analysis helpers that only run on some paths
  • representative IR snippets versus current emitted output

Unless otherwise noted, examples assume the source module imports from fict, so compiler-generated runtime imports also target the fict package family (for example fict/internal), not @fictjs/runtime/internal.

Compiler options materially affect behavior. In particular, fineGrainedDom, optimize, lazyConditional, getterCache, inlineDerivedMemos, resumable, strictReactivity, and strictGuarantee can change what gets emitted or whether compilation is allowed at all.

The Real Top-Level Flow

At the top level, the Babel plugin in packages/compiler/src/index.ts does this:

+--------------+
| Program.exit |
+--------------+
|
v
+---------------------------------------------------------------+
| 1. Collect macro imports, reactive import metadata, and |
| diagnostics context |
+---------------------------------------------------------------+
|
v
+---------------------------------------------------------------+
| 2. Validate placement and usage of $state / $effect / |
| hook-like patterns |
+---------------------------------------------------------------+
|
v
+---------------------------------------------------------------+
| 3. Run warning passes |
| (keys, spreads, dynamic access, control-flow guarantees, |
| ...) |
+---------------------------------------------------------------+
|
v
+----------------+
| 4. buildHIR() |
+----------------+
|
v
+--------------------------------------------------+
| 5. optimizeHIR(program) |
| optional, enabled by default |
+--------------------------------------------------+
|
v
+----------------------------------+
| 6. lowerHIRWithRegions(program) |
+----------------------------------+
|
v
+---------------------------------------------------------------+
| 7. Strip macro imports and rebuild Babel scope state |
+---------------------------------------------------------------+

That means the actual top-level pipeline is:

  • Front-end validation (index.ts): macro import checks, placement rules, warnings, strict-guarantee enforcement
  • Build HIR (ir/build-hir.ts): Babel AST to HIR/CFG
  • Optional optimization (ir/optimize.ts): pure-function SSA optimization or reactive optimization
  • Lowering (ir/codegen.ts): scopes, regions, structurization, helper imports, Babel AST emission

ssa.ts, scopes.ts, regions.ts, and structurize.ts are important, but they are mostly invoked inside optimization or lowering rather than as separate top-level passes in index.ts.

Stage 0: Front-End Validation Before HIR Exists

Before the compiler builds HIR, index.ts performs a large amount of source-level validation and bookkeeping.

Macro import and placement checks

The plugin first discovers macro aliases imported from fict, fict/slim, and fict/plus, then enforces rules such as:

  • $state() must be imported from fict
  • $state() must be assigned directly to an identifier
  • $state() cannot appear inside loops, conditionals, or nested non-reactive functions
  • $effect() must be imported from fict
  • $effect() cannot appear inside loops, conditionals, or nested non-reactive functions

This matters because many invalid programs are rejected before HIR construction.

Warning and fail-closed behavior

Current defaults are intentionally strict. createFictPlugin() normalizes options so that:

  • optimize defaults to true
  • fineGrainedDom defaults to true
  • lazyConditional defaults to true
  • getterCache defaults to true
  • inlineDerivedMemos defaults to true
  • strictGuarantee defaults to true unless explicitly disabled

With strictGuarantee: true, non-guaranteed reactivity cases are escalated to hard errors rather than silently compiling with weaker behavior.
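The defaulting behavior above can be pictured as a small normalization step. This is a hedged sketch, not the actual createFictPlugin() implementation; the option names come from this article, the helper shape is an assumption:

```typescript
// Hypothetical sketch of option normalization in the spirit of createFictPlugin().
// Every flag listed in the article defaults to true unless explicitly disabled.
interface FictOptions {
  optimize: boolean
  fineGrainedDom: boolean
  lazyConditional: boolean
  getterCache: boolean
  inlineDerivedMemos: boolean
  strictGuarantee: boolean
}

function normalizeOptions(raw: Partial<FictOptions> = {}): FictOptions {
  return {
    optimize: raw.optimize ?? true,
    fineGrainedDom: raw.fineGrainedDom ?? true,
    lazyConditional: raw.lazyConditional ?? true,
    getterCache: raw.getterCache ?? true,
    inlineDerivedMemos: raw.inlineDerivedMemos ?? true,
    strictGuarantee: raw.strictGuarantee ?? true,
  }
}
```

The `??` operator is what makes "true unless explicitly disabled" precise: passing `strictGuarantee: false` wins, while leaving it out falls back to the strict default.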

Cross-module metadata lookup starts here

index.ts also resolves previously-emitted module metadata for imports. That lets the compiler treat imported bindings as reactive when another module exported them as signal, memo, or store.

Stage 1: Build HIR

The first IR pass lives in packages/compiler/src/ir/build-hir.ts.

Why HIR exists

Fict does not try to reason directly on raw Babel AST after validation. Instead it builds a high-level IR that gives the compiler:

  • explicit basic blocks and terminators
  • a smaller, normalized expression set
  • preserved high-level constructs such as JSX, conditional expressions, optional chains, and template literals
  • a representation that both optimization and lowering can share

Destructuring is preprocessed first

Before statement-to-HIR conversion, build-hir.ts runs @babel/plugin-transform-destructuring and then rewrites object-rest helpers to Fict-specific forms such as __fictPropsRest.

That preprocessing is not an incidental detail. It is how the compiler preserves reactive props semantics while still operating on normalized assignments.
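The rest-helper rewrite can be pictured with a plain-object sketch. The real __fictPropsRest must additionally preserve reactive props semantics (for example, keeping property reads lazy); this body is an assumption that only shows the shape of the operation:

```typescript
// Sketch of an object-rest helper in the spirit of __fictPropsRest (assumed body).
// Copies every own enumerable property except the explicitly excluded keys.
function fictPropsRest<T extends object>(
  props: T,
  excluded: (keyof T)[],
): Partial<T> {
  const rest: Partial<T> = {}
  for (const key of Object.keys(props) as (keyof T)[]) {
    if (!excluded.includes(key)) rest[key] = props[key]
  }
  return rest
}
```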

Core HIR types

The exact HIR types live in hir.ts. At a high level:

type Instruction =
  | { kind: 'Assign'; target: Identifier; value: Expression; declarationKind?: 'const' | 'let' | 'var' | 'function' }
  | { kind: 'Expression'; value: Expression }
  | { kind: 'Phi'; variable: string; target: Identifier; sources: { block: BlockId; id: Identifier }[] }

type Terminator =
  | { kind: 'Return'; argument?: Expression }
  | { kind: 'Throw'; argument: Expression }
  | { kind: 'Jump'; target: BlockId }
  | { kind: 'Branch'; test: Expression; consequent: BlockId; alternate: BlockId }
  | { kind: 'Switch'; discriminant: Expression; cases: { test?: Expression; target: BlockId }[] }
  | { kind: 'ForOf'; ... }
  | { kind: 'ForIn'; ... }
  | { kind: 'Try'; ... }

type Expression =
  | Identifier
  | Literal
  | CallExpression
  | MemberExpression
  | OptionalMemberExpression
  | BinaryExpression
  | LogicalExpression
  | ConditionalExpression
  | TemplateLiteral
  | JSXElementExpression
  | ArrowFunctionExpression
  | FunctionExpression
  | ...

HIRProgram also preserves:

  • preamble
  • postamble
  • originalBody

That preserved body is important later, because codegen tries to rebuild the final module while keeping original statement ordering stable.

Representative HIR for a simple component

Consider this source:

import { $state } from 'fict'

function PriceTag({ price, currency }) {
  let discount = $state(0)
  const finalPrice = price - discount
  const label = finalPrice > 100 ? 'Premium' : 'Standard'
  const formatted = `${currency} ${finalPrice.toFixed(2)}`

  return (
    <div class={label === 'Premium' ? 'gold' : 'silver'}>
      <h2>{label}</h2>
      <span>{formatted}</span>
      <button onClick={() => discount++}>Apply Discount</button>
    </div>
  )
}

Its HIR is conceptually a single basic block, because the function body contains no statement-level branching:

Block 0
  Assign discount = CallExpression($state, [0])
  Assign finalPrice = BinaryExpression(price, '-', discount)
  Assign label = ConditionalExpression(finalPrice > 100, 'Premium', 'Standard')
  Assign formatted = TemplateLiteral(...)
  Return JSXElement(...)

Important nuance: the ternary is still just an expression. It does not create CFG branching on its own.

Stage 2: SSA and CFG Utilities

packages/compiler/src/ir/ssa.ts serves two related but distinct purposes:

  1. enterSSA(program) performs full SSA conversion
  2. analyzeCFG(blocks) computes reusable CFG facts

Those are not the same thing, and the current compiler uses them differently.

Full SSA conversion exists, but it is not a universal top-level stage

enterSSA() does the expected textbook work:

  • compute predecessors and successors
  • compute dominators and dominance frontiers
  • insert phi nodes
  • rename definitions and uses to SSA versions
  • eliminate redundant phi nodes

The naming scheme uses $$ssa:

makeSSAName('count', 2) // count$$ssa2

One implementation detail matters here: getSSABaseName() is intentionally conservative. It strips suffixes from compiler-generated SSA names, but it does not blindly rewrite every user identifier that happens to end in $$ssaN.
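The naming scheme and its conservative inverse can be sketched as follows. The `$$ssa` format is from the source; the set-based guard is an assumption about how the conservative stripping distinguishes compiler-generated names from user identifiers:

```typescript
// Minimal sketch of the $$ssa naming helpers (guard mechanism assumed).
function makeSSAName(base: string, version: number): string {
  return `${base}$$ssa${version}`
}

function getSSABaseName(name: string, generated: Set<string>): string {
  // Only strip the suffix from names the compiler itself generated, so a
  // user identifier that merely happens to end in $$ssaN is left alone.
  if (!generated.has(name)) return name
  const match = /^(.*)\$\$ssa\d+$/.exec(name)
  return match ? match[1] : name
}
```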

CFG analysis is used more broadly than SSA renaming

Even when the whole function is not renamed into SSA form, analyzeCFG() is reused by:

  • scopes.ts
  • structurize.ts
  • other control-flow-sensitive lowering decisions

So the correct mental model is:

  • SSA conversion is a targeted optimization tool
  • CFG analysis is a shared compiler utility
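As a flavor of the reusable facts analyzeCFG() computes, here is a predecessor computation over a simplified block shape. The block type is an assumption, not the compiler's actual HIR type:

```typescript
// Sketch: predecessors are derivable from successor edges alone, which is
// why one CFG analysis can serve scopes.ts, structurize.ts, and lowering.
interface SimpleBlock { id: number; successors: number[] }

function computePredecessors(blocks: SimpleBlock[]): Map<number, number[]> {
  const preds = new Map<number, number[]>(blocks.map(b => [b.id, []]))
  for (const block of blocks) {
    for (const succ of block.successors) {
      preds.get(succ)?.push(block.id)
    }
  }
  return preds
}
```

For a diamond CFG (0 branches to 1 and 2, both jump to 3), block 3 ends up with predecessors [1, 2] and the entry block with none.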

Representative phi-node example

If a function does enter SSA and contains statement-level branching, merge blocks can receive phi nodes:

function Example() {
  let x = $state(0)
  let label
  if (x > 5) {
    label = 'high'
  } else {
    label = 'low'
  }
  return <span>{label}</span>
}

Conceptually, the join block looks like:

Block 3
  Phi label$$ssa2 = φ(Block1: label$$ssa0, Block2: label$$ssa1)
  Return ...

That example is valid as an SSA illustration. It is not evidence that every reactive component takes a mandatory global “Enter SSA” pass before lowering.

Stage 3: Optimization

Optimization lives in packages/compiler/src/ir/optimize.ts, and the current compiler has two optimization paths.

Path A: pure-function optimization

If a function is considered a pure optimization candidate, the compiler explicitly enters SSA and runs a classic SSA-style pipeline:

ssaFn = enterSSA(...)
ssaFn = propagateConstants(ssaFn)
ssaFn = eliminateCommonSubexpressions(ssaFn)
ssaFn = inlineSingleUse(ssaFn)
ssaFn = eliminateDeadCode(ssaFn)
ssaFn = eliminatePhiNodes(ssaFn)

This is where the article-worthy “full SSA optimization pipeline” really exists today.

Path B: reactive optimization

Non-pure functions take the reactive path instead. This path currently:

  • analyzes reactive scopes with CFG-aware metadata
  • builds reactive and purity contexts
  • optimizes blocks locally
  • optionally propagates constants across blocks
  • performs cross-block common-subexpression elimination
  • inlines single-use derived memos when allowed
  • builds a reactive dependency graph
  • runs reactive dead-code elimination from observable roots

Crucially, this path is not “enter SSA, then do everything else.” It works on the existing HIR plus CFG/scope information.

Reactive graph, not just local algebra

For reactive code, the optimizer cares about observability, not only local expression simplification.

Observable roots include things such as:

  • returned values
  • bindings
  • effects
  • explicit memos
  • exported reactive values

That is why reactive DCE is graph-based rather than just “remove unused assignments.”
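Graph-based DCE of this kind reduces to reachability: keep every node a root can reach by following dependency edges backwards. The node and edge shapes here are assumptions, not the optimizer's actual types:

```typescript
// Sketch of reactive dead-code elimination as reverse reachability from
// observable roots (returns, bindings, effects, exports).
function liveNodes(
  deps: Map<string, string[]>, // node -> names it reads
  roots: string[],             // observable roots
): Set<string> {
  const live = new Set<string>()
  const stack = [...roots]
  while (stack.length > 0) {
    const node = stack.pop()!
    if (live.has(node)) continue
    live.add(node)
    for (const dep of deps.get(node) ?? []) stack.push(dep)
  }
  return live
}
```

A derived value that reads a signal but is never observed by any root stays out of the live set, which a purely local "remove unused assignments" pass could not decide safely.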

Stage 4: Reactive Scope Analysis

Reactive scope analysis lives in packages/compiler/src/ir/scopes.ts.

The base pass, analyzeReactiveScopes(fn), does this:

  1. collect writes and reads for each definition-like scope
  2. compute dependencies between scopes
  3. find escaping variables from return values
  4. mark scopes with external effects
  5. compute a memoization heuristic
  6. merge overlapping scopes
  7. prune scopes that do not contribute to escaping or memoized results

The current scope model

The core shape is:

interface ReactiveScope {
  id: number
  declarations: Set<string>
  writes: Set<string>
  reads: Set<string>
  blocks: Set<number>
  dependencies: Set<string>
  dependencyPaths: Map<string, DependencyPath[]>
  hasExternalEffect: boolean
  shouldMemoize: boolean
}

Optional-chain dependency paths

scopes.ts tracks dependency paths such as user?.profile?.name, not just base names. That feeds later subscription decisions.

The important guarantee is limited but useful:

  • the compiler can distinguish base-object dependency from property-path dependency
  • later passes can choose between whole-object and property-level subscriptions more precisely
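One plausible form of that choice, sketched under assumptions (the function name and shapes are hypothetical, not the scopes.ts API): if any read touches the bare base object, path-level subscriptions are unsafe and the whole object must be subscribed.

```typescript
// Hypothetical sketch: pick whole-object vs property-path subscriptions
// from the recorded dependency paths of one base object.
function chooseSubscriptions(
  reads: string[][], // e.g. [['user', 'profile', 'name']] for user?.profile?.name
): { whole: boolean; paths: string[] } {
  const bareBaseRead = reads.some(path => path.length === 1)
  if (bareBaseRead) return { whole: true, paths: [] }
  return { whole: false, paths: reads.map(path => path.join('.')) }
}
```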

SSA-enhanced scope analysis

The higher-level API used by lowering is analyzeReactiveScopesWithSSA(fn). Despite the name, it does not first run enterSSA().

Instead it returns:

  • the base scope analysis result
  • CFG analysis (loopHeaders, backEdges, dominator tree)
  • control-flow read analysis
  • loop-dependent scope metadata

That enhancement is enough for codegen decisions such as:

  • whether a dependency requires re-execution because it affects control flow
  • whether a scope is loop-dependent

Memoization heuristic is intentionally simple

Current shouldMemoizeScope() is heuristic, not an exact cost model. It mainly looks at:

  • whether the scope has dependencies
  • whether those dependencies come from other active scopes
  • whether the scope touches the entry block
  • whether it spans multiple blocks
  • whether it contributes to escaping values

It does not currently implement a rich “cheap expression” cost model.
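The factors above can be combined into a boolean sketch. The exact ordering and weighting inside shouldMemoizeScope() is not reproduced here; this only shows the flavor of a factor-based heuristic:

```typescript
// Heuristic sketch mirroring the listed factors (weighting assumed).
interface ScopeFacts {
  hasDependencies: boolean
  dependsOnActiveScope: boolean
  touchesEntryBlock: boolean
  spansMultipleBlocks: boolean
  contributesToEscaping: boolean
}

function shouldMemoizeSketch(facts: ScopeFacts): boolean {
  if (!facts.hasDependencies) return false       // nothing reactive to track
  if (!facts.contributesToEscaping) return false // result is never observed
  // Cross-scope or cross-block work is where memoization tends to pay off.
  return facts.dependsOnActiveScope
    || facts.spansMultipleBlocks
    || !facts.touchesEntryBlock
}
```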

Stage 5: Regions

Regions live in packages/compiler/src/ir/regions.ts.

Regions are the bridge between abstract scopes and concrete emitted code:

interface Region {
  id: number
  scopeId: number
  blocks: Set<BlockId>
  instructions: Instruction[]
  dependencies: Set<string>
  declarations: Set<string>
  hasControlFlow: boolean
  hasJSX: boolean
  shouldMemoize: boolean
  children: Region[]
  parentId?: number
}

How regions are actually created

generateRegions(fn, scopeResult, shapeResult) creates regions only for scopes that are relevant to emitted reactivity:

  • scopes with hasExternalEffect
  • scopes with shouldMemoize

For each such scope, the compiler:

  • collects owned instructions
  • records whether the covered blocks contain control flow
  • records whether any owned expression contains JSX
  • computes dependencies
  • refines subscriptions using shape analysis from shapes.ts

Shape analysis matters here

regions.ts uses analyzeObjectShapes() plus helpers such as:

  • getPropertySubscription()
  • shouldUseWholeObjectSubscription()

So a region can subscribe to a narrower set like user.name or user.email when the shape analysis says that is safe, instead of always subscribing to all of user.

A key current implementation detail: internal region memos

Older conceptual descriptions often say “a derived variable becomes one memo.”

Current Fict output is a little more structured than that. Regions with outputs often compile into an internal memo that returns an object of outputs, for example:

const __region_0 = __fictUseMemo(
  __fictCtx,
  () => {
    const finalPrice = __fictUseMemo(
      __fictCtx,
      () => price() - discount(),
      { name: 'finalPrice' },
      1000,
    )
    return { finalPrice }
  },
  { internal: true },
  1001,
)

That is a more accurate description of current output than the older “just one const finalPrice = useMemo(...)” story.

Stage 6: Structurization

Structurization lives in packages/compiler/src/ir/structurize.ts.

Its job is to turn CFG-style blocks back into structured nodes such as:

  • if
  • while
  • for
  • forOf
  • forIn
  • switch
  • try
  • return
  • throw

It also has a fallback:

{ kind: 'stateMachine', blocks: ..., entryBlock: ... }

Important nuance: structurization is a utility, not always a separate top-level phase

The current compiler does not always run a clean standalone:

regions -> structurize -> codegen

Instead, structurization is used where lowering needs structured control flow:

  • region-based lowering
  • some pure-function lowering paths
  • fallback handling when CFG shape is awkward

Safety behavior

structurizeCFG() tracks:

  • depth limits
  • emitted blocks
  • problematic blocks
  • shared side-effect blocks
  • unreachable/unemitted reachable blocks

If needed, it can throw StructurizationError, or it can fall back to a state-machine node.

That fallback is real in the current implementation, so any rigorous description of Fict’s compiler should mention it.

Stage 7: Lowering and Code Generation

The main lowering entry point is lowerHIRWithRegions() in packages/compiler/src/ir/codegen.ts.

This is where several previously-separate concepts come together:

  • runtime import-family detection
  • imported reactive metadata
  • hook-return metadata
  • top-level statement segmentation
  • function lowering
  • scope analysis
  • region generation
  • structurization where needed
  • final Babel AST emission

The runtime import family follows the source module

If a module already imports from fict, generated helpers target fict/internal.

If a module instead lives in a lower-level runtime-only integration, generated helpers can target @fictjs/runtime/internal.

That choice is made by detectRuntimeImportFamily() in constants.ts.

Props are not lowered as raw props.foo reads

For component props, codegen often emits reactive accessors such as:

const price = prop(() => __props.price)
const currency = prop(() => __props.currency)

So a precise article should talk about prop accessors, not just “props become getters somehow.”

Signals, memos, and effects depend on context

In component-like context, the compiler emits hook-aware helpers such as:

  • __fictUseContext
  • __fictUseSignal
  • __fictUseMemo
  • __fictUseEffect

At module level it can emit non-hook helpers such as:

  • createSignal
  • createMemo
  • createEffect

Hook slot numbering is explicit in the current implementation and starts from 1000.

Current Output Shape for PriceTag

Here is a representative excerpt from the current compiler output for the PriceTag example above:

import {
  __fictUseSignal,
  __fictUseMemo,
  template,
  resolvePath,
  getSlotEnd,
  insertBetween,
  createElement,
  addEventListener,
  bindClass,
  __fictUseContext,
  prop
} from "fict/internal";

function PriceTag(__props) {
  const price = prop(() => __props.price);
  const currency = prop(() => __props.currency);
  const __fictCtx = __fictUseContext();

  const discount = __fictUseSignal(__fictCtx, 0, { name: "discount" });

  const __region_0 = __fictUseMemo(__fictCtx, () => {
    const finalPrice = __fictUseMemo(
      __fictCtx,
      () => price() - discount(),
      { name: "finalPrice" },
      1000
    );
    return { finalPrice };
  }, { internal: true }, 1001);

  const { finalPrice } = __region_0();

  const __region_1 = __fictUseMemo(__fictCtx, () => {
    const label = __fictUseMemo(
      __fictCtx,
      () => finalPrice() > 100 ? "Premium" : "Standard",
      { name: "label" },
      1002
    );
    return { label };
  }, { internal: true }, 1003);

  const { label } = __region_1();
  const formatted = __fictUseMemo(
    __fictCtx,
    () => `${currency()} ${finalPrice().toFixed(2)}`,
    { name: "formatted" }
  );

  return __fictUseMemo(__fictCtx, () => {
    const __tmpl_1 = template("<div><h2><!--fict:slot:start--><!--fict:slot:end--></h2><span><!--fict:slot:start--><!--fict:slot:end--></span><button>Apply Discount</button></div>");
    // ...
    insertBetween(__el_2, __end_5, () => label(), createElement);
    insertBetween(__el_3, __end_6, () => formatted(), createElement);
    addEventListener(__el_4, "click", /* handler */, true);
    bindClass(__root_0, () => label() === "Premium" ? "gold" : "silver");
    return __root_0;
  }, 1004)();
}

Several details here are important because they correct older, less accurate explanations:

  • current output frequently introduces internal __region_* memos
  • child slots are often updated through insertBetween(...), not always bindText(...)
  • delegated events can compile to addEventListener(node, name, handler, true)
  • the DOM tree is located with resolvePath(...) and slot markers, not hand-written firstChild/nextSibling chains

JSX Lowering in the Current Compiler

The current compiler does not have one universal “JSX becomes bindText everywhere” rule. It chooses helpers based on the kind of child or binding.

Static template extraction

For fine-grained DOM output, the compiler builds HTML strings for static structure via template(...).

Template hoisting is opportunistic, not universal. In the current implementation it is especially used for list-render contexts to avoid repeated template parsing.

Dynamic child insertion

Dynamic child expressions often become:

  • resolvePath(...) to find the slot marker
  • getSlotEnd(...) to locate the slot boundary
  • insertBetween(...) to insert dynamic child content

That is why a child like {label} in an element body may not compile to a direct bindText call.

Direct text bindings still exist

When codegen has a direct text target, it may choose:

  • setText(...) for static or fused patch cases
  • bindText(...) for reactive text bindings

But that is only one lowering pattern among several.

Conditional children are specialized

Reactive conditional child expressions can lower to createConditional(...) plus onDestroy(...), rather than to a generic “re-run this block in useEffect” wrapper.

Lists are specialized too

Recognized .map(...) JSX children can lower through the list helpers path (createKeyedList and related support). In the current tree, that logic lives primarily in codegen-list-child.ts, codegen-list-keys.ts, and the runtime list helpers.

Event Handling and Delegation

The event story in the current compiler is more precise than “it always emits bindEvent.”

  • Delegated events are recognized via the compiler’s DelegatedEvents set.
  • For delegated cases without special options, the compiler commonly emits addEventListener(node, name, handler, true).
  • In the runtime, that delegated form lazily ensures document-level delegation via delegateEvents(...).
  • For non-delegated events or events with options such as capture, once, or passive, the compiler falls back to bindEvent(...).

So a rigorous description must distinguish:

  • compile-time delegated emission
  • runtime delegated installation
  • non-delegated per-node listeners
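The compile-time side of that distinction can be sketched as a small decision function. The DelegatedEvents contents below are a stand-in (the real set lives in the compiler source), and the helper name is an assumption:

```typescript
// Sketch of the delegated-vs-bindEvent lowering decision described above.
// The event names in this set are illustrative, not the compiler's real set.
const DelegatedEvents = new Set(['click', 'input', 'keydown'])

interface ListenerOptions { capture?: boolean; once?: boolean; passive?: boolean }

function lowerEventBinding(name: string, options?: ListenerOptions): string {
  const hasSpecialOptions = Boolean(
    options && (options.capture || options.once || options.passive),
  )
  if (DelegatedEvents.has(name) && !hasSpecialOptions) {
    // delegated form: addEventListener(node, name, handler, true)
    return 'addEventListener'
  }
  // non-delegated events, or events carrying options, fall back to bindEvent(...)
  return 'bindEvent'
}
```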

Getter Caching

When enabled, getter caching allows repeated synchronous reads of the same accessor to be reused rather than re-invoked.

This is a lowering detail, not a semantic phase of the IR pipeline, but it meaningfully affects emitted code quality. It is controlled by the getterCache option and implemented through codegen-cache.ts.

Cross-Module Reactive Metadata

Cross-module reactive metadata is represented by:

interface ModuleReactiveMetadata {
  exports: Record<string, 'signal' | 'memo' | 'store'>
  hooks?: Record<string, HookReturnInfoSerializable>
}

That means the current JSON shape is:

{
  "exports": {
    "count": "signal",
    "doubled": "memo"
  }
}

not:

{
  "exports": {
    "count": { "kind": "signal" }
  }
}

Metadata emission modes are also more nuanced than “write a sidecar next to the file”:

  • true: emit adjacent sidecars
  • false: emit nothing
  • 'auto': emit to the metadata cache directory when no external metadata store or resolver is supplied

The default cache directory is .fict-cache/metadata.
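The three-way behavior can be summarized in one decision function. The function name and the sidecar filename pattern are hypothetical; only the true/false/'auto' semantics and the .fict-cache/metadata default come from this article:

```typescript
// Sketch of the metadata emission-mode decision (names and paths assumed).
type EmitMetadata = boolean | 'auto'

function resolveMetadataTarget(
  mode: EmitMetadata,
  hasExternalStoreOrResolver: boolean,
  sourcePath: string,
): string | null {
  if (mode === true) return sourcePath + '.meta.json' // adjacent sidecar (name assumed)
  if (mode === false) return null                     // emit nothing
  // 'auto': use the cache directory unless an external store/resolver handles it
  if (hasExternalStoreOrResolver) return null
  return '.fict-cache/metadata/' + sourcePath.replace(/\//g, '_') + '.json'
}
```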

The Best Mental Model for the Current Compiler

If you want a model that matches the source tree well, use this one:

  1. index.ts validates source-level rules and builds compiler context.
  2. build-hir.ts converts Babel AST into HIR + CFG.
  3. optimize.ts runs either:
    • a true SSA pipeline for pure functions, or
    • a reactive optimization pipeline for non-pure functions.
  4. codegen.ts performs the heavy lowering work.
  5. Inside lowering, the compiler invokes:
    • scope analysis
    • region generation
    • structurization when structured control flow is needed
    • final Babel AST emission

That is more accurate than treating SSA, scopes, regions, structurization, and codegen as always-separate top-level compiler stages.

Summary

The current Fict compiler is best understood as:

  • a strict source validator first
  • a HIR/CFG compiler second
  • a dual-path optimizer
  • a lowering engine that uses scopes, regions, and structurization as internal tools

HIR is central. CFG analysis is central. SSA is real, but targeted. Regions are real, but they are a lowering construct, not the public pipeline boundary. Structurization is real, but it is a utility the lowering path calls when needed, with a state-machine fallback when necessary.

That combination is what lets the compiler accept plain component code, enforce Fict’s reactivity guarantees, and emit fine-grained runtime code without pretending that the implementation is simpler or more linear than it actually is.