ADR-0001: Strangler Fig Migration Protocol

Status: Accepted
Date: 2026-04-24
Deciders: Architect, PM Lead, Engineering Lead
Tags: migration, methodology

Context

Any migration of a large monolith to a new system takes multiple quarters. Teams that improvise the protocol end up with:

Inconsistent reconciliation standards across modules
No clear rollback path when things break
"How long is Stage 2?" answered differently per module
Stakeholder surprise ("I thought we shipped this already?")

A shared protocol with named stages and quantitative exit criteria solves all four.

Decision

Adopt the Strangler Fig protocol: a Stage 0 prep step followed by four migration stages (see Concepts: Strangler Fig for narrative). Every module migration uses the same stage names, reconciliation thresholds, and rollback playbooks. Per-module overrides require an explicit note in the module's playbook.

The four stages

Stage 0 — Prep: schema + service + reconciliation tool ready in new system; no traffic yet. Feature flag migration.<module>.mode = off.
Stage 1 — Shadow Read: read traffic served from old; new system queried in parallel; diffs logged. Exit: < 0.1% diff rate × 7 days.
Stage 2 — Double Write: writes go to both systems; hourly reconciliation. Exit: < 0.01% diff × 14 days.
Stage 3 — Cutover: traffic to new, old stays hot via reverse-sync, rollback in < 5 min. Exit: KPIs steady × 14 days.
Stage 4 — Retire: old code frozen, tables read-only, 30-day observation (financial: through one monthly close; tax-regulated: 7-year retention).
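As a minimal sketch of how the quantitative exit criteria for Stages 1 and 2 could be checked, the Python below gates a stage on a list of daily diff rates. It assumes that "× 7 days" means the most recent 7 consecutive days each fall under the threshold; the ADR does not spell this out. The `Mode` values other than `off`, and the names `ExitCriterion`, `DEFAULT_EXIT`, and `may_exit`, are hypothetical and are not part of the protocol.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    # Hypothetical values for migration.<module>.mode; only "off" is named in this ADR.
    OFF = "off"
    SHADOW_READ = "shadow_read"
    DOUBLE_WRITE = "double_write"
    CUTOVER = "cutover"
    RETIRE = "retire"


@dataclass(frozen=True)
class ExitCriterion:
    max_diff_rate: float  # expressed as a fraction: 0.1% is 0.001
    min_days: int


# Defaults taken from the stage list (Stage 3 exits on KPIs, so it is not modeled here).
DEFAULT_EXIT = {
    Mode.SHADOW_READ: ExitCriterion(max_diff_rate=0.001, min_days=7),
    Mode.DOUBLE_WRITE: ExitCriterion(max_diff_rate=0.0001, min_days=14),
}


def may_exit(criterion: ExitCriterion, daily_diff_rates: list[float]) -> bool:
    """True if the most recent min_days days are all strictly below the threshold."""
    if len(daily_diff_rates) < criterion.min_days:
        return False
    window = daily_diff_rates[-criterion.min_days:]
    return all(rate < criterion.max_diff_rate for rate in window)


week = [0.0004] * 7  # seven days at a 0.04% diff rate
print(may_exit(DEFAULT_EXIT[Mode.SHADOW_READ], week))   # True
print(may_exit(DEFAULT_EXIT[Mode.DOUBLE_WRITE], week))  # False: needs 14 days under 0.01%
```

Keeping the thresholds in one table means a module playbook can swap in a different `ExitCriterion` without touching the gating logic.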

Module overrides

Per-module playbook may override defaults:

Financial / regulated: tighten reconciliation tolerance 10× (Stage 2 exit becomes < 0.001%); mandatory human-in-the-loop (HITL) review on all writes; Stage 4 waits a full business cycle (for example, one monthly close).
High-volume, low-criticality: Stage 1 may shorten to 3 days if diff is zero; Stage 2 skippable if upstream is idempotent.
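One way to keep overrides explicit is to derive each module's policy from the shared defaults rather than hand-editing numbers. The sketch below is illustrative only: `StagePolicy`, `financial_override`, and `high_volume_override` are hypothetical names, and it assumes the 10× tightening applies to the Stage 2 threshold, since 0.01% divided by 10 matches the "< 0.001%" figure. The Stage 4 waiting rule is not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StagePolicy:
    """Default thresholds from the stage list; diff rates are fractions (0.1% is 0.001)."""
    stage1_max_diff: float = 0.001
    stage1_days: int = 7
    stage2_max_diff: float = 0.0001
    stage2_days: int = 14
    hitl_on_writes: bool = False
    skip_stage2: bool = False


def financial_override(policy: StagePolicy) -> StagePolicy:
    # Assumption: only the Stage 2 reconciliation threshold is tightened.
    return replace(policy, stage2_max_diff=0.00001, hitl_on_writes=True)


def high_volume_override(
    policy: StagePolicy, *, stage1_diff_is_zero: bool, upstream_idempotent: bool
) -> StagePolicy:
    # Stage 1 shortens to 3 days only when the observed diff is zero;
    # Stage 2 is skippable only when the upstream is idempotent.
    return replace(
        policy,
        stage1_days=3 if stage1_diff_is_zero else policy.stage1_days,
        skip_stage2=upstream_idempotent,
    )


base = StagePolicy()
ledger = financial_override(base)
clicks = high_volume_override(base, stage1_diff_is_zero=True, upstream_idempotent=False)
print(ledger.hitl_on_writes)  # True
print(clicks.stage1_days)     # 3
print(clicks.skip_stage2)     # False
```

Because the dataclass is frozen and `replace` returns a copy, the shared defaults cannot be mutated by a single module's override.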

Consequences

Positive

Same vocabulary across modules; on-call and review are reusable.
Rollback is rehearsed per stage, not improvised during incidents.
Stakeholders can track migration progress by stage and percentage complete.
Reconciliation job is a reusable pattern, not per-module invention.
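To illustrate what a reusable reconciliation step might look like, the sketch below computes a diff rate between snapshots of the old and new systems keyed by record ID. It assumes records can be compared by plain equality and that a record missing from either side counts as a diff; real modules may need field-level or tolerance-based comparison. The function name `diff_rate` is hypothetical.

```python
from typing import Hashable, Mapping


def diff_rate(old: Mapping[Hashable, object], new: Mapping[Hashable, object]) -> float:
    """Fraction of record IDs (union of both sides) that are missing or unequal."""
    keys = old.keys() | new.keys()
    if not keys:
        return 0.0
    missing = object()  # sentinel so an absent record never equals a present one
    mismatched = sum(1 for k in keys if old.get(k, missing) != new.get(k, missing))
    return mismatched / len(keys)


old_rows = {1: ("alice", 100), 2: ("bob", 250), 3: ("carol", 75)}
new_rows = {1: ("alice", 100), 2: ("bob", 205), 4: ("dave", 10)}
# IDs 2 (changed), 3 (missing in new), and 4 (missing in old) differ, out of 4 IDs.
print(diff_rate(old_rows, new_rows))  # 0.75
```

The returned fraction can be compared directly against a stage threshold such as 0.0001 for the Stage 2 default of 0.01%.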

Negative

Dual-write window costs 10–20% throughput on writes.
Per-module infrastructure cost increases ~30–50% during migration.
Stage-3 reverse-sync complexity is real; implementation takes planning.

Neutral

Some modules will finish in 6 weeks, others in 16. That's fine.
Legacy code can't be deleted on a predictable calendar; retirement depends on each module's business cycle.

Alternatives Considered

Alternative A: Shadow read → single big-bang cutover

Pros: simpler; no dual-write complexity.
Cons: write path unvalidated until cutover; cutover risk high for transactional modules.
Rejected: financial modules with heavy write load can't accept cutover risk.

Alternative B: Database-level CDC (e.g. Debezium)

Pros: application doesn't change.
Cons: only data-level; can't validate business-rule correctness on new side.
Rejected: the point of the migration is to improve the business-logic layer too; CDC alone doesn't cover it.

Alternative C: No protocol — team-by-team improvisation

Pros: maximum flexibility.
Cons: this is the failure mode that triggered this ADR (see Context).
Rejected.

References

Martin Fowler, "StranglerFigApplication"
Concepts: Strangler Fig
Templates: Module Playbook
