# Epic 7: Gap Analysis Against V10 Design Document

**Date:** 2026-03-12
**Scope:** Comparison of Epic 7 `design-spec.md` against `docs/designs/data-sync/V10.md`.
**Context:** `design-spec.md` is the only file in the Epic 7 folder (there is no `tech-spec.md` or `jira-ticket.md`). This analysis compares that single design spec against V10's orchestrator, merge, writer, error handling, and circular update prevention specifications.

---

## Overall Assessment

Epic 7 is the most architecturally comprehensive epic in the data-sync series. It correctly captures V10's core orchestrator concepts: five consumers (Merge Service, NetSuite Writer, SuiteX Writer, Reconciliation Service, Error/DLQ Handler), the three-way merge algorithm, per-field conflict policies driven by `field_metadata`, circular update prevention via six enforcement layers, UE/polling deduplication via fingerprinting, and the `current_state` materialized projection pattern.

The design spec is well aligned with V10 on these core concepts. The gaps fall into six areas: missing tech-spec and jira-ticket deliverables, the Feature Flag bypass switch for the NetSuite Writer, the coalescing/buffering window mechanics, error routing granularity, missing table DDLs, and several details that V10 prescribes but the design spec either omits or underspecifies.

---

## Gap 1: No Tech-Spec or Jira-Ticket (High — Completeness)

**V10 Reference:** All other epics (1, 2, 3, 4, 8) include a `tech-spec.md` with concrete code (classes, methods, DDL, directory layout) and a `jira-ticket.md` with structured deliverables and acceptance criteria.

**Current State:** Epic 7 has only a `design-spec.md`. Given the scope (five consumers, three-way merge, conflict resolution, circular update prevention, idempotency, governance throttling), this is by far the largest epic and needs implementation-level guidance the most.

**Impact:** Without a tech-spec, developers must infer class structures, file locations, database migrations, queue names, and service contracts from the design spec's prose. This increases the risk of inconsistencies with the namespace conventions established in prior epics (`src/Domain/Sync/`, `src/App/Services/Sync/`, `src/App/Jobs/Sync/`).

**Required Change:** Create `tech-spec.md` and `jira-ticket.md` covering at minimum:

1. **Service classes and directory layout:**
   - `src/App/Services/Sync/MergeService.php` — orchestrates the three-way merge
   - `src/App/Services/Sync/ConflictResolver.php` — per-field policy engine
   - `src/App/Services/Sync/WriteLedgerService.php` — circular update tracking
   - `src/App/Services/Sync/IdempotencyService.php` — key generation and enforcement
   - `src/App/Services/Sync/FingerprintService.php` — event deduplication
   - `src/App/Services/Sync/Writers/NetSuiteWriter.php` — governance-aware NetSuite writer
   - `src/App/Services/Sync/Writers/SuiteXWriter.php` — SuiteX tenant database writer

2. **Job classes:**
   - `src/App/Jobs/Sync/ProcessRawEvent.php` — consumes `events.raw`
   - `src/App/Jobs/Sync/ApplyMergedEvent.php` — consumes `events.merged`
   - `src/App/Jobs/Sync/ProcessErrorEvent.php` — consumes `events.error`

3. **Model classes:**
   - `src/Domain/Sync/Models/CurrentState.php`
   - `src/Domain/Sync/Models/Conflict.php`
   - `src/Domain/Sync/Models/WriteLedgerEntry.php`
   - `src/Domain/Sync/Models/IdempotencyKey.php`
   - `src/Domain/Sync/Models/RecordLock.php`

4. **Database migrations** for all tables listed in V10's storage schema alignment (see Gap 6).

5. **Jira-ticket** breaking down Stages 3-4 into implementable subtasks.

**Affected Files:** New files needed: `tech-spec.md`, `jira-ticket.md`.

---

## Gap 2: Missing Feature Flag Bypass Switch for NetSuite Writer (High)

**V10 Reference:** Lines 463-465:

> **Feature Flag / Bypass Switch:** The NetSuite writer consumer utilizes a feature flag matching the event's `accountId` against a `tenant_ids` enabled list.
> - **If Bypass is ON (Account not enabled):** Consumer reads the event, updates SuiteX `current_state` projection, and updates the `write_ledger` with `writeId` to maintain idempotency, but **intentionally skips** the outbound HTTP request to NetSuite.
> - **If Bypass is OFF (Account enabled):** Consumer executes the outbound NetSuite RESTlet call normally.

V10 changelog (line 3676): "Added logic requiring the NetSuite Writer consumer to check a progressive rollout feature flag."

**Current State:** Epic 7's design spec does not mention the Feature Flag bypass switch in the functional requirements or the NetSuite Writer description (requirement item 2, line 16). It is only briefly referenced in the Stage 6 deliverables section (line 158): "Feature flag bypass switch for progressive per-account rollout."

**Impact:** Without this, the NetSuite Writer has no mechanism for progressive rollout. Enabling the Orchestrator would immediately start writing to NetSuite for all tenants, which V10 explicitly requires to be controlled per-account. This is also critical for the Shadow Event migration strategy — shadow events from Epic 4 should update `current_state` without triggering NetSuite writes until the account is explicitly enabled.

**Required Change:** Add to the functional requirements section (between E and the Architectural Boundaries):

> **F. Feature Flag Bypass Switch (NetSuite Writer):**
>
> Before issuing any outbound NetSuite write, the NetSuite Writer must check the `shadow_sync` feature flag (or a dedicated `netsuite_writer_enabled` flag) against the event's `accountId`. If the account is not in the enabled list:
> - Update `current_state` projection
> - Update `write_ledger` with the `writeId`
> - ACK the Pub/Sub message
> - **Skip** the outbound HTTP request
>
> This allows safe state-building from shadow events during the Strangler Fig migration without creating duplicate NetSuite records.

Also move the bypass switch deliverable from Stage 6 to Stage 3 (it is needed before the writer can be deployed).
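
The bypass behavior can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the real writer would live in `NetSuiteWriter.php`. The class name, the in-memory dictionaries standing in for `current_state` and `write_ledger`, and the event field names are all assumptions made for illustration, not part of V10 or the design spec.

```python
from dataclasses import dataclass, field


@dataclass
class NetSuiteWriterSketch:
    """Hypothetical stand-in for the NetSuite Writer; every name is illustrative."""
    enabled_accounts: set                                # accounts with bypass OFF
    current_state: dict = field(default_factory=dict)    # stands in for the table
    write_ledger: dict = field(default_factory=dict)     # stands in for the table
    outbound_calls: list = field(default_factory=list)   # stands in for RESTlet calls

    def handle(self, event: dict) -> str:
        key = (event["accountId"], event["recordType"], event["recordId"])
        bypass = event["accountId"] not in self.enabled_accounts

        if not bypass:
            # Bypass OFF: issue the outbound NetSuite write (stubbed here).
            # A real writer would update the tables below only after success.
            self.outbound_calls.append(key)

        # Both paths update the projection and the ledger so that
        # idempotency is maintained even when the HTTP call is skipped.
        merged = {**self.current_state.get(key, {}), **event["changes"]}
        self.current_state[key] = merged
        self.write_ledger[key] = event["writeId"]

        # The caller ACKs the Pub/Sub message in both cases.
        return "skipped_outbound" if bypass else "written"
```

With `enabled_accounts={"A1"}`, an event for account `A2` updates both tables and returns `"skipped_outbound"` without recording an outbound call.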

**Affected Files:** `design-spec.md` (functional requirements, Stage 3 deliverables).

---

## Gap 3: Coalescing/Buffering Window Mechanics Not Specified (High)

**V10 Reference:** Lines 374-378:

> The merge service buffers a short time-window of UE and polling events per record and coalesces them based on:
> - `lastModifiedDate` (primary)
> - `internalId` (tiebreaker for same-second modifications)
> Within this window, SuiteX applies the latest NetSuite state once, rather than each intermediate snapshot.

Appendix F1 (lines 986-1001) — pseudocode shows `readPendingChanges(key)`, `writePendingChanges(key, merged)`, and a `shouldFlush(key)` condition.

**Current State:** Epic 7 mentions "coalescing events within a short time window" (line 15) but does not specify:
- How long the window is (e.g., 5-10 seconds as implied by V10)
- Where pending changes are buffered (in-memory, Redis, or database)
- What the flush conditions are (idle time, count threshold, explicit flag)
- How the buffer interacts with Pub/Sub message ACK deadlines

This is critical because the spec also says workers must be "entirely stateless" (line 79). A coalescing buffer held in-memory contradicts that constraint. V10 line 452 says "in-memory or Redis-backed buffer" — the spec should clarify which approach.

**Impact:** Without specifying the buffering mechanism, the implementation may either skip coalescing entirely (processing every event individually, increasing NetSuite API load) or implement an in-memory buffer that violates the statelessness requirement and loses events on worker restart.

**Required Change:** Add a dedicated subsection under Functional Requirements:

> **Coalescing Window:**
> - Buffer implementation: Redis-backed (not in-memory) to maintain statelessness
> - Key: `(accountId, recordType, recordId)`
> - Window duration: configurable, default 5 seconds
> - Flush conditions: (a) window timer elapses with no new events, (b) event count exceeds threshold (e.g., 10), (c) explicit flush flag on event
> - On flush: merge all buffered events for the key, emit a single `MergedEvent`
> - Pub/Sub ACK: extend deadline while buffering; ACK all constituent messages on flush
> - On worker crash: Redis buffer persists; another worker picks up and flushes
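
The flush conditions and coalescing order proposed above can be sketched as follows. This is a Python illustration only: a plain dictionary stands in for Redis, the `flush` flag and the `changes` field are assumed names, and `lastModifiedDate` is assumed to be an ISO-8601 string in a single format so that string ordering matches time ordering. Pub/Sub ACK deadline extension and crash recovery are not modeled.

```python
import time


class CoalescingBuffer:
    """Per-record buffer; a dict stands in for Redis in this sketch."""

    def __init__(self, window_seconds=5.0, max_events=10, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self.clock = clock
        self._pending = {}    # key -> list of buffered events
        self._last_seen = {}  # key -> arrival time of the most recent event

    def add(self, event):
        key = (event["accountId"], event["recordType"], event["recordId"])
        self._pending.setdefault(key, []).append(event)
        self._last_seen[key] = self.clock()
        return key

    def should_flush(self, key):
        events = self._pending.get(key, [])
        if not events:
            return False
        idle = self.clock() - self._last_seen[key] >= self.window_seconds  # (a)
        too_many = len(events) > self.max_events                          # (b)
        explicit = any(e.get("flush") for e in events)                    # (c)
        return idle or too_many or explicit

    def flush(self, key):
        events = self._pending.pop(key, [])
        self._last_seen.pop(key, None)
        # Oldest first: lastModifiedDate is primary, internalId breaks ties,
        # so the latest snapshot overwrites earlier ones field by field.
        events.sort(key=lambda e: (e["lastModifiedDate"], e["internalId"]))
        merged = {}
        for e in events:
            merged.update(e["changes"])
        return {"key": key, "changes": merged, "eventCount": len(events)}
```

Injecting `clock` keeps the window logic testable without sleeping. In a Redis-backed version the same three checks would run against a list and a timestamp stored under the record key.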

**Affected Files:** `design-spec.md` (functional requirements).

---

## Gap 4: Error Routing Granularity Missing (Medium)

**V10 Reference:** V10 Appendix O6 (lines 2385-2397) defines a granular retry policy table:

| Error Class | Max Retries | Backoff | Queue |
|---|---|---|---|
| Transient | 10 | Exponential (1s to 5m) | Retry queue |
| 429 Rate Limit | 5 | 60s, 5m, 15m, 1h, 4h | Throttle queue |
| Governance Exceeded | 1 | Wait until next day | Governance queue |
| Validation Error | 0 | None | Error queue (human review) |
| Auth Error | 3 | 5m (allow token refresh) | Auth error queue |
| Conflict | 0 | None | Conflict queue (human review) |

V10 also specifies (lines 2394-2397) that after max retries, events move to the dead-letter queue.

**Current State:** Epic 7 describes error classes (line 46: "If validation fails, route immediately to `events.dlq`") and the five consumers (including Error/DLQ Handler), but does not specify:
- The per-error-class retry policy table
- The distinction between different queue types (retry, throttle, governance, auth error)
- Max retry counts per class
- Backoff schedules
- The transition from retry queues to DLQ after exhaustion

**Impact:** Without this, the Error/DLQ Handler has no specification for how to triage errors. Developers might implement a single retry-or-DLQ binary decision instead of the nuanced six-class routing V10 requires.

**Required Change:** Add V10's retry policy table verbatim to the functional requirements, and specify which Pub/Sub topic each queue maps to:
- `events.error` for retry-able errors (Transient, 429, Governance, Auth)
- `events.dlq` for exhausted retries and permanent errors
- `conflicts` table for Conflict errors (not a Pub/Sub topic)
- `sync_error_queue` table for Validation errors requiring human review
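
The routing above can be expressed as a small lookup, sketched here in Python. The retry counts come from the V10 table quoted in this gap, and the destinations come from the mapping proposed above. The error-class keys and the function name are assumptions for illustration. Backoff schedules are omitted.

```python
# Max retries per error class, taken from the V10 Appendix O6 table quoted above.
# The dictionary keys are illustrative names, not identifiers from V10.
MAX_RETRIES = {
    "transient": 10,
    "rate_limit_429": 5,
    "governance_exceeded": 1,
    "validation": 0,
    "auth": 3,
    "conflict": 0,
}


def route(error_class: str, attempts_so_far: int) -> str:
    """Return the destination for a failed event under the proposed mapping."""
    if error_class == "conflict":
        return "conflicts table"          # human review, not a Pub/Sub topic
    if error_class == "validation":
        return "sync_error_queue table"   # human review, never retried
    if attempts_so_far >= MAX_RETRIES[error_class]:
        return "events.dlq"               # retries exhausted
    return "events.error"                 # still retryable
```

For example, `route("transient", 0)` yields `events.error`, while `route("governance_exceeded", 1)` yields `events.dlq` because the single allowed retry has been used.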

**Affected Files:** `design-spec.md` (functional requirements section E, or a new dedicated section).

---

## Gap 5: `current_state` Table DDL and Versioning Semantics Not Defined (Medium)

**V10 Reference:** Line 415: "`current_state`: Authoritative projection per `(account_id, record_type, record_id)` with `version`, `state`, and `last_modified`." Line 57: "Maintain `current_state` projection per record (versioned). This is authoritative for the sync service."

**Current State:** Epic 7 references `current_state` extensively but provides no DDL, no versioning semantics (how version increments, what happens on concurrent updates), and no description of the `state` column format (full record snapshot? field map?).

**Impact:** This is the most critical table in the entire sync architecture — the three-way merge reads Base from it, and all writers update it. Without a spec, implementations may diverge on schema, version type (integer vs bigint vs timestamp), and concurrency handling.

**Required Change:** Define the `current_state` DDL:

```sql
CREATE TABLE current_state (
    account_id BIGINT UNSIGNED NOT NULL,
    record_type VARCHAR(100) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    version BIGINT UNSIGNED NOT NULL DEFAULT 1,
    state JSON NOT NULL COMMENT 'Full current field-value map for this record',
    last_modified TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    last_event_id CHAR(36) NULL COMMENT 'UUID of the last event that updated this projection',
    PRIMARY KEY (account_id, record_type, record_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```

Versioning semantics:
- `version` is a monotonically increasing bigint, incremented by 1 on every successful merge apply
- Concurrent writers use `WHERE version = ?` for optimistic locking
- `state` is a full JSON map of the record's current field values (not just deltas)
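
The optimistic locking rule can be demonstrated with a minimal sketch. It uses Python's built-in `sqlite3` module with a simplified copy of the table so that it runs anywhere. The function name `apply_merge` is an assumption, and the production query would target the MySQL table defined above.

```python
import json
import sqlite3


def apply_merge(conn, key, expected_version, new_state):
    """Return True if the projection was updated, False if another writer won."""
    cur = conn.execute(
        "UPDATE current_state SET state = ?, version = version + 1 "
        "WHERE account_id = ? AND record_type = ? AND record_id = ? "
        "AND version = ?",
        (json.dumps(new_state), *key, expected_version),
    )
    conn.commit()
    # Zero affected rows means the version moved on since it was read.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE current_state ("
    "account_id INTEGER, record_type TEXT, record_id TEXT, "
    "version INTEGER NOT NULL DEFAULT 1, state TEXT NOT NULL, "
    "PRIMARY KEY (account_id, record_type, record_id))"
)
conn.execute(
    "INSERT INTO current_state (account_id, record_type, record_id, state) "
    "VALUES (1, 'customer', '42', '{}')"
)

key = (1, "customer", "42")
first = apply_merge(conn, key, 1, {"name": "Acme"})   # succeeds, version -> 2
stale = apply_merge(conn, key, 1, {"name": "Stale"})  # rejected, version is 2
```

A writer that receives `False` should re-read the row, redo the merge against the new Base, and retry.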

**Affected Files:** `design-spec.md` (or new `tech-spec.md`).

---

## Gap 6: Missing DDL for `write_ledger`, `idempotency_keys`, and `record_lock` Tables (Medium)

**V10 Reference:** Lines 420-423 define these tables as part of the storage schema alignment. The `write_ledger` (line 423) records `(account_id, record_type, record_id, field)` with `write_id`, `source_system`, `write_timestamp`. The `idempotency_keys` (line 420) stores `(target, idempotency_key, event_id, first_seen_at, last_seen_at, status, last_result)`. The `record_lock` (line 421) stores record-level locks.

**Current State:** Epic 7 references all three tables by name and describes their purpose, but provides no DDL. V10's Appendix D ERD (lines 956-962) shows `record_lock` with columns: `record_type`, `record_id`, `locked`, `locked_at`, `reason`.

**Impact:** Without DDL, developers must infer column types, indexes, and constraints. This is especially risky for `write_ledger`, which is queried on every incoming event for circular update suppression — it needs proper indexing for performance.

**Required Change:** Define DDLs (in tech-spec when created):

```sql
CREATE TABLE write_ledger (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    account_id BIGINT UNSIGNED NOT NULL,
    record_type VARCHAR(100) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    field VARCHAR(255) NOT NULL,
    last_write_id CHAR(36) NOT NULL,
    last_write_source VARCHAR(50) NOT NULL,
    last_write_timestamp TIMESTAMP(6) NOT NULL,
    UNIQUE KEY uk_record_field (account_id, record_type, record_id, field),
    INDEX idx_write_id (last_write_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE idempotency_keys (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    target VARCHAR(50) NOT NULL COMMENT 'suitex or netsuite',
    idempotency_key VARCHAR(500) NOT NULL,
    event_id CHAR(36) NOT NULL,
    first_seen_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    last_seen_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    status ENUM('pending', 'completed', 'failed_permanent') NOT NULL DEFAULT 'pending',
    last_result JSON NULL,
    UNIQUE KEY uk_target_key (target, idempotency_key),
    INDEX idx_event_id (event_id),
    INDEX idx_first_seen (first_seen_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE record_lock (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    account_id BIGINT UNSIGNED NOT NULL,
    record_type VARCHAR(100) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    locked BOOLEAN NOT NULL DEFAULT TRUE,
    locked_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    reason VARCHAR(255) NOT NULL,
    conflict_id CHAR(36) NULL COMMENT 'FK to conflicts table if lock is due to conflict',
    UNIQUE KEY uk_record (account_id, record_type, record_id),
    INDEX idx_locked (locked)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```

Also specify retention policy for `idempotency_keys`: minimum 30 days per V10 line 318.
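
To show why `write_ledger` is keyed per field, here is a minimal Python sketch of the circular update check. It assumes that an incoming event carries the `writeId` of the write that produced it, and that an echo is detected by comparing that value with `last_write_id`. V10's actual matching rule may differ. The function name and event shape are illustrative.

```python
def split_echoed_fields(event, ledger):
    """Split an event's changes into (fresh, echoed) using the write ledger.

    `ledger` maps (account_id, record_type, record_id, field) to a row dict,
    mirroring the unique key uk_record_field in the DDL above.
    """
    fresh, echoed = {}, {}
    for field_name, value in event["changes"].items():
        row = ledger.get(
            (event["accountId"], event["recordType"], event["recordId"], field_name)
        )
        # Assumption: an echo carries the same writeId the ledger recorded.
        if row is not None and row["last_write_id"] == event.get("writeId"):
            echoed[field_name] = value   # our own write coming back: suppress
        else:
            fresh[field_name] = value    # genuine change: keep processing
    return fresh, echoed
```

Because this lookup runs once per changed field on every incoming event, the unique index on `(account_id, record_type, record_id, field)` is what keeps it cheap.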

**Affected Files:** New `tech-spec.md`.

---

## Gap 7: Merge Service Described as Consuming `events.raw` But Also Fetching `RemoteCurrent` (Medium — Architectural Clarity)

**V10 Reference:** V10's architecture separates concerns between consumers:
- **Merge Service** (line 976): "normalizes and produces canonical merged events" — reads `events.raw`, coalesces, deduplicates, publishes to `events.merged`
- **NetSuite Writer** (line 977): "governance-aware writing" — reads `events.merged`, fetches RemoteCurrent from NetSuite, applies merge, writes
- **SuiteX Writer** (line 978): "apply NS → SX updates" — reads `events.merged`, applies to SuiteX DB

The Merge Orchestrator pseudocode (Appendix F1, lines 986-1001) shows the Merge Service doing coalescing and emitting `MergedEvent`, while the NetSuite Writer (Appendix F2, lines 1006-1019) fetches remote state and applies.

**Current State:** Epic 7's merge algorithm description (lines 23-28) places the three-way merge (including fetching `RemoteCurrent` from the target system) inside the Merge Service. But V10's architecture suggests the Merge Service handles normalization, deduplication, and coalescing, while the actual three-way merge against RemoteCurrent happens in the writers.

This is a significant architectural question: does the three-way merge happen in the Merge Service or in the Writers?

**Impact:** If the Merge Service fetches RemoteCurrent from NetSuite for every event, it becomes tightly coupled to NetSuite's API and subject to governance limits, defeating the purpose of separating merge from write. If the Writers do the merge, the Merge Service can remain a lightweight normalization/coalescing layer.

**Required Change:** Clarify the responsibility split. The recommended architecture (aligned with V10's consumer map and pseudocode) is:

1. **Merge Service** (`events.raw` → `events.merged`): Validate, record to `events` table, deduplicate (fingerprint), coalesce (buffer window), suppress circular updates (write ledger), and emit a normalized `MergedEvent` with the changes. Does NOT fetch RemoteCurrent.

2. **NetSuite Writer** (`events.merged`): For SuiteX-originated events requiring NetSuite write — fetch RemoteCurrent from NetSuite, perform the three-way merge against `current_state` Base, apply conflict policies, and write to NetSuite if merge succeeds.

3. **SuiteX Writer** (`events.merged`): For NetSuite-originated events requiring SuiteX update — apply changes to SuiteX tenant DB and update `current_state`.

Document this split explicitly in the functional requirements.
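
For reference, the three-way merge that the NetSuite Writer would perform under this split can be sketched per field as follows. This is a generic Python illustration, not V10's pseudocode. Here Base is the `current_state` projection, Local is the SuiteX-originated change, and Remote is RemoteCurrent fetched from NetSuite. The policy names `suitex_wins`, `netsuite_wins`, and `manual` are hypothetical placeholders for whatever values `field_metadata` defines, and a missing field is treated as `None`.

```python
def three_way_merge(base, local, remote, policies, default_policy="manual"):
    """Merge field maps; return (merged_state, sorted list of conflicted fields)."""
    merged = dict(remote)   # start from what NetSuite currently holds
    conflicts = []
    for name in set(base) | set(local) | set(remote):
        b, l, r = base.get(name), local.get(name), remote.get(name)
        if l == b:
            continue                  # no local change: keep the remote value
        if r == b or l == r:
            merged[name] = l          # only local changed, or both sides agree
            continue
        # Both sides changed the field to different values: apply the policy.
        policy = policies.get(name, default_policy)
        if policy == "suitex_wins":
            merged[name] = l
        elif policy == "netsuite_wins":
            merged[name] = r
        else:
            conflicts.append(name)    # left at the remote value pending review
    return merged, sorted(conflicts)
```

A non-empty conflict list is the point at which the writer would create a `conflicts` row and a `record_lock` entry instead of writing to NetSuite.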

**Affected Files:** `design-spec.md` (architectural solution and functional requirements).

---

## Gap 8: No Escalation Policy for Unresolved Conflicts (Medium)

**V10 Reference:** Line 537:

> If a conflict remains unresolved beyond the human SLA window (e.g., 24 hours), escalate to a higher authority and optionally pause automated writes for that account or mark the field(s) read-only until resolved.

Lines 540-543 describe handling simultaneous changes during conflict resolution, including granular field-level locking.

**Current State:** Epic 7 mentions conflict queuing (Scenario 5, line 125) and record locking but does not specify:
- Conflict SLA window (how long before escalation)
- Escalation actions (notify higher authority, pause writes, mark fields read-only)
- Behavior when new events arrive for a locked record (V10 says accept and append, but block auto-apply)
- Granular field-level locking option (V10 line 543: "auto-resolve low-risk fields while blocking only high-risk fields")

**Required Change:** Add to Functional Requirements:

> **Conflict Escalation Policy:**
> - Conflicts unresolved after 24 hours (configurable SLA) trigger escalation notification
> - Escalation options: notify senior admin, pause all automated writes for the account, or mark conflicted fields as read-only
> - While a record is locked: new events are accepted into `events` table and appended to the event stream, but blocked from automated apply until conflict is resolved
> - Optional granular field-level locking: auto-resolve low-risk fields while blocking only the conflicted high-risk fields
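
The SLA check and the field-level locking option can be sketched in a few lines of Python. The conflict row shape (`id`, `created_at`, `resolved_at`) and both function names are assumptions for illustration. The 24-hour default mirrors the example in V10.

```python
from datetime import datetime, timedelta, timezone


def conflicts_to_escalate(conflicts, now, sla=timedelta(hours=24)):
    """Return ids of unresolved conflicts that are older than the SLA window."""
    return [
        c["id"]
        for c in conflicts
        if c["resolved_at"] is None and now - c["created_at"] > sla
    ]


def split_by_field_lock(changes, locked_fields):
    """Field-level locking: auto-apply unlocked fields, hold back locked ones."""
    auto = {f: v for f, v in changes.items() if f not in locked_fields}
    held = {f: v for f, v in changes.items() if f in locked_fields}
    return auto, held
```

A scheduled job could call `conflicts_to_escalate` with the current UTC time and pass the result to whichever notification or pause action is configured.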

**Affected Files:** `design-spec.md` (functional requirements or acceptance criteria).

---

## Gap 9: Reconciliation Service Underspecified (Medium)

**V10 Reference:** Lines 546-548:

> Periodic full reconciliation job per account or record type (e.g., nightly) that diffs authoritative NetSuite state vs `current_state` projection and emits reconciliation events.
> Provide a customer-initiated resync process and an admin backfill tool to repair projection mismatches.

V10 lines 355-357: Reconciliation events have `source = reconciliation` and have **highest priority** when they indicate drift.

**Current State:** Epic 7 describes the Reconciliation Service as consumer #4 (line 18): "runs in background, detects drift between `current_state` and actual NetSuite state, emits corrective events." This is accurate but highly compressed. The spec provides:
- No schedule (nightly? hourly? configurable?)
- No mechanism (full scan? sampling? delta?)
- No scope definition (per account? per record type? configurable?)
- No trigger mechanism (automatic + manual resync)
- No description of what a "corrective event" looks like (full snapshot? delta?)
- No mention of the `source = reconciliation` priority override

**Required Change:** Expand the Reconciliation Service description:

> **Reconciliation Service:**
> - Schedule: Nightly per account, per record type (configurable via `sync_watermark`)
> - Mechanism: Fetch full record list from NetSuite via polling (respecting governance), diff against `current_state` projections
> - Output: Emit `ChangeEvent` with `source = 'reconciliation'` and `operation = 'update'` for each drifted record
> - Priority: Reconciliation events override prior assumptions when drift is detected; still respect `field_metadata.conflict_policy` for critical fields
> - Manual trigger: Provide admin API endpoint to initiate resync for a specific account/record type
> - Backfill: Support bulk re-import of historical NetSuite state into `current_state`
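
The diff step can be illustrated with a minimal Python sketch. It assumes the NetSuite records have already been fetched into a dictionary keyed by record id and that the corrective event carries only the drifted fields. Whether V10 expects a delta or a full snapshot is one of the open questions listed above. Records that exist in `current_state` but not in NetSuite are not handled here.

```python
def reconcile(netsuite_records, current_state, account_id, record_type):
    """Emit one corrective event per record whose projection has drifted."""
    events = []
    for record_id, ns_fields in netsuite_records.items():
        projected = current_state.get((account_id, record_type, record_id), {})
        # Fields whose NetSuite value differs from the projected value.
        drift = {f: v for f, v in ns_fields.items() if projected.get(f) != v}
        if drift:
            events.append({
                "accountId": account_id,
                "recordType": record_type,
                "recordId": record_id,
                "source": "reconciliation",
                "operation": "update",
                "changes": drift,
            })
    return events
```

Records with no drift produce no event, so a clean nightly run publishes nothing to `events.raw`.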

**Affected Files:** `design-spec.md` (functional requirements or a new subsection).

---

## Gap 10: `operation` vs `eventType` Terminology (Low)

**V10 Reference:** Per the Epic 1 gap analysis, V10's canonical schema uses `operation` (not `eventType`). V10 Appendix A4 (line 672): `"operation": "create" | "update" | "delete"`.

**Current State:** Epic 7 consistently uses `eventType` in its prose (not in code, since there's no tech-spec). This is not a functional gap since the design spec is descriptive prose, but it creates inconsistency with the corrected canonical schema from Epic 1's gap analysis.

**Required Change:** When the tech-spec is created, ensure all references use `operation` (not `eventType`) to match the corrected canonical envelope. Optionally update the design-spec prose for consistency.

**Affected Files:** `design-spec.md` (prose references).

---

## Gap 11: Stage Mapping Differs from V10 (Low — Organizational)

**V10 Reference:** V10 defines stages 0 through 6. Stages 1-2 are infrastructure (event model, projections). Stage 3 is the orchestrator + merge. Stage 4 is backpressure/batching. Stage 5 is Conflict UI. Stage 6 is observability/cutover.

**Current State:** Epic 7's "Deliverables (Staged per V10)" section (lines 127-159) maps its deliverables to these stages, which is good. However:
- Stage 1 includes "Producers: UE emitter, Polling emitter, SuiteX emitter" — but these are covered by other epics (Epic 4 for SuiteX, Epic 5/6 for NetSuite). Listing them here creates ambiguity about ownership.
- Stage 5 (Conflict UI) overlaps entirely with Epic 8's gap analysis Gap 1. It's unclear whether Epic 7 or Epic 8 owns this deliverable.
- Stage 6 includes "Shadow event emission for legacy API coexistence" which is Epic 4's responsibility.

**Impact:** Minor organizational confusion. No functional impact, but could lead to duplicated work or dropped deliverables.

**Required Change:** Add a cross-reference note clarifying epic ownership:

> **Note on Stage Ownership:**
> - Stage 1 producers: SuiteX emitter (Epic 4), UE emitter (Epic 5), Polling emitter (Epic 6). Epic 7 is responsible for the Merge Service consumer, not the producers.
> - Stage 5 Conflict UI: Frontend owned by Epic 8; backend APIs and conflict table management owned by Epic 7.
> - Stage 6 shadow events: Owned by Epic 4. Epic 7 provides the Feature Flag bypass switch that shadow events depend on.

**Affected Files:** `design-spec.md` (Deliverables section).

---

## Gap 12: No Mention of `sync_error_queue` Table (Low)

**V10 Reference:** Line 422: "`sync_error_queue`: stores errors requiring human intervention, linked to `record_lock` entries and conflict records."

**Current State:** Epic 7 describes the Error/DLQ Handler (consumer #5) but does not reference `sync_error_queue` by name. It mentions routing to `events.error` and `events.dlq` topics but not the persistent database table that stores errors for the DLQ dashboard (Epic 8).

**Impact:** Without referencing `sync_error_queue`, the Error/DLQ Handler implementation might only interact with Pub/Sub topics and not persist errors to the database table that Epic 8's dashboard reads from.

**Required Change:** Add to Functional Requirements:

> When routing to error or DLQ, the Error Handler must also persist the error to the `sync_error_queue` table (see Epic 8) with: `tenant_id`, `record_type`, `record_id`, `error_class`, `reason`, `details` (including the full canonical payload), and `status`. If the error requires record-level locking, create a corresponding `record_lock` entry and link it via `record_lock_id`.

**Affected Files:** `design-spec.md` (functional requirements section E or Error/DLQ Handler description).

---

## Summary of Required Changes

| Gap | Severity | Change Type |
|-----|----------|-------------|
| 1. No tech-spec or jira-ticket | **High** | Create new files with class structures, DDL, directory layout |
| 2. Missing Feature Flag bypass switch | **High** | Add to functional requirements + move to Stage 3 |
| 3. Coalescing window not specified | **High** | Define buffer mechanism, duration, flush conditions |
| 4. Error routing granularity missing | **Medium** | Add V10's retry policy table and queue mapping |
| 5. `current_state` DDL missing | **Medium** | Define DDL with versioning semantics |
| 6. `write_ledger`, `idempotency_keys`, `record_lock` DDL missing | **Medium** | Define DDLs with indexes and retention |
| 7. Merge vs Writer responsibility split unclear | **Medium** | Clarify which component fetches RemoteCurrent |
| 8. No escalation policy for unresolved conflicts | **Medium** | Add SLA window, escalation actions, field-level locking |
| 9. Reconciliation Service underspecified | **Medium** | Add schedule, mechanism, priority, manual trigger |
| 10. `operation` vs `eventType` terminology | **Low** | Align with corrected canonical schema |
| 11. Stage mapping ownership ambiguity | **Low** | Add cross-reference notes for epic ownership |
| 12. Missing `sync_error_queue` reference | **Low** | Add error persistence to database table |

---

## Architectural Recommendation: Merge vs Writer Responsibility Split

Given the significance of Gap 7, here is the recommended architecture aligned with V10's consumer map:

```
events.raw
    │
    ▼
┌──────────────────────┐
│    Merge Service      │  Subscribes to: events.raw
│                       │  Responsibilities:
│  1. Validate schema   │  - JSON Schema validation
│  2. Persist to events │  - Immutable audit write
│  3. Record fingerprint│  - event_audit_log
│  4. Deduplicate       │  - Fingerprint-based UE/poll arbitration
│  5. Check write_ledger│  - Circular update suppression
│  6. Coalesce (buffer) │  - Redis-backed per-record buffer
│  7. Emit MergedEvent  │  - Publish to events.merged
│                       │
│  Does NOT fetch       │
│  RemoteCurrent        │
└──────────┬────────────┘
           │
           ▼
      events.merged
      ┌────┴────┐
      │         │
      ▼         ▼
┌───────────┐ ┌───────────┐
│  NetSuite │ │  SuiteX   │
│  Writer   │ │  Writer   │
│           │ │           │
│ Fetch     │ │ Apply to  │
│ Remote    │ │ tenant DB │
│ Current   │ │           │
│ 3-way     │ │ Update    │
│ merge     │ │ current_  │
│ Write to  │ │ state     │
│ NetSuite  │ │           │
│ Update    │ │ Update    │
│ current_  │ │ write_    │
│ state     │ │ ledger    │
└───────────┘ └───────────┘
```

This keeps the Merge Service lightweight and decoupled from NetSuite API governance, while the Writers handle the expensive remote state fetching and conflict resolution. The three-way merge logic is shared (via `ConflictResolver` service) but invoked by the Writers, not the Merge Service.
