# Epic 2: Gap Analysis Against V10 Design Document

**Date:** 2026-03-12
**Scope:** Comparison of Epic 2 files (`design-spec.md`, `jira-ticket.md`) against `docs/designs/data-sync/V10.md`.

---

## Overall Assessment

Epic 2 correctly identifies the need for a dedicated Cloud SQL instance to isolate the event store from the tenant database, accurately selects Google Cloud Pub/Sub as the backbone, and establishes the right write-isolation and immutability constraints. The CQRS separation between the web application (publish only) and background workers (write to event store) is consistent with V10's architecture.

However, the event store schema has significant column-level mismatches against V10's `events` table DDL, the Pub/Sub topic list is missing one topic, environment isolation for topics is absent, a companion table (`event_audit_log`) required for deduplication is missing entirely, and the Eloquent model placement violates the project's domain-driven directory structure.

---

## Gap 1: Missing `events.error` Pub/Sub Topic (High)

**V10 Reference:** Appendix E (lines 965-972) defines four topics:
- `events.raw` — UE, polling, SuiteX producers
- `events.merged` — output of merge service
- `events.error` — retry-able errors
- `events.dlq` — dead-letter

**Current State:** Epic 2 provisions only three topics: `events.raw`, `events.merged`, `events.dlq`.

**Impact:** V10 distinguishes between retry-able errors (`events.error`) and permanent dead-letter errors (`events.dlq`). Without `events.error`, transient failures (429s, timeouts, 5xx responses) would be lumped into the DLQ alongside permanent failures (invalid payloads, auth errors). This breaks V10's error classification taxonomy (Appendix G) and the retry policy (Appendix O6), which routes different error classes to different queues with different retry strategies.

**Required Change:** Add `events.error` as a fourth topic with its own subscription. Configure `events.error` with retry semantics (exponential backoff, max attempts before forwarding to `events.dlq`). The DLQ should only receive events that have exhausted all retry attempts or are classified as permanent errors.

**Affected Files:** `design-spec.md` (section A), `jira-ticket.md` (Pub/Sub Topics).

---

## Gap 2: Event Store Schema Column Mismatches (Critical)

**V10 Reference:** The `events` table DDL (lines 240-251):

```sql
CREATE TABLE events (
  id uuid primary key,
  account_id text,
  record_type text,
  record_id text,
  source text,
  timestamp timestamptz,
  base_version bigint,
  changes jsonb,
  full_snapshot_ref text,
  processed boolean default false
);
```

**Current State:** Epic 2 defines the following columns:
- `id` (auto-increment)
- `account_id` (string)
- `event_id` (UUID unique)
- `aggregate_type` (string)
- `aggregate_id` (string)
- `payload` (JSON)
- `created_at` (timestamp)

**Issues:**

### 2a. Column naming: `aggregate_type`/`aggregate_id` vs `record_type`/`record_id`

Epic 2 uses DDD terminology (`aggregate_type`, `aggregate_id`). V10 consistently uses `record_type` and `record_id` across the entire document -- the canonical envelope, the events table, the current_state table, the conflicts table, the event_audit_log, the sync_watermark, and the field_metadata table. Using different column names creates a mapping disconnect between the database schema and every other component in the system.

**Required Change:** Rename to `record_type` and `record_id`.

### 2b. Missing columns

Epic 2 omits six columns that V10 requires:

| V10 Column | Type | Purpose |
|---|---|---|
| `source` | text | Which producer emitted the event (`netsuite-ue`, `netsuite-poll`, `suitex`, `reconciliation`). Required for UE vs polling arbitration and source-priority rules. |
| `timestamp` | timestamp | When the change occurred in the source system. Distinct from `created_at` (when the row was inserted into the event store). V10 uses this for time-based ordering and coalescing. |
| `base_version` | bigint | The `current_state` projection version the event was derived from. Required for three-way merge base resolution. |
| `changes` | JSON | Field-level deltas. Epic 2 uses `payload` as a single blob, which conflates deltas with full snapshots and makes field-level querying difficult. |
| `full_snapshot_ref` | text | GCS URI pointing to full state backup. Stored separately from `changes` so the merge service can distinguish delta events from snapshot fallbacks. |
| `processed` | boolean | Whether the event has been consumed by the merge service. Required for replay, reconciliation, and backfill operations. |

### 2c. `payload` vs separate `changes` + `full_snapshot_ref`

Epic 2's single `payload` JSON column conflates two distinct concepts that V10 keeps separate. The merge service needs to know whether an event carries field-level deltas (in `changes`) or a pointer to a full snapshot (in `full_snapshot_ref`) without parsing the JSON blob. These should be separate columns.

### 2d. Primary key type

Epic 2 uses auto-increment `id` with a separate `event_id` UUID column. V10 uses UUID as the primary key directly (`id uuid primary key`). Using UUID as the PK ensures globally unique event identifiers across all producers without coordination. The auto-increment `id` adds nothing -- the `event_id` UUID is the natural primary key.

**Required Change:** Replace both `id` (auto-increment) and `event_id` (UUID) with a single `id` UUID primary key. Add all missing columns. Rename `payload` to `changes` and add `full_snapshot_ref` as a separate column.

Corrected DDL (MySQL 8.x compatible):

```sql
CREATE TABLE events (
    id CHAR(36) NOT NULL PRIMARY KEY,
    account_id VARCHAR(255) NOT NULL,
    record_type VARCHAR(255) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    source VARCHAR(50) NOT NULL,
    `timestamp` TIMESTAMP NOT NULL,
    base_version BIGINT UNSIGNED DEFAULT NULL,
    changes JSON NOT NULL,
    full_snapshot_ref TEXT DEFAULT NULL,
    processed TINYINT(1) NOT NULL DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_events_aggregate (account_id, record_type, record_id),
    INDEX idx_events_unprocessed (processed, created_at)
);
```

**Affected Files:** `design-spec.md` (section C), `jira-ticket.md` (schema section).

---

## Gap 3: Missing `event_audit_log` Table (High)

**V10 Reference:** Appendix D (lines 937-944) and the storage schema alignment (line 417):

> `event_audit_log`: stores event fingerprints, source, payload references, and created timestamps; used for deduplication, ordering verification, and auditing UE vs polling arbitration.

The durable merge model (lines 329-337) describes how fingerprints in the audit log are used for canonical event identity and UE/polling deduplication:

> Both UE and polling producers compute fingerprints based on `(recordType, recordId, lastModifiedDate, operation)`. When a polling event arrives whose fingerprint already exists in `event_audit_log` from a previous UE event, SuiteX marks the polling event as a confirmation.

**Current State:** Epic 2 only provisions the `events` table. The `event_audit_log` is not mentioned.

**Impact:** Without the audit log, there is no mechanism for fingerprint-based deduplication between UE and polling events. This is a core requirement of the merge pipeline -- without it, every polling event that overlaps with a UE event will produce duplicate state changes.

**Required Change:** Add the `event_audit_log` table to the Epic 2 migration since it is tightly coupled to the event store (it records metadata about events stored in the same database). It should be co-located on the dedicated instance.

```sql
CREATE TABLE event_audit_log (
    id CHAR(36) NOT NULL PRIMARY KEY,
    event_id CHAR(36) NOT NULL,
    record_type VARCHAR(255) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    source VARCHAR(50) NOT NULL,
    fingerprint CHAR(64) NOT NULL,
    processing_outcome VARCHAR(50) DEFAULT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_audit_fingerprint (fingerprint),
    INDEX idx_audit_record (record_type, record_id),
    CONSTRAINT fk_audit_event FOREIGN KEY (event_id) REFERENCES events(id)
);
```

**Affected Files:** `design-spec.md` (section C), `jira-ticket.md` (schema section, deliverables).

---

## Gap 4: Pub/Sub Topic Names Missing Environment Isolation (High)

**V10 Reference:** P2.10 (lines 2569-2573):

> SuiteX must maintain separate event streams per NetSuite environment:
> - `events.raw.production`
> - `events.raw.sandbox`
> Prevent cross-environment pollution (sandbox events must not update production SuiteX).

**Current State:** Epic 2 provisions topics with flat names (`events.raw`, `events.merged`, `events.dlq`). The GCS bucket correctly includes environment suffixes (`suitex-payload-snapshots-{environment}`), but the Pub/Sub topics do not.

**Impact:** Without environment-specific topics, a NetSuite sandbox account's events would enter the same `events.raw` topic as production accounts. If the merge service processes a sandbox event and applies it to SuiteX production data, it causes cross-environment data corruption.

**Required Change:** Either:

**Option A (Topic-level isolation, matches V10):** Create environment-specific topics:
- `events.raw.production`, `events.raw.sandbox`
- `events.merged.production`, `events.merged.sandbox`
- `events.error.production`, `events.error.sandbox`
- `events.dlq.production`, `events.dlq.sandbox`

**Option B (Subscription-level isolation):** Use a single set of topics but enforce filtering at the subscription level using Pub/Sub message attributes (e.g., `environment = production`). Simpler infrastructure but weaker isolation.

**Option C (Project-level isolation):** Use separate GCP projects per environment. Strongest isolation but highest operational overhead.

Option A is recommended as it directly matches V10's specification and provides topic-level isolation without the overhead of separate GCP projects. The IaC scripts should parameterize topic names by environment.

**Affected Files:** `design-spec.md` (section A), `jira-ticket.md` (Pub/Sub Topics, IaC).

---

## Gap 5: Eloquent Model Placement Violates Directory Structure (Medium)

**V10 Reference:** N/A (this is a core project standards issue).

**Core Standards** (`docs/ai-rules/000-core-standards.md`, lines 35-40):

> Domain-specific models must be placed in `src/Domain/{DomainName}/Models/`.

**Current State:** The jira-ticket specifies: "Create `app/Models/EventStore.php`."

**Impact:** The EventStore model belongs to the Events domain. Placing it in `app/Models/` violates the project's domain-driven directory structure and sets a bad precedent for all subsequent sync-related models.

**Required Change:** Place the model at `src/Domain/Events/Models/Event.php` (matching V10's table name `events`, not `event_store`). The model should use the `$connection` property to route to the dedicated instance:

```php
namespace Domain\Events\Models;

use Illuminate\Database\Eloquent\Model;

class Event extends Model
{
    protected $connection = 'event_store';
    protected $table = 'events';
    protected $keyType = 'string';
    public $incrementing = false;
}
```

**Affected Files:** `jira-ticket.md` (Eloquent Model section).

---

## Gap 6: No Immutability Enforcement Mechanism (Medium)

**V10 Reference:** Line 44: "Append-only store." Line 44 in the architecture section establishes this as a structural guarantee, not just a convention.

**Current State:** Both Epic 2 files state "Application code must never execute UPDATE or DELETE statements" and "The `event_store` table is strictly append-only." However, neither document describes how this constraint is enforced. A code comment or a convention is not enforcement -- any developer could accidentally call `$event->update()` or `Event::destroy()`.

**Required Change:** Define at least two enforcement layers:

**1. Database-level (strongest):** The `sync_worker` MySQL user should be granted only `INSERT` and `SELECT` privileges on the `events` table. No `UPDATE` or `DELETE`:

```sql
GRANT SELECT, INSERT ON events.events TO 'sync_worker'@'%';
GRANT SELECT, INSERT ON events.event_audit_log TO 'sync_worker'@'%';
```

**2. Application-level (defense in depth):** The Eloquent model should override boot methods to prevent accidental mutations:

```php
protected static function booted(): void
{
    static::updating(function () {
        throw new \RuntimeException('Events are immutable. UPDATE is prohibited.');
    });

    static::deleting(function () {
        throw new \RuntimeException('Events are immutable. DELETE is prohibited.');
    });
}
```

**Affected Files:** `design-spec.md` (Architectural Boundaries), `jira-ticket.md` (schema section).

---

## Gap 7: No Retention or Archival Strategy (Medium)

**V10 Reference:** Non-functional requirements (line 93): "retain logs for audits (configurable retention)." Appendix A6 (lines 738-742): "GCS/BigQuery — Stores all historic events for audit; Analytics on sync patterns, volume, failure rates." Idempotency key retention (line 2442): "minimum 30 days."

**Current State:** Epic 2 defines a 30-day lifecycle for the GCS snapshot bucket but says nothing about the event store table's growth, retention, or archival.

**Impact:** The `events` table is append-only and will grow indefinitely. For a tenant with moderate sync activity (e.g., 1,000 events/day), the table would reach ~365K rows/year per tenant. Across many tenants, this can become a performance and storage concern. Without a defined strategy, the table will eventually degrade query performance for projection rebuilds and audit trail lookups.

**Required Change:** Document the retention strategy in the design spec:

1. **Hot storage (Cloud SQL):** Retain events for a configurable period (minimum 90 days per V10's audit requirements). The `created_at` + `processed` columns enable partition-based cleanup.
2. **Cold storage (GCS/BigQuery):** After the retention window, archive events to GCS (as JSONL) or BigQuery for long-term audit and analytics, matching V10 Appendix A6.
3. **Partition strategy:** Consider MySQL 8.x range partitioning on `created_at` by month to enable efficient partition drops during archival.

This doesn't need to be implemented in Epic 2, but the schema design should accommodate it (e.g., `created_at` as part of the partition key) and the design spec should document the intended strategy.

**Affected Files:** `design-spec.md` (new section or addendum to Architectural Boundaries).

---

## Gap 8: V10 DDL Uses PostgreSQL Types but Epic 2 Specifies MySQL (Low)

**V10 Reference:** The `events` DDL (lines 240-251) uses `jsonb`, `timestamptz`, and `uuid` -- all PostgreSQL-specific types. The `current_state` and `conflicts` DDLs also use PostgreSQL types.

**Current State:** Epic 2 specifies MySQL 8.x, which is consistent with SuiteX's existing `config/database.php` default (`DB_CONNECTION=mysql`).

**Impact:** This is a V10 documentation issue, not an Epic 2 error. V10's DDL examples were written with PostgreSQL syntax but the actual SuiteX infrastructure uses MySQL. The Epic 2 migration must use MySQL-compatible types:

| V10 (PostgreSQL) | MySQL 8.x Equivalent |
|---|---|
| `uuid` | `CHAR(36)` |
| `text` | `VARCHAR(255)` or `TEXT` |
| `jsonb` | `JSON` |
| `timestamptz` | `TIMESTAMP` |
| `bigint` | `BIGINT UNSIGNED` |
| `boolean default false` | `TINYINT(1) DEFAULT 0` |

**Required Change:** No change needed in Epic 2 -- it correctly targets MySQL. However, document the type mapping in the design spec so future epics that reference V10's PostgreSQL DDL examples know the correct MySQL equivalents.

**Affected Files:** `design-spec.md` (reference note).

---

## Gap 9: Missing Ordering Key Documentation for Publishers (Low)

**V10 Reference:** Appendix A4 (line 684): Ordering key format is `"<type>:<id>"`. Pub/Sub ordering requires publishers to set the ordering key as a message attribute.

**Current State:** Epic 2 correctly enables "Message Ordering" on the Pub/Sub topics. However, ordering is a publisher-side concern -- the topic just needs to have ordering enabled; publishers must set the `orderingKey` attribute on every message. Epic 2 doesn't document the expected ordering key format.

**Impact:** Low for Epic 2 specifically (since Epic 2 provisions infrastructure, not emitters). But a note in the design spec would prevent future epic developers from publishing messages without ordering keys, which would silently bypass the ordering guarantee.

**Required Change:** Add a note to the design spec:

> All publishers to `events.raw` and `events.merged` MUST set the Pub/Sub message ordering key attribute to `recordType:recordId` (e.g., `project:12345`). Messages published without an ordering key will not benefit from per-record ordering guarantees even though the topic has ordering enabled.

**Affected Files:** `design-spec.md` (section A, as a note).

---

## Gap 10: Subscription Topology Not Defined (Low)

**V10 Reference:** Appendix A5 (lines 688-723) defines five consumer types, each with its own subscription:
- Merge/Normalize Consumer → subscribes to `events.raw`
- NetSuite Sync Consumer → subscribes to `events.merged`
- SuiteX Sync Consumer → subscribes to `events.merged`
- Reconciliation Consumer → background
- DLQ Handler → subscribes to `events.dlq`

**Current State:** Epic 2 mentions `events.raw-sub` in Scenario 3 but doesn't define the full subscription topology.

**Impact:** Low for Epic 2's scope. Subscriptions can be created when consumers are built in later epics. However, provisioning subscriptions alongside topics ensures the IaC is complete and consumers have a defined contract to code against.

**Required Change:** At minimum, provision the following subscriptions in the IaC:

| Subscription | Topic | Consumer |
|---|---|---|
| `events.raw-merge-sub` | `events.raw` | Merge Service (Epic 5+) |
| `events.merged-netsuite-sub` | `events.merged` | NetSuite Writer |
| `events.merged-suitex-sub` | `events.merged` | SuiteX Writer |
| `events.error-retry-sub` | `events.error` | Error Handler |
| `events.dlq-sub` | `events.dlq` | DLQ Handler |

If provisioning subscriptions is deferred, document the intended topology so it's tracked.

**Affected Files:** `jira-ticket.md` (GCP Infrastructure section).

---

## Summary of Required Changes

| Gap | Severity | Files Affected | Change Type |
|-----|----------|----------------|-------------|
| 1. Missing `events.error` topic | **High** | design-spec, jira-ticket | Add fourth topic |
| 2. Event store schema mismatches | **Critical** | design-spec, jira-ticket | Rename columns, add missing columns, fix PK |
| 3. Missing `event_audit_log` table | **High** | design-spec, jira-ticket | Add second table to migration |
| 4. No environment isolation on topics | **High** | design-spec, jira-ticket | Add environment suffixes |
| 5. Model placement violation | **Medium** | jira-ticket | Move to `src/Domain/Events/Models/` |
| 6. No immutability enforcement | **Medium** | design-spec, jira-ticket | Add DB grants + model guards |
| 7. No retention/archival strategy | **Medium** | design-spec | Document strategy, add partition key |
| 8. PostgreSQL vs MySQL type mapping | **Low** | design-spec | Document type mapping table |
| 9. Missing ordering key docs | **Low** | design-spec | Add publisher note |
| 10. Subscription topology undefined | **Low** | jira-ticket | Define or defer with documentation |

---

## Corrected Event Store Migration (MySQL 8.x)

Incorporating gaps 2, 3, 6, and 7:

```sql
-- Migration: create_event_store_tables
-- Connection: event_store (dedicated HA Cloud SQL instance)
-- Placement: database/migrations/ (root, not tenant)

CREATE TABLE events (
    id CHAR(36) NOT NULL,
    account_id VARCHAR(255) NOT NULL,
    record_type VARCHAR(255) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    source VARCHAR(50) NOT NULL COMMENT 'netsuite-ue | netsuite-poll | suitex | reconciliation',
    `timestamp` TIMESTAMP(6) NOT NULL COMMENT 'When the change occurred in source system',
    base_version BIGINT UNSIGNED DEFAULT NULL COMMENT 'current_state projection version at event origin',
    changes JSON NOT NULL COMMENT 'Field-level deltas (normalized per Epic 1 schema)',
    full_snapshot_ref TEXT DEFAULT NULL COMMENT 'GCS URI for full snapshot fallback (gs://...)',
    processed TINYINT(1) NOT NULL DEFAULT 0 COMMENT 'Whether merge service has consumed this event',
    created_at TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) COMMENT 'When this row was inserted',

    PRIMARY KEY (id),
    INDEX idx_events_aggregate (account_id, record_type, record_id),
    INDEX idx_events_unprocessed (processed, created_at),
    INDEX idx_events_source_time (source, `timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;


CREATE TABLE event_audit_log (
    id CHAR(36) NOT NULL,
    event_id CHAR(36) NOT NULL,
    record_type VARCHAR(255) NOT NULL,
    record_id VARCHAR(255) NOT NULL,
    source VARCHAR(50) NOT NULL,
    fingerprint CHAR(64) NOT NULL COMMENT 'SHA-256 of (recordType, recordId, lastModified, operation)',
    processing_outcome VARCHAR(50) DEFAULT NULL COMMENT 'processed | duplicate | circular_echo | confirmation',
    created_at TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),

    PRIMARY KEY (id),
    INDEX idx_audit_fingerprint (fingerprint),
    INDEX idx_audit_record (record_type, record_id),
    INDEX idx_audit_event (event_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;


-- Immutability enforcement: sync_worker user gets INSERT + SELECT only
-- GRANT SELECT, INSERT ON events.* TO 'sync_worker'@'%';
```
