# Epic 1: Canonical Schema & Anti-Corruption Layer — Jira Ticket

## Business Justification & Problem Statement

We are building a highly concurrent, bidirectional sync engine between SuiteX and NetSuite. We currently face critical data type mismatches: NetSuite has inconsistent internal data representations (e.g., Checkboxes return as "T"/"F", dates are timezone-dependent, and multiselect array orders are not guaranteed). If we allow NetSuite's raw formats or SuiteX's raw database rows to enter the durable event stream, our Three-Way Merge orchestrator will fail. Furthermore, we cannot unblock the frontend team to build the Node.js Mock Server (Epic 3) without knowing exactly what data structures they need to validate against.

## Proposed Solution

To resolve this without tightly coupling SuiteX to NetSuite, we will establish an **Anti-Corruption Layer (ACL)** through strict JSON Schema validation. We will define a universal **Canonical Event Envelope** and specific record payload schemas (restricted to Project and Project Task for this MVP slice). Both the NetSuite emitters and SuiteX publishers must normalize their data to match this canonical JSON schema before publishing to the event backbone.

## Technical Requirements

### The Canonical Event Envelope

- Define the universal JSON wrapper that all events must use.
- **Must require:** `eventId`, `accountId`, `recordType`, `recordId`, `eventType`, `source`, `timestamp`, and `changes`.
- **Must enforce** attribution metadata for circular update prevention: **sourceSystem** (enum) and **writeId** (UUID).
- **Must include** a **schemaVersion** field to support future-proof schema evolution.

### Payload Normalization Standards (Project & Project Task)

Define strict JSON data types within the `changes` payload to override NetSuite quirks:

- **Checkboxes** mapped strictly to JSON booleans (`true`/`false`).
- **Dates** normalized to ISO8601 UTC strings.
- **Multiselect arrays** sorted alphanumerically.

> **Note:** Utilize `patternProperties` to allow dynamically named custom fields (e.g. `^custentity_[a-zA-Z0-9_]+$`), delegating their strict type validation to the metadata table.

### Multi-Tenant Schema Registry (field_metadata)

- Define the SQL DDL for the central SuiteX metadata table.
- **Must** be scoped by `account_id` to support multi-tenant custom field mappings.
- **Required columns:** `account_id`, `record_type`, `field_id`, `field_type`, `is_synced`, `is_readonly`, `normalization_rule`, `conflict_policy`.

## Acceptance Criteria (Definition of Done)

### Scenario 1: Valid Payload Acceptance (The Happy Path)

- **Given** a formatted JSON payload representing a record change (e.g., Project) that utilizes strict booleans (`true`/`false`), ISO8601 UTC formatted dates, and alphanumerically sorted multiselect arrays.
- **When** the payload is processed by the Canonical JSON Schema validator.
- **Then** the schema validation passes with zero errors, confirming the payload is safely formatted for the `events.raw` Pub/Sub topic.

### Scenario 2: Normalization Rejection (The NetSuite Quirk Trap)

- **Given** an inbound payload generated by a NetSuite User Event script that contains raw, un-normalized NetSuite data types (e.g., a checkbox represented as the string `"T"` or a localized date string like `"11/15/2025"`).
- **When** the payload is evaluated by the Canonical JSON Schema validator.
- **Then** the schema validator explicitly throws a type/format error, rejects the payload as a "Permanent error," and routes it to the Dead Letter Queue to prevent data corruption.

### Scenario 3: Attribution Metadata Enforcement (The Infinite Loop Trap)

- **Given** a SuiteX outbound change event or a Legacy API "Shadow Event".
- **When** the event envelope is generated and evaluated by the validator.
- **Then** the schema validation strictly requires and confirms that `sourceSystem` is set to `suitex` and a valid `writeId` UUID is present, ensuring the orchestrator can track idempotency and prevent circular loops.

## Deliverables

| Deliverable | Description |
|-------------|-------------|
| **Canonical Envelope Schema** | Fully defined JSON schema draft-07 file. |
| **Payload Schemas** | JSON schema files specifically for Project and Project Task. |
| **SQL DDL** | The SQL creation script for the multi-tenant `field_metadata` table. |
| **Validation Test Script** | A basic unit test script (e.g., Node.js/Ajv or PHP equivalent) proving the schemas successfully pass/reject mocked data based on the ACs above. |
