# Epic 1: Canonical Schema & Anti-Corruption Layer — Design Spec

## Business Justification & Problem Statement

We are building a highly concurrent, bidirectional sync engine between SuiteX and NetSuite. We currently face critical data type mismatches and unpredictable payloads:

- NetSuite has inconsistent internal data representations (e.g., Checkboxes return as "T"/"F" in SuiteScript but true/false in REST, dates are timezone-dependent, and multiselect array orders are not guaranteed).
- If we allow NetSuite's raw formats or SuiteX's raw database rows to enter the durable event stream, our Three-Way Merge orchestrator will fail or create false-positive conflicts.
- We cannot build the Node.js Mock Server without knowing exactly what data structures it needs to validate against.

## Proposed Architectural Solution

To resolve this without tightly coupling SuiteX to NetSuite, we will establish an **Anti-Corruption Layer (ACL)** through strict JSON Schema validation. We will define a universal **Canonical Event Envelope** and specific record payload schemas (e.g., Customer, Sales Order). Both the NetSuite emitters (UE Scripts/Poller) and SuiteX publishers must normalize their data to match this canonical JSON schema before publishing to the `events.raw` Pub/Sub topic.

## Explicit Functional Requirements

### A. The Canonical Event Envelope

- Define the universal JSON wrapper that all events must use. It must require: `eventId`, `accountId`, `recordType`, `recordId`, `eventType`, `source`, `timestamp`, and `changes`.
- It must enforce the inclusion of attribution metadata for circular update prevention: **sourceSystem** (enum: `suitex` | `netsuite` | `workflow` | `user`) and **writeId** (UUID).
- It must include a **schemaVersion** field to support future-proof schema evolution without downtime.

### B. Field Normalization Standards (The Payload)

Define strict JSON data types to override NetSuite quirks within the `changes` payload:

- **Checkboxes** must be strictly mapped to JSON booleans (`true`/`false`).
- **Dates and Datetimes** must be strictly normalized to ISO8601 UTC strings.
- **Multiselect arrays** must be sorted alphanumerically before being added to the payload to prevent false-positive change detection.

### C. Schema Registry & Multi-Tenant Validation Mapping

- Establish a central repository for these JSON files.
- Define the schema for the central SuiteX **field_metadata** table. This table **MUST** be scoped by `account_id` so that individual tenants can map their own NetSuite custom fields to SuiteX without polluting the global schema.

## Architectural Boundaries & Technical Constraints

- **No Direct Coupling:** The SuiteX database schema and NetSuite custom records must remain entirely independent. The JSON contract is the only shared definition.
- **Deltas over Snapshots:** The `changes` object in the schema must be designed to accept field-level deltas, not full record snapshots, to minimize payload size and ensure the three-way merge is deterministic.
- **Strict Validation Rejection:** Any payload hitting the merge worker that fails JSON schema validation must be classified as a "Permanent error" (invalid payload) and routed to a dead-letter/error queue immediately.
- **Two-Stage Validation:** The Anti-Corruption Layer must not tightly couple custom fields to the hardcoded JSON Schema. The Orchestrator must perform **Stage 1 Structural Validation** (using Regex patterns in the JSON schema to allow custom fields) followed by **Stage 2 Semantic Validation** (querying the `field_metadata` table by `account_id` to enforce the exact data type the tenant expects).

## Operational Dependencies Flagged

- **Blocker for Node.js Mock Server:** The development of the Node.js NetSuite Mock Server (Epic 3) cannot begin until these schemas are finalized, as the mock server relies entirely on these contracts to simulate valid NetSuite RESTlet responses.
- **Legacy API Coexistence:** The legacy REST API must be updated to ensure its "Shadow Events" conform to this exact schema, specifically including the `sourceSystem: "suitex"` requirement.

## Testable Acceptance Criteria (Definition of Done)

To close this Epic, the development team must deliver the JSON Schema files, the SQL table definition (`field_metadata`), and a simple test script (e.g., a basic Node.js Ajv validator or PHP equivalent) that successfully proves the following three scenarios:

### Scenario 1: Schema Validation (Pass)

- **Given** a formatted JSON event representing a SuiteX change with a valid UUID `writeId`, strict boolean checkboxes, and ISO8601 dates.
- **When** validated against the Canonical JSON Schema.
- **Then** the validation passes with no errors, confirming it is safe for the `events.raw` topic.

### Scenario 2: Normalization Failure Rejection (Fail — Bad Data Type)

- **Given** an inbound event from a NetSuite User Event script where a checkbox field is passed as the string `"T"` instead of a boolean `true`.
- **When** the payload hits the schema validation layer.
- **Then** the validator explicitly rejects the payload, classifying it as a Permanent error, and routes it to the error queue.

### Scenario 3: Attribution Metadata Enforcement (Circular Loop Prevention)

- **Given** a SuiteX outbound change event or a Legacy API "Shadow Event".
- **When** the event envelope is generated.
- **Then** the schema validation strictly requires and confirms that `sourceSystem` is set to `suitex` and a valid `writeId` UUID is present, ensuring the orchestrator can track idempotency.

## Summary of the Deliverables

The following are expected deliverables by the end of this Epic:

| Deliverable | Description |
|-------------|-------------|
| **Canonical Envelope Schema** | Fully defined and locked. |
| **SQL DDL** | The multi-tenant `field_metadata` table is created in the database. |
| **Payload Schemas** | **Only** Project and Project Task are defined. |
| **Validation Tests** | The test script successfully passes/rejects mocked Project data based on NetSuite quirks (like dates and booleans). |
