# Epic 1 Technical Spec: Canonical JSON Schema & Anti-Corruption Layer

Define the strict JSON Schema contracts and normalization rules that form the Anti-Corruption Layer between NetSuite and SuiteX. This ensures the durable event stream only processes sanitized, validated, and deterministic data.

## Folder Structure

These schema files describe a canonical format that belongs to neither NetSuite nor SuiteX. They form an intermediary layer so that schema changes on either side do not suddenly break the sync process. For that reason, the files should sit in their own folder, separate from the standard domain data files.

```text
suitex/
├── schemas/                 # THE CONTRACT REGISTRY
│   ├── envelope/
│   │   └── v1.schema.json   # See file below in this tech spec
│   └── payloads/
│       ├── project/
│       │   └── v1.schema.json
│       └── project-task/
│           └── v1.schema.json
```

## The Canonical Event Envelope Schema

All events published to the `events.raw` Pub/Sub topic, whether from NetSuite UE scripts, the NetSuite Poller, SuiteX domain services, or Legacy API "Shadow Events", MUST conform to the following JSON Schema (draft-07).

### JSON Schema Definition (v1.0.0)

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "CanonicalChangeEventEnvelope",
  "description": "The universal wrapper for all SuiteX <-> NetSuite synchronization events. Enforces the Anti-Corruption Layer and circular update prevention.",
  "type": "object",
  "required": [
    "schemaVersion",
    "eventId",
    "accountId",
    "recordType",
    "recordId",
    "eventType",
    "source",
    "timestamp",
    "changes",
    "sourceSystem",
    "writeId"
  ],
  "properties": {
    "schemaVersion": {
      "type": "string",
      "description": "Enables zero-downtime schema evolution (e.g., 'v1', 'v2').",
      "enum": ["v1"]
    },
    "eventId": {
      "type": "string",
      "format": "uuid",
      "description": "Unique identifier for this specific event instance."
    },
    "accountId": {
      "type": "string",
      "description": "The tenant identifier, used to route to the correct database and fetch tenant-specific field_metadata."
    },
    "recordType": {
      "type": "string",
      "description": "The standard NetSuite record type (e.g., 'project', 'projecttask', 'customer')."
    },
    "recordId": {
      "type": "string",
      "description": "The NetSuite internalId or the SuiteX primary key."
    },
    "eventType": {
      "type": "string",
      "enum": ["create", "update", "delete"]
    },
    "source": {
      "type": "string",
      "enum": ["netsuite-ue", "netsuite-poll", "suitex"],
      "description": "The physical origin of the event emission."
    },
    "timestamp": {
      "type": "string",
      "format": "date-time",
      "description": "ISO8601 UTC timestamp of when the change occurred."
    },
    "baseVersion": {
      "type": "string",
      "description": "Projection version the event was derived from (used for Three-Way Merge). Omitted if unavailable."
    },
    "changes": {
      "type": "object",
      "description": "Minimal field-level deltas containing normalized values. Schema details are delegated to record-specific schemas."
    },
    "fullSnapshotRef": {
      "type": "string",
      "description": "Optional URI pointing to a full state backup in cloud storage, used if deltas cannot be reliably generated."
    },
    "sourceSystem": {
      "type": "string",
      "enum": ["suitex", "netsuite", "workflow", "user"],
      "description": "Attribution metadata strictly required to prevent infinite circular update loops."
    },
    "writeId": {
      "type": "string",
      "format": "uuid",
      "description": "The idempotency key representing the atomic write across systems."
    },
    "transactionGroupId": {
      "type": ["string", "null"],
      "format": "uuid",
      "description": "Used to group multiple events for batched RESTlet execution."
    }
  },
  "additionalProperties": false
}
```

**Implementation Notes:**

The `schemaVersion` field is mandatory so that future schema migrations do not break the sync workers.

The `sourceSystem` and `writeId` fields are strictly required to support circular update prevention and the legacy API "Shadow Event" emission.
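
As an illustration of how a consumer might use these two fields, the sketch below drops events attributed to the system the worker is about to write to, and events whose `writeId` has already been applied. The `EnvelopeAttribution` type, the `shouldApply` function, and the in-memory `Set` are hypothetical; this spec does not prescribe the mechanism, and a real worker would presumably use a durable idempotency store.

```typescript
// Hypothetical subset of the envelope fields relevant to loop prevention.
type SourceSystem = "suitex" | "netsuite" | "workflow" | "user";

interface EnvelopeAttribution {
  sourceSystem: SourceSystem;
  writeId: string;
}

// Assumption: each worker writes to exactly one target system.
type TargetSystem = "suitex" | "netsuite";

// Returns true if the event should be applied to `target`.
// `seenWriteIds` stands in for a durable idempotency store.
function shouldApply(
  event: EnvelopeAttribution,
  target: TargetSystem,
  seenWriteIds: Set<string>,
): boolean {
  // An event attributed to the target itself is an echo of that system's own write.
  if (event.sourceSystem === target) return false;
  // A writeId that was already applied indicates a redelivery or a loop.
  if (seenWriteIds.has(event.writeId)) return false;
  seenWriteIds.add(event.writeId);
  return true;
}
```

Checking attribution before the idempotency lookup keeps echoes out of the store entirely, so only writes that were actually applied are remembered.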

### Why this structure is so powerful

Notice that `changes` is defined only as `"type": "object"` in this envelope.

This is the secret to keeping our system modular. When the NestJS Mock Server or SuiteX Orchestrator validates an event, it will use a schema compiler (like Ajv) to stitch this Envelope schema together with the specific Payload schema (e.g., the Project schema).

The validator first checks if the envelope is structurally sound (preventing circular loops by checking for `writeId` and `sourceSystem`), and then it dives into the `changes` object to ensure the data inside matches the Project rules.
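
The two-step check described above can be sketched without committing to a particular validator library. Here plain functions stand in for compiled schemas (with Ajv, these would be the functions returned by `compile`). The `PayloadValidator` type, the `payloadValidators` registry, and `validateEvent` are illustrative names, and the `project` rule is a placeholder, not the real Project schema.

```typescript
// A validator returns a list of error messages; an empty list means valid.
type PayloadValidator = (changes: Record<string, unknown>) => string[];

const REQUIRED_ENVELOPE_FIELDS: string[] = [
  "schemaVersion", "eventId", "accountId", "recordType", "recordId",
  "eventType", "source", "timestamp", "changes", "sourceSystem", "writeId",
];

// Hypothetical registry keyed by recordType. In practice each entry would be
// compiled from the matching file under schemas/payloads/.
const payloadValidators: Record<string, PayloadValidator> = {
  project: (changes) => {
    // Placeholder rule only; the real Project schema is not shown here.
    const value = changes["isInactive"];
    return value === undefined || typeof value === "boolean"
      ? []
      : ["isInactive must be a boolean"];
  },
};

function validateEvent(event: Record<string, unknown>): string[] {
  // Step 1: envelope structure, including the loop-prevention fields.
  const missing = REQUIRED_ENVELOPE_FIELDS.filter((field) => !(field in event));
  if (missing.length > 0) {
    return missing.map((field) => `missing envelope field: ${field}`);
  }
  const changes = event["changes"];
  if (typeof changes !== "object" || changes === null || Array.isArray(changes)) {
    return ["changes must be an object"];
  }

  // Step 2: delegate `changes` to the record-specific payload validator.
  const recordType = String(event["recordType"]);
  if (!Object.prototype.hasOwnProperty.call(payloadValidators, recordType)) {
    return [`no payload schema registered for recordType: ${recordType}`];
  }
  return payloadValidators[recordType](changes as Record<string, unknown>);
}
```

Keeping the payload validators in a registry means a new record type can be supported by adding one entry, without touching the envelope check.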

## Payload Normalization Standards

NetSuite's internal data types are highly inconsistent and cannot be fed directly into the `changes` payload object. The producing emitters (UE Scripts, Node.js Mock Server, SuiteX Outbox) MUST apply the following normalization rules before publishing.

- **Checkbox Fields:** NetSuite represents these variously as `"T"`, `"F"`, `true`, or `false`. Emitters must strictly normalize these to JSON booleans: `true` or `false`.
- **Date / Datetime Fields:** Dates are typically timezone-dependent in NetSuite user contexts. Emitters must strictly normalize all temporal data to ISO8601 UTC strings (e.g., `YYYY-MM-DDThh:mm:ssZ`).
- **Multiselect Fields:** NetSuite returns arrays of internal IDs where the order is not guaranteed. Emitters must sort these arrays alphanumerically before attachment to prevent the orchestrator from detecting false-positive deltas.
- **Computed/Formula Fields:** System-calculated fields (e.g., totals) must be excluded entirely from outbound change events.
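
A minimal sketch of the four rules above, assuming a TypeScript emitter. The function names are hypothetical. `normalizeDateTime` assumes the caller has already resolved the NetSuite user's timezone into a `Date`, and "alphanumerically" is interpreted here as plain lexicographic string order (so `"100"` sorts before `"12"`); any fixed order serves the stated goal of avoiding false-positive deltas.

```typescript
// Raw checkbox values as an emitter might receive them (assumed shapes).
type RawCheckbox = "T" | "F" | boolean;

// Checkbox: "T" / "F" / true / false -> JSON boolean.
function normalizeCheckbox(value: RawCheckbox): boolean {
  return value === true || value === "T";
}

// Datetime: a Date instant -> ISO8601 UTC string without fractional seconds.
function normalizeDateTime(value: Date): string {
  return value.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Multiselect: internal IDs -> strings in a deterministic (lexicographic) order.
function normalizeMultiselect(ids: Array<string | number>): string[] {
  return ids.map(String).sort();
}

// Computed fields: drop every field the caller has flagged as system-calculated.
function stripComputed(
  changes: Record<string, unknown>,
  computedFieldIds: string[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(changes)) {
    if (computedFieldIds.indexOf(key) === -1) {
      result[key] = changes[key];
    }
  }
  return result;
}
```

For example, `normalizeMultiselect([45, 12])` returns `["12", "45"]` regardless of the order in which NetSuite returned the IDs.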

## The Schema Registry & Metadata Table (Multi-Tenant)

To instruct the SuiteX Orchestrator on how to validate and merge these normalized payloads, especially dynamically mapped custom fields, the root database requires a tenant-scoped configuration table.

### Table DDL: `field_metadata`

This table acts as the semantic registry for the Anti-Corruption Layer. Because tenants can define custom fields (e.g., `custentity_supplier`), the primary key must include the `account_id`.

```sql
CREATE TABLE field_metadata (
    account_id text,          -- tenant identifier (tenant isolation)
    record_type text,
    field_id text,
    field_type text,          -- e.g., 'checkbox', 'multiselect', 'datetime'
    is_synced boolean,        -- true if part of the sync surface area
    is_readonly boolean,      -- true for NetSuite-owned computed fields
    normalization_rule text,  -- JSON config dictating the expected structure
    conflict_policy text,     -- 'netsuite-wins', 'suitex-wins', 'manual', 'last-write-wins'
    PRIMARY KEY (account_id, record_type, field_id) -- composite key, scoped per tenant
);
```

## Two-Stage Validation & Error Routing Protocol

To support an unbounded number of custom fields while maintaining strict data governance, Consumer Workers subscribing to the `events.raw` topic must enforce a two-stage validation flow:

- **Stage 1: Structural Validation (The JSON Schema):** The worker checks the payload against the JSON Schema. Standard fields are strictly checked. Custom fields are allowed only if their keys match NetSuite's custom field naming regex (`patternProperties`).
- **Stage 2: Semantic Validation (The Metadata Table):** The worker loops through any custom fields found, looks each one up in the central `field_metadata` table using the event's `accountId` (stored as `account_id`), and applies that tenant's `normalization_rule`. If the tenant defined a custom field as an integer but the payload sent a string, validation fails here.
- **Failure Classification & Routing:** Any payload that fails Stage 1 or Stage 2 is immediately classified as a Permanent Error. The worker routes the offending event to the Dead Letter Queue (DLQ) and issues an alert. The bad data never touches the SuiteX database.
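
The flow above might look roughly like the following sketch. Everything here is illustrative: the `Map` stands in for a query against `field_metadata`, the `fieldType` values and `matchesType` mapping are simplified assumptions rather than the real `normalization_rule` format, the standard field list is abbreviated, and treating an unregistered custom field as a Stage 2 failure is an assumption this spec does not state.

```typescript
// Hypothetical in-memory view of field_metadata rows, keyed by
// "<account_id>:<record_type>:<field_id>". A real worker would query the table.
interface FieldMetadata {
  fieldType: "checkbox" | "integer" | "text";
}

type Outcome =
  | { status: "ok" }
  | { status: "permanent-error"; stage: 1 | 2; reason: string };

const STANDARD_FIELDS = ["companyName", "isInactive"]; // abbreviated for the sketch
const CUSTOM_FIELD = /^custentity_[a-zA-Z0-9_]+$/;

function matchesType(value: unknown, fieldType: FieldMetadata["fieldType"]): boolean {
  switch (fieldType) {
    case "checkbox": return typeof value === "boolean";
    case "integer": return typeof value === "number" && Number.isInteger(value);
    case "text": return typeof value === "string";
  }
}

function validateChanges(
  accountId: string,
  recordType: string,
  changes: Record<string, unknown>,
  metadata: Map<string, FieldMetadata>,
): Outcome {
  const keys = Object.keys(changes);

  // Stage 1: every key must be a standard field or match the custom-field pattern.
  for (const key of keys) {
    if (STANDARD_FIELDS.indexOf(key) === -1 && !CUSTOM_FIELD.test(key)) {
      return { status: "permanent-error", stage: 1, reason: `unknown field: ${key}` };
    }
  }

  // Stage 2: each custom field is checked against the tenant's metadata.
  for (const key of keys.filter((k) => CUSTOM_FIELD.test(k))) {
    const meta = metadata.get(`${accountId}:${recordType}:${key}`);
    if (meta === undefined) {
      // Assumption: a custom field with no metadata row is rejected.
      return { status: "permanent-error", stage: 2, reason: `field not registered: ${key}` };
    }
    if (!matchesType(changes[key], meta.fieldType)) {
      return { status: "permanent-error", stage: 2, reason: `type mismatch: ${key}` };
    }
  }
  return { status: "ok" };
}

// Routing: anything other than "ok" goes to the DLQ (alerting is omitted here).
function route(outcome: Outcome): "apply" | "dead-letter" {
  return outcome.status === "ok" ? "apply" : "dead-letter";
}
```

Returning the failing stage alongside the reason lets the alert say whether the emitter broke the structural contract or a tenant's metadata disagrees with the data.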

## Example JSON Spec and Usage

### The Customer Payload Contract (Anti-Corruption Layer)

Notice how this schema forces NetSuite's notorious quirks (like "T"/"F" checkboxes and unstructured dates) into strict, web-standard formats.

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "CustomerPayload",
  "description": "Canonical schema for a Customer record's changes payload. Supports multi-tenant custom fields via patternProperties.",
  "type": "object",
  "properties": {
    "companyName": {
      "type": "string",
      "description": "The primary name of the customer or company."
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "phone": {
      "type": "string",
      "pattern": "^\\+?[1-9]\\d{1,14}$",
      "description": "E.164 standard formatted phone number."
    },
    "isInactive": {
      "type": "boolean",
      "description": "Strict boolean. NetSuite 'T'/'F' must be normalized before validation."
    },
    "subsidiaryId": {
      "type": "string",
      "description": "Internal ID of the NetSuite subsidiary. Must be a string, not an embedded object."
    },
    "contractStartDate": {
      "type": "string",
      "format": "date",
      "description": "ISO8601 Date (YYYY-MM-DD). Timezone stripped."
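    },
    "targetMarketIds": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Multiselect internal IDs. Must be strings, sorted per the multiselect normalization rule."
    }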
  },
  "patternProperties": {
    "^custentity_[a-zA-Z0-9_]+$": {
      "description": "Allows any NetSuite Entity Custom Field. Data type validation is deferred to Stage 2 (field_metadata table)."
    }
  },
  "additionalProperties": false
}
```

### How the Anti-Corruption Layer Uses This

Here is an example of what the validation layer accepts versus what it rejects.

**❌ REJECTED** (What NetSuite natively tries to send): If the NetSuite UE script sends raw data without normalization, schema validation fails, the worker classifies the event as a Permanent Error, and the event is routed to the Dead Letter Queue:

```jsonc
{
  "companyName": "Acme Corp",
  "isInactive": "F",                  // ERROR: Expected boolean, got string
  "contractStartDate": "11/15/2025",  // ERROR: Fails ISO8601 date format
  "targetMarketIds": [45, 12]         // ERROR: Expected strings, got integers
}
```

*(Expected boolean for `isInactive`; ISO8601 required for `contractStartDate`; strings required for `targetMarketIds`.)*

**✅ ACCEPTED** (What the Mock Server & UE Scripts MUST send): The emitter runs the normalization functions, resulting in a payload that passes validation and is safely merged by the Orchestrator:

```json
{
  "companyName": "Acme Corp",
  "isInactive": false,
  "contractStartDate": "2025-11-15",
  "targetMarketIds": ["12", "45"]
}
```

