# SuiteX ↔ NetSuite Sync — Design, Requirements, Deliverables, and Runbook

**Created:** 2025-11-14
**Version:** 2
**Goal:** Reliable, near-real-time (target ≤ 1 minute) bidirectional sync between SuiteX and NetSuite that captures UE-driven events, workflow/poller-detected changes, SuiteX-originated changes, and deletes. Ensure correctness through an event-sourcing/delta model, three-way merging, conflict handling, robust error handling, auditability, and operational controls that respect NetSuite governance.

**V2 Updates:** Added comprehensive NetSuite-specific implementation details including UE trigger semantics, field-level considerations, governance limits, API specifications, polling implementation, event ordering constraints, and retry behaviors.

---

## Executive summary

This document describes an enterprise-grade architecture to merge four streams and sync state between SuiteX and NetSuite:

- UE events (User Event scripts inside NetSuite)
- Poller events (external polling of NetSuite via `lastModifiedDate` / system notes / getDeleted)
- SuiteX-originated outbound changes (create/update/delete)
- SuiteX custom-record staging + RESTlet workers (existing)

Primary principles:

1. **Event-first and deltas**: All changes are emitted as immutable events with `changes` (field-level deltas) and `baseVersion` (logical version). Avoid full-record snapshots as the primary channel.
2. **Single durable event stream**: A central event store (for example, Kafka, SQS, or a DB-backed `events` table) partitioned by `(account, recordType, recordId)` to preserve per-record ordering.
3. **Polling is authoritative for missed events**: Polling confirms and fills gaps; UE provides early visibility and reduces latency.
4. **Three-way merge with per-field policies**: Deterministic merges using `Base` (the version event was derived from), `LocalChange` (origin change) and `RemoteCurrent` (target latest). Use configurable per-field conflict policies.
5. **Operational controls**: Adaptive concurrency, batching, debounce/coalescing, and conflict queue/human review UI.

---

## High-level architecture

1. **Producers / Event sources**
   - NetSuite UE script: emits events on UI/API-originated actions (field-level deltas plus metadata). Writes to `events` queue/topic.
   - NetSuite Poller: discovery and detail fetch by `lastModifiedDate` + `(lastInternalId)` cursor, emits events for changes missed by UE. Writes to `events` queue/topic.
   - SuiteX change capture: when SuiteX creates/updates/deletes, produce events (field-deltas) to the same `events` queue/topic.
     - SuiteX change capture is implemented inside the SuiteX application layer (e.g., Laravel domain services). On commit of a transactional change to the tenant database, a `ChangeEvent` is produced and published to the durable backbone with an ordering key derived from `(account, recordType, recordId)`.
     - For SuiteX-originated changes, the workflow engine ensures that the domain write and the event publication are coordinated: either by transactional outbox pattern (preferred) or by explicit retry with idempotency keys.
     - SuiteX adds metadata such as tenant identifier, user context, and correlation IDs so downstream consumers can join events back to SuiteX audit logs and UI.
   - SuiteX snapshot fallback: when deltas can't be created, store a full snapshot and emit `fullSnapshotRef`.
     - Snapshot fallback is performed by SuiteX background workers that read from the same event backbone but are driven by reconciliation or migration jobs. These workers persist snapshots to object storage (for example, GCS) and persist a pointer in `fullSnapshotRef`.
     - SuiteX maintains a per-record configuration describing which fields are part of the synchronized surface area; snapshot producers respect that configuration so that `fullSnapshotRef` remains compatible with NetSuite-facing workloads.

2. **Durable Event Store**
   - Append-only store with partitioning: `account|recordType|recordId`.
   - Store event envelope (eventId, timestamp, source, baseVersion, changes, metadata).
   - In concrete implementation, SuiteX uses the external backbone (for example, Google Pub/Sub) for ordered delivery and a Cloud SQL-backed `events` table for durable, queryable history and idempotency tracking. Pub/Sub carries the live stream; Cloud SQL provides random-access history and analytics.

3. **Orchestrator / Sync Workers**
   - Consumer(s) read events, group by record and apply deterministic merge logic to the opposite system (NetSuite ↔ SuiteX).
   - Use optimistic concurrency (check remote version) and three-way merge if mismatch.
   - SuiteX implements orchestrator workers as stateless processes (for example, Laravel queue workers) that:
     - Read from Pub/Sub subscriptions bound to `events.raw` and `events.merged`.
     - Resolve tenant context based on `accountId` and SuiteX internal tenant mapping.
     - Persist and update `current_state` and conflict records inside the tenant-aware Cloud SQL database.

4. **Projection / Current State**
   - Maintain `current_state` projection per record (versioned). This is authoritative for the sync service.
   - Projections live inside SuiteX-managed Cloud SQL tables scoped by tenant. SuiteX exposes read-only APIs over this projection to internal services and admin tools, while NetSuite-side components continue to treat SuiteX as a black box accessed via RESTlet/REST APIs.

5. **Conflict Queue & Resolution UI**
   - Conflicts stored for manual or semi-automated resolution with change context and suggested merge.
   - SuiteX hosts the conflict queue and UI. NetSuite remains the source of UE and polling events but does not host any conflict UX. The SuiteX admin UI presents three-way merge context (Base, Local, Remote) and writes human decisions back as resolution events to the backbone.

6. **Monitoring & Alerting**
   - Metrics, traces, SLAs, and alerts for error rates, backlog growth, 429s, and conflict volume.
   - SuiteX-side observability is implemented via standard metrics (for example, Prometheus or GCP Monitoring), structured application logs, and distributed tracing from the Pub/Sub consumers through to SuiteX domain writes and NetSuite RESTlet calls.

7. **Operational Controls**
   - Debounce windows, per-account concurrency, adaptive throttling, and backfills.
   - SuiteX orchestrator workers enforce operational controls at the SuiteX layer by:
     - Using per-account concurrency pools when writing to NetSuite and SuiteX.
     - Applying debounce windows to coalesce high-frequency SuiteX UI updates before emitting outbound NetSuite writes.
     - Respecting NetSuite governance limits by honoring backoff and switching to queued writes when thresholds are exceeded, as detailed later in governance policies.
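
The per-account adaptive throttling above could be approached with an additive-increase / multiplicative-decrease rule. The sketch below is only an illustration: the class name, the default limits, and the choice of AIMD are assumptions, since this document does not prescribe a specific algorithm.

```python
from dataclasses import dataclass


@dataclass
class AccountConcurrency:
    """Hypothetical per-account limit on in-flight NetSuite calls."""
    limit: int = 4        # current allowed in-flight calls (assumed default)
    min_limit: int = 1
    max_limit: int = 10   # assumed ceiling; the real value depends on the account

    def on_success(self) -> None:
        # Additive increase: probe for spare capacity one slot at a time.
        self.limit = min(self.max_limit, self.limit + 1)

    def on_throttled(self) -> None:
        # Multiplicative decrease: halve the limit after a 429 or governance error.
        self.limit = max(self.min_limit, self.limit // 2)


pools: dict[str, AccountConcurrency] = {}


def pool_for(account_id: str) -> AccountConcurrency:
    """Return the shared limiter for an account, creating it on first use."""
    return pools.setdefault(account_id, AccountConcurrency())
```

A writer would call `on_throttled()` whenever NetSuite signals rate limiting and `on_success()` otherwise, then size its worker pool for that account from `limit`.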

---

## Requirements (functional & non-functional)

### Functional
1. Capture all NetSuite updates (UE + workflow changes + mass updates + scheduled scripts + CSV + system) and SuiteX changes.
2. Emit immutable events representing field-level deltas with `baseVersion`.
3. Guarantee at-least-once delivery to the orchestration layer; idempotent application to targets.
4. Merge concurrent changes deterministically; configurable per-field policies.
5. Maintain audit trail and ability to reconstruct history.
6. Provide conflict queue and human-in-the-loop resolution UI.
7. Provide backpressure and throttling to respect NetSuite governance.
8. Support deletes and bulk operations safely.

### Non-functional
1. Target sync latency: ≤ 60s for typical workloads (subject to NetSuite limits). Provide SLAs per-customer profile.
2. System must be horizontally scalable (SuiteX side) with per-account throttles.
3. No silent failures — every error must be logged, retried, or escalated.
4. System must be observable with metrics, traces, and dashboards; retain logs for audits (configurable retention).
5. All external writes must be idempotent with idempotency keys persisted.

---

## Deliverables & Acceptance Criteria (staged)

### Stage 0 — Current-state validation & small refactors
**Deliverables**:
- Inventory of existing UE scripts, RESTlet workers, SuiteX payload table schema, and current snapshot flow.
- Test harness for generating concurrent SuiteX + NetSuite changes.

**Acceptance**:
- Inventory documented.
- Tests run successfully in sandbox and demonstrate that the race conditions are reproducible.

---

### Stage 1 — Event model & durable event store
**Deliverables**:
- Event schema defined (JSON envelope) and `events` table / Kafka topic created.
- Producers implemented: UE emitter, Poller emitter, SuiteX emitter.
- Lightweight event producer integration tests.

**Acceptance**:
- Events produced by all sources appear in the durable store with required fields (eventId, recordType, recordId, source, timestamp, baseVersion, changes).
- Test demonstrating replayability and ordering per-partition.

---

### Stage 2 — Projections & Versioning
**Deliverables**:
- `current_state` projection table and `version` logic.
- Initial migration of snapshot-based projection into `current_state`.
- API to fetch `current_state` by (account, recordType, recordId).

**Acceptance**:
- Projections reflect authoritative state after replaying events.
- Version increments deterministically after apply.

---

### Stage 3 — Orchestrator + Three-way merge
**Deliverables**:
- Worker that consumes events, fetches remote target state, runs three-way merge, applies merged changes via REST/SOAP, updates projection.
- Per-field conflict policy engine (config-driven).
- Idempotency & optimistic concurrency primitives.

**Acceptance**:
- Automated tests covering: no-concurrency, simple concurrency (non-conflicting fields), conflicting fields with configured policy, conflict queueing.
- End-to-end demo: a SuiteX change and a concurrent NetSuite poller change produce a merged result, and the projection stays consistent.

---

### Stage 4 — Backpressure, batching & adaptive concurrency
**Deliverables**:
- Discovery & detail-fetch pipeline (two-phase) with cursor management.
- Batch partitioning & worker pools.
- Per-account adaptive concurrency/throttling module.
- Debounce/coalescing logic.

**Acceptance**:
- Load test demonstrating graceful handling of high-change rate (e.g., 10k changes/min) with per-account throttling preventing 429 storm.
- Configurable knobs for batch sizes, concurrency, debounce.

---

### Stage 5 — Conflict UI & human workflows
**Deliverables**:
- Conflict queue UI showing base snapshot, local change, remote current, suggested merge, timestamps, audit trail.
- APIs to resolve conflict (accept SuiteX, accept NetSuite, accept suggested merge, or edit then apply).
- Documented human SLA/ES escalation policy.

**Acceptance**:
- Manual resolution toggles work end-to-end and result is applied and projection updated.
- System prevents concurrent automated applies while record is in human review (locking semantics tested).

---

### Stage 6 — Observability, runbooks, and cutover
**Deliverables**:
- Dashboards: events/sec, backlog, 429s, conflicts, error rates per account.
- Alerts and runbooks for common failure modes.
- Migration plan for switching RESTlet snapshot workers to delta-driven workers.

**Acceptance**:
- Monitoring tests with synthetic errors trigger alerts and runbook actions.
- Successful cutover on canary account and verification of data consistency.

---

## Detailed design — components & contracts

### Event envelope (canonical)

```jsonc
{
  "eventId": "uuid",
  "accountId": "acct-123",
  "recordType": "customer",
  "recordId": "12345",
  "eventType": "update", // create | update | delete
  "source": "SuiteX|UE|Poller",
  "timestamp": "2025-11-14T13:12:05Z",
  "baseVersion": "v123", // projection version the event was derived from
  "changes": { "fieldA": "newValue" },
  "fullSnapshotRef": "s3://.../snapshots/..." // optional
}
```

**Producer contract**:
- `baseVersion` must be present when available.
- `changes` should be minimal (field-level) when possible.
- Full snapshots allowed for schema changes or blob fields; use sparingly.
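
A producer-side helper that follows this contract might look like the sketch below. The function name `make_event` and its validation rules are hypothetical; only the field names and allowed values come from the canonical envelope above.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Optional

ALLOWED_SOURCES = {"SuiteX", "UE", "Poller"}
ALLOWED_EVENT_TYPES = {"create", "update", "delete"}


def make_event(account_id: str, record_type: str, record_id: str,
               event_type: str, source: str, changes: dict[str, Any],
               base_version: Optional[str] = None,
               full_snapshot_ref: Optional[str] = None) -> dict[str, Any]:
    """Build an envelope shaped like the canonical example (hypothetical helper)."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown source: {source}")
    if event_type not in ALLOWED_EVENT_TYPES:
        raise ValueError(f"unknown eventType: {event_type}")
    # Assumed rule: an update must carry deltas or point at a snapshot.
    if event_type == "update" and not changes and full_snapshot_ref is None:
        raise ValueError("update needs field-level changes or a fullSnapshotRef")
    event: dict[str, Any] = {
        "eventId": str(uuid.uuid4()),
        "accountId": account_id,
        "recordType": record_type,
        "recordId": record_id,
        "eventType": event_type,
        "source": source,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "changes": changes,
    }
    # baseVersion and fullSnapshotRef are included only when available.
    if base_version is not None:
        event["baseVersion"] = base_version
    if full_snapshot_ref is not None:
        event["fullSnapshotRef"] = full_snapshot_ref
    return event
```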


### Projection / current state

Schema (simplified):

```sql
CREATE TABLE current_state (
  account_id text,
  record_type text,
  record_id text,
  version bigint,
  state jsonb,
  last_modified timestamptz,
  PRIMARY KEY (account_id, record_type, record_id)
);
```

- `version` is monotonic per record (incremented on each authoritative apply).
- `state` contains only synced fields.


### Events table (immutable store)

Schema (simplified):

```sql
CREATE TABLE events (
  id uuid primary key,
  account_id text,
  record_type text,
  record_id text,
  source text,
  timestamp timestamptz,
  base_version bigint,
  changes jsonb,
  full_snapshot_ref text,
  processed boolean default false
);
```


### Conflict table

```sql
CREATE TABLE conflicts (
  conflict_id uuid primary key,
  account_id text,
  record_type text,
  record_id text,
  base_version bigint,
  our_event_id uuid,
  remote_version bigint,
  remote_snapshot jsonb,
  our_changes jsonb,
  status text,
  created_ts timestamptz default now()
);
```


### UE emitter (NetSuite) — contract & fallback
- UE script will emit field-deltas with `baseVersion` equal to projection version at the time of capture (if available via REST call). If projection version is not retrievable within UE constraints, include `lastModifiedDate` and let orchestrator infer base.
- UE emitter must be performant and only send deltas.


### Poller — discovery & detail fetch
- Cursor: `(lastModifiedDate, lastInternalId)` persisted per `(account, recordType)`.
- Discovery query returns `(internalId, lastModifiedDate)` sorted asc.
- Coalesce duplicate IDs per discovery window; detail fetch by getList/batched gets only required fields.
- Poller emits events that include `baseVersion` if projection version available; otherwise include `lastModifiedDate` and minimal snapshot.


### SuiteX outbound
- On user/app change, produce delta event to the event store and also insert into SuiteX payload table for compatibility.
- Create events for create/update/delete with `baseVersion` = local projection version.
- SuiteX outbound events are generated inside SuiteX domain services and use a transactional outbox pattern wherever possible:
  - Domain-layer code writes both the business change and an `outbox_events` row in the same database transaction.
  - A dedicated SuiteX background worker reads `outbox_events`, publishes `ChangeEvent` messages to the backbone, and marks outbox rows as dispatched.
  - If publishing fails, the outbox row remains and is retried without duplicating business effects.
- For legacy RESTlet snapshot workers, SuiteX continues to populate the existing payload table, but the new outbound pipeline additionally emits `ChangeEvent` deltas so both systems can run in parallel during migration.
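
To make the outbox flow concrete, here is a small sketch using SQLite in place of the tenant database. The `customers` table, the `dispatched` column, and both function names are assumptions for illustration; the real implementation would live in SuiteX domain services and publish to the backbone.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id TEXT PRIMARY KEY, email TEXT);
    CREATE TABLE outbox_events (
        id TEXT PRIMARY KEY, payload TEXT NOT NULL, dispatched INTEGER DEFAULT 0
    );
""")


def update_email(customer_id: str, email: str) -> None:
    """Write the business change and its outbox row in one transaction."""
    with conn:  # commits both statements together, or rolls both back
        conn.execute("INSERT OR REPLACE INTO customers (id, email) VALUES (?, ?)",
                     (customer_id, email))
        event = {"recordType": "customer", "recordId": customer_id,
                 "changes": {"email": email}}
        conn.execute("INSERT INTO outbox_events (id, payload) VALUES (?, ?)",
                     (str(uuid.uuid4()), json.dumps(event)))


def dispatch_pending(publish) -> int:
    """Publish undispatched rows in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox_events WHERE dispatched = 0 ORDER BY rowid"
    ).fetchall()
    sent = 0
    for event_id, payload in rows:
        try:
            publish(json.loads(payload))
        except Exception:
            break  # keep this row and later ones for the next run, preserving order
        with conn:
            conn.execute("UPDATE outbox_events SET dispatched = 1 WHERE id = ?",
                         (event_id,))
        sent += 1
    return sent
```

Because a row is marked dispatched only after `publish` returns, a crash between the two steps can publish the same event twice. That is consistent with at-least-once delivery, and it is why consumers must apply events idempotently.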


### Orchestrator / Merge worker — algorithm summary
1. Read event E from durable store.
2. Load `Base` snapshot by `baseVersion` (from projection if available) or reconstruct from last known projection.
3. Fetch `RemoteCurrent` from target system with version.
4. If `remoteVersion == baseVersion`: apply `E.changes` directly.
5. Else: compute `remoteDelta = diff(Base, RemoteCurrent)` and `ourDelta = E.changes`. Run three-way merge per-field. If conflicts per policy -> enqueue to `conflicts`; else apply merged.
6. On apply success: update `current_state.version` and `current_state.state` atomically.
7. Emit audit log and metrics.
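
Step 5 of the algorithm can be sketched as a per-field merge. The policy names (`suitex_wins`, `netsuite_wins`, `manual`) and the `POLICIES` table are hypothetical; the actual conflict policy engine is config-driven and may expose different options.

```python
from typing import Any

# Hypothetical per-field policies; any field not listed goes to manual review.
POLICIES = {"email": "netsuite_wins", "phone": "suitex_wins"}


def three_way_merge(base: dict[str, Any], our_delta: dict[str, Any],
                    remote: dict[str, Any], our_side: str = "suitex"
                    ) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (changes_to_apply, conflicts) for one event."""
    # remoteDelta = diff(Base, RemoteCurrent): fields the target changed since Base.
    remote_delta = {k: remote.get(k) for k in base.keys() | remote.keys()
                    if base.get(k) != remote.get(k)}
    to_apply: dict[str, Any] = {}
    conflicts: dict[str, Any] = {}
    for field, ours in our_delta.items():
        if field not in remote_delta or remote_delta[field] == ours:
            # Only our side changed the field, or both sides converged.
            if remote.get(field) != ours:
                to_apply[field] = ours
            continue
        policy = POLICIES.get(field, "manual")
        if policy == f"{our_side}_wins":
            to_apply[field] = ours
        elif policy == "manual":
            conflicts[field] = {"base": base.get(field), "ours": ours,
                                "remote": remote_delta[field]}
        # Otherwise the remote side wins and nothing is written for this field.
    return to_apply, conflicts
```

A caller would enqueue the event to `conflicts` when the second return value is non-empty and apply `changes_to_apply` otherwise, matching the branch in step 5.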


### Idempotency and optimistic concurrency
- Outbound applies must include idempotency keys (derived from eventId) so retries are safe.
- When writing to target, prefer APIs that accept conditional updates (e.g., REST `If-Match` style or check value before patch). If not available, implement read-verify-update pattern with transient lock/backoff.
- Within SuiteX, idempotency is enforced via:
  - `events` table primary keys and a dedicated `idempotency_keys` table recording `(target, key, first_seen_at, status)`.
  - Orchestrator workers checking whether an event has already been applied to SuiteX or NetSuite before attempting a write.
  - Explicitly storing NetSuite request identifiers (for example, in a SuiteX-side audit table) so duplicate NetSuite RESTlet calls can be detected and ignored.
- Optimistic concurrency is implemented by comparing projection versions and, for SuiteX database writes, by relying on version columns or `WHERE version = ?` updates. SuiteX never bypasses NetSuite’s own optimistic concurrency and governance constraints; it only adds SuiteX-side checks on top.
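
The combination of an idempotency key and a `WHERE version = ?` update can be illustrated with SQLite standing in for Cloud SQL. The reduced table shapes, the `'suitex'` target label, and the return values are assumptions made for this sketch.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE current_state (
        account_id TEXT, record_type TEXT, record_id TEXT,
        version INTEGER, state TEXT,
        PRIMARY KEY (account_id, record_type, record_id)
    );
    CREATE TABLE idempotency_keys (
        target TEXT, key TEXT, status TEXT, PRIMARY KEY (target, key)
    );
""")


class StaleVersion(Exception):
    """Raised when the projection version no longer matches the expected one."""


def apply_once(event_id: str, key: tuple[str, str, str],
               expected_version: int, new_state: dict) -> str:
    """Return 'applied', 'duplicate', or 'stale'."""
    account_id, record_type, record_id = key
    try:
        with db:  # key claim and versioned update commit or roll back together
            db.execute(
                "INSERT INTO idempotency_keys VALUES ('suitex', ?, 'completed')",
                (event_id,))
            cur = db.execute(
                "UPDATE current_state SET version = version + 1, state = ? "
                "WHERE account_id = ? AND record_type = ? AND record_id = ? "
                "AND version = ?",
                (json.dumps(new_state), account_id, record_type, record_id,
                 expected_version))
            if cur.rowcount == 0:
                raise StaleVersion()  # rolls back the key claim as well
    except sqlite3.IntegrityError:
        return "duplicate"  # this eventId was already applied
    except StaleVersion:
        return "stale"      # caller should re-read and run the three-way merge
    return "applied"
```

Rolling back the key claim on a stale version keeps the event retryable after the merge step produces a fresh expected version.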


### SuiteX workflow engine & sync pipeline (implementation view)

This section describes how the above concepts map to concrete components inside SuiteX without changing any NetSuite responsibilities or constraints.

#### SuiteX workflow engine responsibilities

- Orchestrate the end-to-end flow of events through SuiteX, including:
  - Consuming raw events (`events.raw`) from the backbone.
  - Normalizing, deduplicating, and coalescing them into canonical `MergedEvent` records.
  - Deciding which actions need to be applied to SuiteX, which to NetSuite, and which are informational only.
  - Driving conflict detection and resolution, including locking records in SuiteX while human review is pending.
- Enforce per-tenant isolation by:
  - Identifying the SuiteX tenant from the `accountId` or mapping metadata on each event.
  - Executing all data access through tenant-scoped connections and models, consistent with SuiteX’s multi-tenant architecture.

#### Event ingestion and normalization (SuiteX side)

- SuiteX runs one or more stateless consumer services that:
  - Subscribe to the Pub/Sub `events.raw` topic through a subscription with message ordering enabled; publishers set an ordering key of `"recordType:recordId"` so that events for the same record are delivered in order.
  - For each message, validate it against the `ChangeEvent` schema.
  - Persist the event into the `events` table inside Cloud SQL for durability and querying.
  - Use the `processEvent` logic (see Appendix F) to merge with any pending changes in an in-memory or Redis-backed buffer keyed by `(accountId, recordType, recordId)`.
- When the normalization window elapses or a flush condition is met (for example, idle time, count threshold, or explicit flag), the workflow engine:
  - Emits a `MergedEvent` onto the `events.merged` topic.
  - Clears the pending buffer for the key while leaving the immutable `events` table history intact.
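
The buffering and flush behavior above could look roughly like the following in-memory sketch. It is not the `processEvent` logic referenced in Appendix F; the class name, the window and count defaults, and the shape of the emitted merged event are assumptions.

```python
import time
from typing import Any, Callable

Key = tuple[str, str, str]  # (accountId, recordType, recordId)


class CoalescingBuffer:
    """Hypothetical buffer that merges per-record deltas before emitting."""

    def __init__(self, emit: Callable[[dict[str, Any]], None],
                 window_s: float = 2.0, max_events: int = 50,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.emit = emit
        self.window_s = window_s
        self.max_events = max_events
        self.clock = clock
        self.pending: dict[Key, dict[str, Any]] = {}

    def add(self, key: Key, event: dict[str, Any]) -> None:
        slot = self.pending.setdefault(
            key, {"changes": {}, "eventIds": [], "first_seen": self.clock()})
        slot["changes"].update(event["changes"])  # later value wins per field
        slot["eventIds"].append(event["eventId"])
        if len(slot["eventIds"]) >= self.max_events:
            self.flush(key)  # count threshold reached

    def flush_due(self) -> None:
        """Flush every key whose normalization window has elapsed."""
        now = self.clock()
        due = [k for k, s in self.pending.items()
               if now - s["first_seen"] >= self.window_s]
        for key in due:
            self.flush(key)

    def flush(self, key: Key) -> None:
        slot = self.pending.pop(key)
        self.emit({"key": key, "changes": slot["changes"],
                   "sourceEventIds": slot["eventIds"]})
```

An in-memory buffer like this loses pending state on restart, which is one reason a Redis-backed buffer plus the durable `events` table is mentioned above.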

#### SuiteX sync pipeline to NetSuite

- A dedicated NetSuite writer consumer in SuiteX:
  - Subscribes to `events.merged`.
  - Filters for events that require NetSuite writes (for example, SuiteX-originated updates or reconciliation corrections).
  - Applies governance-aware batching and throttling as per the governance policies described in this document.
  - Issues RESTlet or REST/SOAP calls to NetSuite with idempotency keys derived from `eventId` and record identifiers.
- For each successful write to NetSuite:
  - SuiteX updates the `current_state` projection for the record.
  - Records an audit log entry, including the NetSuite response identifiers, into an `event_audit_log` style table.
  - Marks any related idempotency keys as completed.
- On failure, the writer:
  - Classifies the error (transient, validation, conflict, operational) using the error taxonomy.
  - Routes the event to the appropriate queue (retry, error, conflict) without dropping the original message or violating NetSuite constraints.

#### SuiteX sync pipeline from NetSuite

- For NetSuite-originated UE and polling events:
  - The same normalization flow runs in SuiteX, but the resulting `MergedEvent` is interpreted as a change whose authoritative source is NetSuite.
  - When applying such changes to SuiteX:
    - Orchestrator workers update SuiteX’s tenant databases and `current_state` projections.
    - Any SuiteX-side workflows that should respond to these changes subscribe to internal SuiteX domain events rather than directly to Pub/Sub.
  - SuiteX honors NetSuite’s understanding of record ownership when conflict policies dictate that NetSuite’s values win for specific fields.

#### SuiteX record locking and conflict handling

- When the merge logic detects a conflict that cannot be auto-resolved:
  - SuiteX creates a conflict record in the `conflicts` table and marks the record as locked in a `record_lock` style table.
  - Automated writers for that record are paused in SuiteX (both to NetSuite and to SuiteX itself) while still accepting and queuing new incoming events on the backbone.
  - The admin UI allows SuiteX operators or customer admins to:
    - Inspect Base, SuiteX, and NetSuite values.
    - Choose a resolution strategy or edit a merged payload.
    - Apply the resolution, which generates a resolution event and clears the lock.
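
The lock-and-queue behavior can be sketched in memory as below. `RecordLocks` and its methods are hypothetical, and the `apply` callback stands in for the normal merge-and-apply path; a production version would persist locks in the `record_lock` style table.

```python
from collections import defaultdict, deque
from typing import Any, Callable

Key = tuple[str, str, str]  # (accountId, recordType, recordId)


class RecordLocks:
    """Hypothetical record-level lock that queues events during human review."""

    def __init__(self) -> None:
        self.locked: set[Key] = set()
        self.queued: defaultdict[Key, deque] = defaultdict(deque)

    def lock(self, key: Key) -> None:
        self.locked.add(key)

    def submit(self, key: Key, event: Any, apply: Callable[[Any], None]) -> bool:
        """Apply immediately unless the record is locked; return True if applied."""
        if key in self.locked:
            self.queued[key].append(event)  # accepted, but not auto-applied
            return False
        apply(event)
        return True

    def resolve(self, key: Key, resolution_event: Any,
                apply: Callable[[Any], None]) -> None:
        """Apply the human decision, clear the lock, then replay queued events."""
        apply(resolution_event)
        self.locked.discard(key)
        for event in self.queued.pop(key, ()):
            apply(event)  # replay in arrival order through the usual merge path
```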

#### Relationship to existing SuiteX snapshot pipeline

- During migration:
  - The existing SuiteX snapshot-based RESTlet workers continue to run using their current staging tables.
  - The new event-driven pipeline runs alongside, with SuiteX comparing `current_state` against snapshot-based projections for a canary account.
  - Any drift is surfaced in SuiteX dashboards and can be corrected via reconciliation events.
- After cutover:
  - Snapshot workers become a fallback mechanism used only for recovery or emergency backfills.
  - The durable event pipeline (UE + polling + SuiteX events) becomes the authoritative path, fully implemented within SuiteX’s workflow engine and orchestrator services.


## Error handling & escalation (end-to-end)

### Error classes
1. **Transient errors**: network blips, HTTP 5xx, rate-limits (429) — retry with exponential backoff and jitter.
2. **Permanent errors**: authorization failure, invalid payload (400), missing required fields — route to error queue and notify.
3. **Conflict errors**: detected by three-way merge and not auto-resolvable by policy — route to conflict queue for human review.
4. **Operational errors**: DB failure, event store outages — failover to standby and alert on-call.


### Guarantees & behavior
- **At-least-once delivery** of events to orchestrator; **idempotent** application to target to avoid duplicate state changes.
- **No silent failures**: every error lands in an appropriate queue (retry/backoff, dead-letter, or conflict) and is logged.
- **Notifications**: defined levels with thresholds (see next).


### Notification & Human Intervention policy
**When to notify immediately (automated alerts):**
- Authorization failures for account writes (e.g., NetSuite token invalid) — immediate notification to account owner & on-call.
- Persistent 429s causing backlog growth beyond threshold (e.g., > 10k unprocessed events or > 1 hour backlog for high-priority record types).
- High conflict rates (> X% of events) or > N critical conflicts within T minutes.
- Permanent 4xx errors for more than Y occurrences (configurable) on the same account.

**Notification channels:** email + Slack (or chosen comms) to account owner; on-call alert.

**Human resolution process (start to finish):**
1. Conflict queued with full context (Base, our changes, remote current, suggested merged payload, links to audit).
2. Notifier triggers if conflict meets threshold or manual review required.
3. Human visits Conflict UI, reviews the differences, chooses action (Accept SuiteX, Accept NetSuite, Accept suggested merge, or Edit & Apply).
4. When human applies resolution: orchestrator applies resolved change to the target(s) and clears conflict. The `current_state.version` is incremented.
5. While waiting for human resolution, any new events for that record are accepted and appended to the event stream but are blocked from automated apply until conflict resolved (record-level lock). They may be merged into the conflict payload for context.
6. If new events occur while waiting, the conflict UI should show a timeline and allow human to rebase suggested merge onto the latest base (or escalate to higher trust/rollback options).

**Escalation**: If a conflict remains unresolved beyond the human SLA window (e.g., 24 hours), escalate to a higher authority and optionally pause automated writes for that account or mark the field(s) read-only until resolved.


## Handling simultaneous changes while waiting for human intervention
- Lock record for automated applies while in state `IN_CONFLICT`.
- Accept new events and append them to event stream (they will not be auto-applied). When human resolves, orchestrator will compute merge relative to the latest remote snapshot and all queued events.
- Optionally enable a policy to auto-resolve low-risk fields while blocking only high-risk fields (granular field-level locking).


## Backfill & reconciliation
- Periodic full reconciliation job per account or record type (e.g., nightly) that diffs authoritative NetSuite state vs `current_state` projection and emits reconciliation events.
- Provide a customer-initiated resync process and an admin backfill tool to repair projection mismatches.
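
A reconciliation pass reduces to a diff over the synced fields. The sketch below treats the fetched NetSuite state as authoritative and emits correction events toward the projection; the `reconcile` function, the `"Reconciliation"` source label, and the in-memory dictionaries are illustrative assumptions.

```python
from typing import Any

Record = dict[str, Any]


def reconcile(netsuite: dict[str, Record], projection: dict[str, Record],
              synced_fields: list[str]) -> list[dict[str, Any]]:
    """Compare NetSuite state with the projection and return correction events."""
    events: list[dict[str, Any]] = []
    for record_id in sorted(netsuite.keys() | projection.keys()):
        ns, proj = netsuite.get(record_id), projection.get(record_id)
        if ns is None:
            # Present in the projection but gone from NetSuite.
            event_type, changes = "delete", {}
        elif proj is None:
            # Present in NetSuite but missing from the projection.
            event_type = "create"
            changes = {f: ns[f] for f in synced_fields if f in ns}
        else:
            event_type = "update"
            changes = {f: ns.get(f) for f in synced_fields
                       if ns.get(f) != proj.get(f)}
            if not changes:
                continue  # no drift on synced fields
        events.append({"recordId": record_id, "eventType": event_type,
                       "source": "Reconciliation", "changes": changes})
    return events
```

Fields outside `synced_fields` are ignored on purpose, since `current_state.state` contains only synced fields.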


## Operational runbook (common incidents)

1. **Spike in 429s / governance**
   - Auto-scale down per-account concurrency; pause low-priority record types; alert account owner; schedule backfill.
2. **Persistent auth failure**
   - Immediately notify account owner; mark account suspended for outbound writes; retry token refresh flows; block applies until auth fixed.
3. **Conflict surge**
   - Notify product owner; route non-critical conflicts to auto-resolve policies; escalate critical conflicts.
4. **Event store outage**
   - Failover to standby queue; if unavailable, pause producers and queue locally; alert on-call.


## Security & data protection
- All external writes use TLS and stored secrets are rotated.
- Event and projection stores encrypted at rest.
- Access to conflict UI and escalation controls require RBAC and audit trails.


## Metrics & dashboards
- Events/sec, per-source
- Apply latency (event->applied)
- Conflicts/sec and conflict backlog
- Unprocessed backlog per account
- 429/error rate per account
- Retry counts and dead-letter rate


## Migration plan (from current snapshot-style)
1. Implement event store and producers in parallel with current snapshot pipeline.
2. Start producing delta events from SuiteX (and UE/poller) in addition to snapshots.
3. Implement projection from event store and validate parity with snapshot projection on a canary account.
4. Switch RESTlet workers to consume events and perform three-way merges; keep snapshot pipeline as fallback.
5. Monitor metrics, tune, and gradually widen canary roll-out.
6. Retire snapshot-only pathway after X stable days.


## Acceptance criteria (overall)
- For canary account, achieve consistent data parity for 7 days with event-based sync and <1 minute median apply latency for non-throttled record types.
- Conflict rate below acceptable threshold (configurable) or explained by business rules.
- System handles synthetic load test (configurable target e.g., 10k changes/min) without unbounded backlog growth when throttled.
- No silent failures: all errors flow to DLQ or conflict queue and are visible on dashboard.

---

# SuiteX ↔ NetSuite Sync — Technical Supplement

This supplement contains all implementation-level references designed for AI agent consumption, including event backbone details, sequence diagrams (Mermaid format), orchestration pseudocode, ERDs, schemas, configuration specs, and error taxonomy.

## Appendix A — Durable Event Backbone Architecture

## **A1. Purpose**

This appendix describes the proposed event backbone used to merge, normalize, and deliver change events between SuiteX and NetSuite. The main document defines system behavior; this appendix defines the infrastructure supporting that behavior.

The event backbone provides:

* Durable, replayable event storage
* Ordered delivery per record
* Fan-out to multiple consumers (merging, conflict resolution, sync, reconciliation, DLQ handling)
* Unified ingestion for NetSuite UE events, NetSuite polling events, and SuiteX-originated updates

The backbone is hosted **outside** both systems in our cloud environment.

---

## **A2. Platform Selection**

Given the current hosting on Google Cloud (Ubuntu app servers, Cloud SQL, Redis, GCS, etc.), the recommended durable log is:

**Google Pub/Sub**

Justification:

* Managed service with no cluster maintenance
* Horizontal scaling up to millions of messages
* At-least-once delivery with optional ordering keys
* Dead-letter topics
* Ability to replay messages (seek by time or snapshot)
* IAM-integrated security
* Works seamlessly with Cloud Run, GCE, and Cloud Functions

(Kafka or AWS Kinesis would serve the same role if environments changed.)

---

## **A3. Event Producers**

### **NetSuite UE Producer**

Triggered by User Event scripts (afterSubmit).

* UE sends minimal payload to a SuiteScript RESTlet.
* RESTlet publishes event to Pub/Sub via secure HTTPS endpoint.

### **NetSuite Polling Producer**

Runs every minute from SuiteX.

* Polls `lastmodified`-based windows.
* Emits only record deltas (after normalization).
* Publishes change events to the event bus.

### **SuiteX Producer**

On any SuiteX-originated update (create/update/delete):

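* Emits a change event to the bus before or after writing to internal DB (depending on desired consistency model); the main document prefers the transactional outbox pattern, which publishes only after the write commits.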

All three producers publish to the **same topic**.

---

## **A4. Event Schema**

All events use a common envelope:

```
{
  "recordType": "customer" | "item" | "salesorder" | ...,
  "recordId": "12345",
  "source": "netsuite-ue" | "netsuite-poll" | "suitex",
  "operation": "create" | "update" | "delete",
  "timestamp": "2025-11-14T20:11:00Z",
  "orderingKey": "customer:12345",
  "authContext": {...},
  "payload": {
      // Full or partial change set depending on source
  }
}
```

### Ordering Keys

Using `"orderingKey": "<type>:<id>"` ensures all events for the same record stay in sequence.
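
A small helper for deriving the key might look like this. The function is hypothetical, and the optional account prefix is an assumption that mirrors the `account|recordType|recordId` partitioning used in the main document for multi-tenant deployments.

```python
from typing import Optional


def ordering_key(record_type: str, record_id: str,
                 account_id: Optional[str] = None) -> str:
    """Build '<type>:<id>', optionally prefixed with the account."""
    parts = [record_type, record_id]
    if account_id is not None:
        parts.insert(0, account_id)
    for part in parts:
        # Reject empty parts and the separator itself to keep keys unambiguous.
        if not part or ":" in part:
            raise ValueError(f"invalid ordering key component: {part!r}")
    return ":".join(parts)
```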

---

## **A5. Event Consumers**

Pub/Sub supports multiple independent subscribers. Our architecture needs:

### 1. **Merge / Normalize Consumer**

* Reads all events
* Deduplicates
* Coalesces multiple updates within a small window
* Produces a canonical “merged change event” to an internal topic

### 2. **NetSuite Sync Consumer**

* Consumes merged events requiring writes to NetSuite
* Handles governance-aware batching
* Retries with backoff
* Pushes failures to DLQ + SuiteX human-review workflow

### 3. **SuiteX Sync Consumer**

* Applies NetSuite-originated merged changes to SuiteX
* Idempotent updates
* Triggers rehydration on partial failures

### 4. **Reconciliation / Backfill Consumer**

* Runs in background
* Scans for drift
* Writes “correction events” back into the bus

### 5. **DLQ Handler**

* Consumes Pub/Sub dead-letter topic
* Escalates via SuiteX UI notifications (errors requiring human review)
* Locks the affected record to prevent further writes while manual resolution is pending

---

## **A6. Persistence Layer**

Beyond Pub/Sub itself:

### **Cloud SQL**

* Stores sync watermarks
* Tracks polling state
* Logs conflict-resolution decisions
* Stores human-review queues
* Maintains record-level locks during manual intervention

### **GCS/BigQuery** (optional)

* Stores all historic events for audit
* Analytics on sync patterns, volume, failure rates

---

## **A7. Security & IAM**

* Producers publish to Pub/Sub over HTTPS using restricted service accounts
* Least-privilege IAM roles
* All event payloads encrypted in transit (TLS 1.2/1.3) and at rest (GCP KMS-managed)
* Access logs stored in Cloud Logging

---

## **A8. Failure Handling**

### **At the Event Producer**

* If NetSuite UE cannot send event → UE logs local error, retries via RESTlet, fallback to polling detection
* If polling producer fails → next iteration recovers using watermark checkpoint

### **At the Consumer**

* Automatic retry with exponential backoff
* DLQ after threshold exceeded
* Manual intervention process initiated

### **Manual-Review Workflow**

1. Consumer writes failure context to SuiteX “Sync Error” table
2. Record locked from further writes (except UI override)
3. Notification issued via email / in-app notification
4. User resolves conflict or corrects data
5. Unlock flow publishes “resolution event” back into Pub/Sub
6. Sync resumes

---

## **A9. Why This Backbone Is Required**

* UE events are fast but incomplete
* Polling is reliable but slow and high-volume
* SuiteX events happen independently
* You need a single timeline for each record
* NetSuite’s governance limitations require intelligent retry & batching
* Event replay for audits and fixes is only possible with a durable log

Without an external durable log, conflict resolution and merging logic are fragile or impossible.

---

## **A10. Summary**

The durable event backbone is the infrastructure layer enabling enterprise-grade sync behavior between SuiteX and NetSuite:

* Hosted in GCP (not in SuiteX or NetSuite)
* Captures all changes from both systems
* Merges, deduplicates, and sequences events
* Enables consistent, low-latency bidirectional sync
* Supports error recovery, manual review, and replay

## Appendix B — Mermaid Sequence Diagrams

### B1. NetSuite UE → Event Backbone → Merge → Sync

```mermaid
sequenceDiagram
    participant UE as NetSuite UE Script
    participant RL as RESTlet Endpoint
    participant EB as Event Backbone (Pub/Sub)
    participant MG as Merge Service
    participant SX as SuiteX Sync Service
    participant NS as NetSuite Sync Service

    UE->>RL: POST Change Event
    RL->>EB: Publish Event
    EB->>MG: Deliver Event
    MG->>MG: Normalize & Merge
    MG->>SX: Emit SuiteX-bound Updates
    MG->>NS: Emit NetSuite-bound Updates
    SX->>SX: Apply to SuiteX
    NS->>NS: Apply to NetSuite
```

### B2. Polling Cycle

```mermaid
sequenceDiagram
    participant PL as SuiteX Poller
    participant NS as NetSuite Records
    participant EB as Event Backbone
    participant MG as Merge Service

    PL->>NS: Query lastmodified >= watermark
    NS->>PL: Return changed records
    PL->>EB: Publish delta events
    EB->>MG: Merge as normal
```

### B3. SuiteX → NetSuite Write Cycle

```mermaid
sequenceDiagram
    participant SX as SuiteX App
    participant EB as Event Backbone
    participant MG as Merge Service
    participant NS as NetSuite Sync Service

    SX->>EB: Publish create/update/delete event
    EB->>MG: Deliver
    MG->>NS: Emit canonical NetSuite write
    NS->>NS: Perform governance-aware write
```

### B4. Conflict Handling

```mermaid
sequenceDiagram
    participant MG as Merge Service
    participant CQ as Conflict Queue
    participant UI as SuiteX Admin UI
    participant EB as Event Backbone

    MG->>CQ: Submit conflict record
    UI->>CQ: Fetch conflict
    UI->>UI: Human resolution
    UI->>EB: Publish resolution event
    EB->>MG: Resume merging
```

## Appendix C — Canonical Data Models (JSON Schema)

### C1. ChangeEvent Schema

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "ChangeEvent",
  "type": "object",
  "required": ["recordType", "recordId", "source", "operation", "timestamp", "orderingKey", "payload"],
  "properties": {
    "recordType": {"type": "string"},
    "recordId": {"type": "string"},
    "source": {"enum": ["netsuite-ue", "netsuite-poll", "suitex"]},
    "operation": {"enum": ["create", "update", "delete"]},
    "timestamp": {"type": "string", "format": "date-time"},
    "orderingKey": {"type": "string"},
    "payload": {"type": "object"}
  }
}
```
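
A dependency-free check against this schema might look like the sketch below. It is not a full JSON Schema validator: the function name `validateChangeEvent` is hypothetical, and `Date.parse` is used as a loose stand-in for the `date-time` format.

```javascript
// Sketch: hand-rolled validation mirroring the ChangeEvent schema above.
const REQUIRED_FIELDS = ['recordType', 'recordId', 'source', 'operation',
    'timestamp', 'orderingKey', 'payload'];
const STRING_FIELDS = ['recordType', 'recordId', 'timestamp', 'orderingKey'];
const SOURCES = ['netsuite-ue', 'netsuite-poll', 'suitex'];
const OPERATIONS = ['create', 'update', 'delete'];

// Returns a list of problems; an empty list means the event passed.
function validateChangeEvent(event) {
    if (event === null || typeof event !== 'object' || Array.isArray(event)) {
        return ['event must be an object'];
    }
    const errors = [];
    for (const field of REQUIRED_FIELDS) {
        if (!(field in event)) errors.push('missing ' + field);
    }
    for (const field of STRING_FIELDS) {
        if (field in event && typeof event[field] !== 'string') {
            errors.push(field + ' must be a string');
        }
    }
    if ('source' in event && !SOURCES.includes(event.source)) {
        errors.push('unknown source');
    }
    if ('operation' in event && !OPERATIONS.includes(event.operation)) {
        errors.push('unknown operation');
    }
    if (typeof event.timestamp === 'string' && Number.isNaN(Date.parse(event.timestamp))) {
        errors.push('timestamp is not a parsable date-time');
    }
    const payload = event.payload;
    if ('payload' in event &&
        (payload === null || typeof payload !== 'object' || Array.isArray(payload))) {
        errors.push('payload must be an object');
    }
    return errors;
}
```

Returning a list instead of throwing lets a consumer attach every problem to the error-queue entry in one pass.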

### C2. MergedEvent Schema

```json
{
  "title": "MergedEvent",
  "type": "object",
  "required": ["recordType", "recordId", "changes", "sourceEvents"],
  "properties": {
    "recordType": {"type": "string"},
    "recordId": {"type": "string"},
    "changes": {"type": "object"},
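    "sourceEvents": {
      "type": "array",
      "items": {"$ref": "#/definitions/changeEvent"}
    }
  },
  "definitions": {
    "changeEvent": {"type": "object", "description": "A ChangeEvent as defined in C1"}
  }
}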
```

### C3. ConflictRecord Schema

```json
{
  "title": "ConflictRecord",
  "type": "object",
  "properties": {
    "recordType": {"type": "string"},
    "recordId": {"type": "string"},
    "field": {"type": "string"},
    "suiteXValue": {},
    "netSuiteValue": {},
    "timestamp": {"type": "string", "format": "date-time"}
  }
}
```
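
To illustrate where ConflictRecord entries come from, the sketch below runs a field-level three-way merge and emits one record per field that both sides changed to different values. The function names are hypothetical, values are compared by their JSON form, and holding the base value for a conflicting field is an assumed policy, not a documented one.

```javascript
// Sketch: field-level three-way merge producing ConflictRecord entries.
// `base` is the last agreed state; `suiteX` and `netSuite` are current states.
function sameValue(a, b) {
    return JSON.stringify(a) === JSON.stringify(b);
}

function threeWayMerge(recordType, recordId, base, suiteX, netSuite, timestamp) {
    const merged = {};
    const conflicts = [];
    const fields = new Set([
        ...Object.keys(base), ...Object.keys(suiteX), ...Object.keys(netSuite)
    ]);

    for (const field of fields) {
        const suiteXChanged = !sameValue(suiteX[field], base[field]);
        const netSuiteChanged = !sameValue(netSuite[field], base[field]);

        if (suiteXChanged && netSuiteChanged && !sameValue(suiteX[field], netSuite[field])) {
            // Both sides diverged: hold the base value and raise a conflict
            merged[field] = base[field];
            conflicts.push({
                recordType: recordType,
                recordId: recordId,
                field: field,
                suiteXValue: suiteX[field],
                netSuiteValue: netSuite[field],
                timestamp: timestamp
            });
        } else {
            merged[field] = suiteXChanged ? suiteX[field] : netSuite[field];
        }
    }
    return { merged: merged, conflicts: conflicts };
}
```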

## Appendix D — ERD (Mermaid Format)

```mermaid
erDiagram
    sync_watermark {
        string record_type
        datetime last_polled
        datetime last_event_seen
        string last_internal_id
    }

    event_audit_log {
        string event_id
        string record_type
        string record_id
        string source
        json payload
        datetime created_at
    }

    sync_error_queue {
        string id
        string record_type
        string record_id
        string reason
        json details
        datetime created_at
        string status
    }

    record_lock {
        string record_type
        string record_id
        boolean locked
        datetime locked_at
        string reason
    }
```

## Appendix E — Topic & Consumer Map

### Topics

* `events.raw` — UE, polling, SuiteX producers
* `events.merged` — output of merge service
* `events.error` — retry-able errors
* `events.dlq` — dead-letter

### Consumers

* `merge-service` — normalizes and produces canonical merged events
* `netsuite-writer` — governance-aware writing
* `suitex-writer` — apply NS → SX updates
* `reconciliation-service` — background drift correction
* `error-handler` — retries + escalation

## Appendix F — Orchestrator Pseudocode

### F1. Merge Orchestrator

```
function processEvent(event):
    key = event.recordType + ":" + event.recordId

    existing = readPendingChanges(key)

    if existing:
        merged = merge(existing, event)
    else:
        merged = event

    writePendingChanges(key, merged)

    if shouldFlush(key):
        emitMergedEvent(merged)
        clearPending(key)
```

### F2. NetSuite Writer

```
function writeToNetSuite(mergedEvent):
    // Locks are keyed by record type and ID (see record_lock in Appendix D)
    if recordLocked(mergedEvent.recordType, mergedEvent.recordId):
        queueForLater(mergedEvent)
        return

    try:
        netsuiteApi.update(mergedEvent)
    catch TransientError:          // governance exceeded, timeout, network (Appendix G)
        retryWithBackoff(mergedEvent)
    catch ValidationError e:
        pushToErrorQueue(mergedEvent, e)
    catch Exception e:
        sendToDLQ(mergedEvent, e)
```

### F3. Human-Intervention Flow

```
onErrorQueueItem(item):
    createLock(item.recordType, item.recordId)
    notifyAdmins(item)

adminResolvesConflict(item, resolution):
    applyResolution(item, resolution)
    publishResolutionEvent(item)
    releaseLock(item.recordType, item.recordId)
```

## Appendix G — Error Taxonomy

### Transient Errors

* NetSuite governance exceeded
* Timeout
* Network issues

### Validation Errors

* Required fields missing
* Business rules violated

### Conflict Errors

* SuiteX vs NetSuite concurrent update collision

### Irrecoverable Errors

* Record deleted but referenced
* Corrupted payload
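
One way to apply this taxonomy in a consumer is a small classifier that maps an error to a category and a follow-up action. The sketch below is illustrative only: the error shape (`code`, `status`), the specific codes listed, and the action names are assumptions that should be checked against the errors actually observed in the account.

```javascript
// Sketch: map an error to the taxonomy above. All codes are illustrative.
const TRANSIENT_CODES = new Set([
    'SSS_REQUEST_LIMIT_EXCEEDED', 'SSS_USAGE_LIMIT_EXCEEDED', 'ETIMEDOUT', 'ECONNRESET'
]);
const VALIDATION_CODES = new Set(['USER_ERROR', 'INVALID_FLD_VALUE']);
const CONFLICT_CODES = new Set(['RCRD_HAS_BEEN_CHANGED']);

function classifyError(error) {
    const code = error && error.code;
    const status = error && error.status;

    if (status === 429 || TRANSIENT_CODES.has(code)) {
        return { category: 'transient', action: 'retry-with-backoff' };
    }
    if (VALIDATION_CODES.has(code)) {
        return { category: 'validation', action: 'error-queue' };
    }
    if (CONFLICT_CODES.has(code)) {
        return { category: 'conflict', action: 'conflict-queue' };
    }
    // Anything unrecognized is treated as irrecoverable and dead-lettered
    return { category: 'irrecoverable', action: 'dlq' };
}
```

Defaulting unknown errors to the dead-letter path mirrors the catch-all branch in the F2 pseudocode.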

## Appendix H — Governance-Aware NetSuite Policies

* Maximum batch size: 10 writes/operation
* Backoff schedule: 5s, 15s, 30s, 60s, 5m
* Switch to queued writes after 3 consecutive failures
* Cap governance at 50% of daily budget per integration
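
The first three policies translate directly into small helpers. This is a sketch with hypothetical function names; only the numbers come from the list above.

```javascript
// Sketch of the batch, backoff, and queue-switch policies listed above.
const MAX_BATCH_SIZE = 10;
const BACKOFF_SCHEDULE_SECONDS = [5, 15, 30, 60, 300]; // 5s, 15s, 30s, 60s, 5m
const QUEUE_AFTER_CONSECUTIVE_FAILURES = 3;

// Delay before retry number `attempt` (1-based); null once the schedule is exhausted.
function backoffDelaySeconds(attempt) {
    if (attempt < 1 || attempt > BACKOFF_SCHEDULE_SECONDS.length) return null;
    return BACKOFF_SCHEDULE_SECONDS[attempt - 1];
}

// After three consecutive failures, stop writing inline and queue instead.
function writeMode(consecutiveFailures) {
    return consecutiveFailures >= QUEUE_AFTER_CONSECUTIVE_FAILURES ? 'queued' : 'direct';
}

// Split pending writes into batches of at most MAX_BATCH_SIZE.
function toBatches(writes) {
    const batches = [];
    for (let i = 0; i < writes.length; i += MAX_BATCH_SIZE) {
        batches.push(writes.slice(i, i + MAX_BATCH_SIZE));
    }
    return batches;
}
```

A `null` delay signals that retries are exhausted, at which point the event would move to the dead-letter topic.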

## Appendix I — Configuration & Environment Spec

* `PUBSUB_TOPIC_RAW`
* `PUBSUB_TOPIC_MERGED`
* `NETSUITE_RESTLET_URL`
* `NETSUITE_ACCOUNT`
* `NETSUITE_SCRIPT_ID`
* `NETSUITE_DEPLOY_ID`
* `POLL_INTERVAL_SECONDS`
* `MAX_RETRY_ATTEMPTS`
* `GOVERNANCE_THRESHOLD`
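
A loader for these settings could fail fast on missing values and parse the numeric ones. The sketch below assumes the values arrive as environment-style strings; which settings are mandatory and the defaults (60 seconds, 5 attempts, 0.5) are assumptions chosen to match figures used elsewhere in this document.

```javascript
// Sketch: read and validate the configuration keys listed above.
const REQUIRED_SETTINGS = [
    'PUBSUB_TOPIC_RAW', 'PUBSUB_TOPIC_MERGED', 'NETSUITE_RESTLET_URL',
    'NETSUITE_ACCOUNT', 'NETSUITE_SCRIPT_ID', 'NETSUITE_DEPLOY_ID'
];

function readPositiveNumber(env, name, fallback) {
    if (env[name] === undefined || env[name] === '') return fallback;
    const value = Number(env[name]);
    if (!Number.isFinite(value) || value <= 0) {
        throw new Error(name + ' must be a positive number');
    }
    return value;
}

function loadConfig(env) {
    const missing = REQUIRED_SETTINGS.filter(function(name) { return !env[name]; });
    if (missing.length > 0) {
        throw new Error('Missing configuration: ' + missing.join(', '));
    }
    const config = {};
    REQUIRED_SETTINGS.forEach(function(name) { config[name] = env[name]; });
    config.POLL_INTERVAL_SECONDS = readPositiveNumber(env, 'POLL_INTERVAL_SECONDS', 60);
    config.MAX_RETRY_ATTEMPTS = readPositiveNumber(env, 'MAX_RETRY_ATTEMPTS', 5);
    config.GOVERNANCE_THRESHOLD = readPositiveNumber(env, 'GOVERNANCE_THRESHOLD', 0.5);
    return config;
}
```

In a Node.js service this would typically be called once at startup as `loadConfig(process.env)`.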

---

## Appendix J — NetSuite User Event (UE) Script Implementation Details

### J1. UE Script Trigger Semantics

NetSuite User Event scripts support three trigger points:

**beforeLoad**
- Executes when a record is loaded for viewing or editing in the UI
- **Not used** for sync events (read-only context)

**beforeSubmit**
- Executes after user clicks Save but before commit to database
- Can access both old and new field values via `oldRecord` and `newRecord`
- **Best for capturing deltas**: Compare old vs new values
- Limitations:
  - Must complete within governance limits (see J5)
  - Cannot make external HTTP calls reliably (governance risk)
  - Should only stage events, not publish directly

**afterSubmit**
- Executes after successful database commit
- **Primary trigger for event emission**
- Can safely make RESTlet calls to publish events
- Access to context: `type` (create, edit, delete, xedit, approve, reject, cancel, etc.)

### J2. UE Script Context Types

NetSuite provides `context.type` to distinguish event origins:

| Context Type | Description | Sync Relevance |
|--------------|-------------|----------------|
| `UserEventType.CREATE` | User/API created record | Sync |
| `UserEventType.EDIT` | User/API edited record | Sync |
| `UserEventType.DELETE` | User/API deleted record | Sync |
| `UserEventType.XEDIT` | Inline edit (direct list editing or `record.submitFields`); `newRecord` holds only the changed fields | Sync |
| `UserEventType.APPROVE` | Workflow approval | Sync |
| `UserEventType.REJECT` | Workflow rejection | Sync |
| `UserEventType.CANCEL` | Transaction cancelled | Sync |
| `UserEventType.PACK` | Item fulfillment pack | Sync (if monitored) |
| `UserEventType.SHIP` | Item fulfillment ship | Sync (if monitored) |
| `UserEventType.MARKCOMPLETE` | Transaction marked complete | Sync (if monitored) |
| `UserEventType.REASSIGN` | Case reassignment | Sync (if monitored) |
| `UserEventType.EDITFORECAST` | Forecast edit | Conditional |
| `UserEventType.COPY` | Record copy | Create event |

**Recommendation**: Emit events for CREATE, EDIT, DELETE, XEDIT, APPROVE, REJECT, CANCEL. Configure others per record type.
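
The recommendation can be expressed as a lookup from context type to the `operation` values of the ChangeEvent schema (Appendix C1). This is a sketch of one possible `mapContextType` helper: it assumes `context.type` arrives as the lowercase event name (for example `'xedit'`), and treating approve, reject, and cancel as updates is a design assumption. Types outside the recommended set return `null` so the caller can skip them or consult per-record configuration.

```javascript
// Sketch: map a UE context type to a ChangeEvent operation.
const OPERATION_BY_CONTEXT_TYPE = new Map([
    ['create', 'create'],
    ['copy', 'create'],   // the table above treats a copy as a create event
    ['edit', 'update'],
    ['xedit', 'update'],
    ['approve', 'update'],
    ['reject', 'update'],
    ['cancel', 'update'],
    ['delete', 'delete']
]);

function mapContextType(contextType) {
    const operation = OPERATION_BY_CONTEXT_TYPE.get(String(contextType).toLowerCase());
    return operation === undefined ? null : operation;
}
```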

### J3. UE Script Field Access Patterns

#### Accessing Changed Fields (beforeSubmit)

```javascript
/**
 * @param {Object} context
 * @param {Record} context.oldRecord
 * @param {Record} context.newRecord
 */
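function beforeSubmit(context) {
    const oldRec = context.oldRecord; // null when the record is being created
    const newRec = context.newRecord;
    const changes = {};

    // Compare every field on the new record against the old record
    newRec.getFields().forEach(function(fieldId) {
        const oldVal = oldRec ? oldRec.getValue({fieldId: fieldId}) : null;
        const newVal = newRec.getValue({fieldId: fieldId});

        // Serialize before comparing: Date and multiselect (array) values
        // are never === even when they hold the same data
        if (JSON.stringify(oldVal) !== JSON.stringify(newVal)) {
            changes[fieldId] = {
                old: oldVal,
                new: newVal
            };
        }
    });

    // Store changes for afterSubmit.
    // NOTE: a property set on this context object is not guaranteed to reach
    // afterSubmit. afterSubmit also exposes oldRecord and newRecord, so the
    // same comparison can be repeated there if the staged value is missing.
    context.stageChanges = JSON.stringify(changes);
}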
```

#### Emitting Events (afterSubmit)

```javascript
/**
 * @param {Object} context
 * @param {Record} context.newRecord
 * @param {string} context.type
 */
function afterSubmit(context) {
    const recordType = context.newRecord.type;
    const recordId = context.newRecord.id;

    // Retrieve staged changes
    const changes = context.stageChanges ?
        JSON.parse(context.stageChanges) :
        captureAllFields(context.newRecord);

    const event = {
        recordType: recordType,
        recordId: recordId,
        source: 'netsuite-ue',
        operation: mapContextType(context.type),
        timestamp: new Date().toISOString(),
        orderingKey: recordType + ':' + recordId,
        payload: changes
    };

    // Publish via RESTlet or queue locally
    publishEvent(event);
}
```

### J4. UE Script Deployment Constraints

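- **Deploy one sync UE script per record type** (customer, item, salesorder, etc.). NetSuite allows several UE scripts on the same record type, but their relative execution order is not guaranteed (see N1), so keep sync logic in a single script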
- Deployments can be restricted by:
  - Role
  - Employee
  - Subsidiary
  - Execution context (UI, CSV, webservices, scheduled, etc.)
- **Execution context filtering** is critical:
  - `runtime.ContextType.USER_INTERFACE`: UI-driven changes
  - `runtime.ContextType.WEBSERVICES`: SOAP web services calls (REST web services report `REST_WEBSERVICES`; RESTlets report `RESTLET`)
  - `runtime.ContextType.SCHEDULED`: Scheduled scripts
  - `runtime.ContextType.CSV_IMPORT`: CSV imports
  - `runtime.ContextType.MAP_REDUCE`: Map/Reduce scripts

**Recommendation**: Deploy UE to ALL execution contexts to ensure completeness, but filter out scheduled sync jobs by checking script ID in context.

### J5. UE Script Governance Limits

NetSuite enforces per-script governance limits, measured in usage units rather than wall-clock time:

| Resource | Limit (Standard) | Notes |
|----------|------------------|-------|
| Total usage per execution | 1,000 units (User Event) | Shared by every API call in the script; afterSubmit must also complete quickly |
| Searches and record loads | Charged against the 1,000-unit budget | Minimize search calls |
| HTTP requests (`N/https`) | 10 units per call | Keep to a single outbound call per execution |
| File operations | Charged against the 1,000-unit budget | Avoid in UE |
| `SSS_USAGE_LIMIT_EXCEEDED` / `SSS_TIME_LIMIT_EXCEEDED` | Fatal error | Script terminates, transaction may roll back |

**Critical constraint**: afterSubmit UE scripts run synchronously inside the record save, so every outbound HTTP request adds latency for the user and draws on a small unit budget. Best practices:

1. **Option A**: Call a lightweight internal RESTlet that queues events (preferred)
2. **Option B**: Write to a custom record table; separate scheduled script publishes to SuiteX
3. **Option C**: Use N/task module to spawn Map/Reduce or Scheduled Script (complex)

### J6. UE Script Error Handling

Errors in UE scripts can have severe consequences:

- **beforeSubmit errors**: Abort the save operation; user sees error
- **afterSubmit errors**: Transaction already committed; error logged but record saved
- **Best practice**: Wrap all external calls in try/catch; log failures to custom record; rely on polling as fallback

```javascript
function afterSubmit(context) {
    try {
        const event = buildEvent(context);
        publishEvent(event);
    } catch (e) {
        log.error({
            title: 'Event Publish Failed',
            details: {
                recordType: context.newRecord.type,
                recordId: context.newRecord.id,
                error: e.message
            }
        });
        // Do not rethrow: the record is already committed, and an unhandled
        // error here surfaces to the user or calling integration as a failure.
        // Rely on polling as fallback
    }
}
```

### J7. UE Script Idempotency & Race Conditions

**Challenge**: Multiple UE scripts can fire for a single logical change:
- Workflow script updates field → UE fires
- UE script updates related record → UE fires on related record
- Infinite loop risk

**Mitigation**:
1. Check execution context to detect script-originated changes
2. Set a custom flag field (e.g., `custbody_suitex_update_id`) on write from SuiteX
3. UE script checks flag; if present, skip event emission
4. Polling layer clears flag and emits authoritative event
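
Steps 1 to 3 reduce to a guard evaluated at the top of the UE script. The sketch below is written against a record-like object with a `getValue({fieldId})` method so it can run outside NetSuite; the function name and the `ignoredContexts` parameter are hypothetical.

```javascript
// Sketch: decide whether the UE script should skip event emission.
const SUITEX_FLAG_FIELD = 'custbody_suitex_update_id';

function shouldSkipEmission(record, executionContext, ignoredContexts) {
    // Step 1: changes made by known script contexts are not re-emitted
    if (ignoredContexts.includes(executionContext)) return true;

    // Steps 2-3: a populated flag means the write originated from SuiteX
    const flag = record.getValue({ fieldId: SUITEX_FLAG_FIELD });
    return flag !== null && flag !== undefined && flag !== '';
}
```

Inside a real UE script the second argument would come from `runtime.executionContext` in the `N/runtime` module.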

---

## Appendix K — NetSuite Field-Level Considerations

### K1. Field Type Taxonomy

NetSuite supports diverse field types with different serialization and comparison semantics:

| Field Type | getValue() Returns | Notes |
|------------|-------------------|-------|
| **Text** (`text`) | String | Direct comparison |
| **Textarea** (`textarea`) | String | May contain newlines, HTML |
| **Integer** (`integer`) | Number | Direct comparison |
| **Float** (`float`) | Number | Use epsilon for comparison |
| **Currency** (`currency`) | Number | Locale-specific formatting |
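| **Date** (`date`) | `Date` object in SuiteScript 2.x (`getText()` returns the locale-formatted string) | Normalize to `YYYY-MM-DD` |
| **Datetime** (`datetime`) | `Date` object in SuiteScript 2.x | Timezone-aware; normalize to UTC ISO 8601 |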
| **Checkbox** (`checkbox`) | Boolean (`true`/`false`) | Stored as `T`/`F` in some APIs |
| **Select** (`select`) | Internal ID (string) | Need lookup for display value |
| **Multiselect** (`multiselect`) | Array of internal IDs | Order may vary; normalize |
| **Record** (`record`) | Internal ID | Foreign key reference |
| **File** (`file`) | File internal ID | Binary not in event; store ref |
| **Email** (`email`) | String | Validate format |
| **Phone** (`phone`) | String | Normalize formatting |
| **URL** (`url`) | String | Validate format |
| **Percent** (`percent`) | Number (0-100) | Display as % |
| **Time of Day** (`timeofday`) | String (`HH:MM am/pm`) | Normalize to 24h |
| **Rich Text** (`richtext`) | HTML String | Strip tags for comparison |
| **Inline HTML** (`inlinehtml`) | HTML String | Display-only; may not sync |
| **Image** (`image`) | URL or file ID | Reference only |

### K2. Multiselect Field Handling

Multiselect fields return arrays of internal IDs. Key considerations:

**Order is not guaranteed**:
```javascript
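// Bad: direct comparison treats reordered values as a change
const oldValue = ['123', '456'];
const newValue = ['456', '123'];  // Same values, different order

// Good: normalize before comparison
function normalizeMultiselect(value) {
    if (!Array.isArray(value)) return [];
    // filter() returns a copy, so sort() does not mutate the caller's array;
    // it also drops the [''] form of an empty multiselect
    return value.filter(function(id) { return id !== ''; }).sort();
}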
```

**getValue vs getText**:
- `getValue()`: Returns internal IDs (use for sync)
- `getText()`: Returns display values (use for UI/logging)

**Empty multiselect**:
- Can be `[]`, `null`, or `['']` depending on context
- Normalize to `[]` before comparison

### K3. Checkbox Field Handling

NetSuite checkbox fields have inconsistent representations:

| Context | Checked | Unchecked |
|---------|---------|-----------|
| N/record | `true` | `false` |
| REST API | `true` | `false` |
| SOAP API | `"T"` | `"F"` |
| SuiteScript 1.0 `nlapiGetFieldValue()` | `"T"` | `"F"` |
| Search results | `true` | `false` |

**Normalization required**:
```javascript
function normalizeCheckbox(value) {
    if (value === true || value === 'T' || value === 't') return true;
    if (value === false || value === 'F' || value === 'f' || value === null) return false;
    return false; // default
}
```

### K4. Date/Datetime Field Handling

**Date fields**:
- Stored as date-only (no time component)
- User preference determines format display
- Best practice: Normalize to `YYYY-MM-DD` in events

**Datetime fields**:
- Stored in account timezone
- Retrieved in user timezone (context-dependent)
- Best practice: Normalize to ISO8601 UTC in events

```javascript
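function normalizeDate(dateValue) {
    if (!dateValue) return null;
    // Strings that are already ISO-formatted need no conversion
    if (typeof dateValue === 'string' && /^\d{4}-\d{2}-\d{2}/.test(dateValue)) {
        return dateValue.slice(0, 10);
    }
    // Use local date parts: toISOString() converts to UTC and can shift
    // a date-only value to the adjacent day
    const dt = new Date(dateValue);
    const pad = function(n) { return String(n).padStart(2, '0'); };
    return dt.getFullYear() + '-' + pad(dt.getMonth() + 1) + '-' + pad(dt.getDate());
}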

function normalizeDatetime(datetimeValue) {
    if (!datetimeValue) return null;
    return new Date(datetimeValue).toISOString(); // Full ISO8601
}
```

### K5. Record Reference Fields

Record reference fields store internal IDs. Considerations:

**Type information**:
- `getValue({fieldId: 'customer'})` returns `"12345"`
- Need to know field type to resolve referenced record type

**Null vs Empty**:
- Cleared reference returns `null` or `""`
- Normalize to `null`

**Circular references**:
- Customer → Project → Customer
- Do not embed full referenced record; use ID only
- SuiteX must resolve references separately

### K6. Sublist (Line-Level) Fields

Transactions and some records have sublists (e.g., salesorder lines):

**Access pattern**:
```javascript
const lineCount = record.getLineCount({sublistId: 'item'});
for (let i = 0; i < lineCount; i++) {
    const itemId = record.getSublistValue({
        sublistId: 'item',
        fieldId: 'item',
        line: i
    });
}
```

**Change detection**:
- Must compare entire sublist
- Lines can be added, removed, or reordered
- Use line-level unique keys (item + line ID) to match across versions

**Sync strategy**:
- Emit full sublist snapshot on any line change (simpler)
- OR: Emit line-level deltas with operation (add/update/delete line)
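
For the line-level delta option, matching lines by a stable key makes reordering harmless. The sketch below works on plain objects already extracted from the sublist; the `diffSublist` name, the line shape, and the key function are assumptions, and lines are compared by their JSON form, so fields must be captured in a consistent order.

```javascript
// Sketch: compute add/update/delete deltas between two sublist snapshots.
function diffSublist(oldLines, newLines, keyOf) {
    const oldByKey = new Map(oldLines.map(function(line) { return [keyOf(line), line]; }));
    const newByKey = new Map(newLines.map(function(line) { return [keyOf(line), line]; }));
    const deltas = [];

    for (const [key, line] of newByKey) {
        if (!oldByKey.has(key)) {
            deltas.push({ op: 'add', key: key, line: line });
        } else if (JSON.stringify(oldByKey.get(key)) !== JSON.stringify(line)) {
            deltas.push({ op: 'update', key: key, line: line });
        }
    }
    for (const key of oldByKey.keys()) {
        if (!newByKey.has(key)) deltas.push({ op: 'delete', key: key });
    }
    return deltas;
}
```

A key such as `line.item + ':' + line.lineId` follows the "item + line ID" suggestion above.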

### K7. Body vs Line Fields

NetSuite distinguishes:
- **Body fields**: `custbody_*`, `tranid`, `entity`, etc.
- **Line fields**: `custcol_*`, `item`, `quantity`, `rate`, etc.

UE scripts must capture both:
- Body: `record.getValue({fieldId})`
- Lines: `record.getSublistValue({sublistId, fieldId, line})`

**Custom field namespaces**:
- Body custom fields: `custbody_<scriptid>`
- Column custom fields: `custcol_<scriptid>`
- Entity custom fields: `custentity_<scriptid>`
- Item custom fields: `custitem_<scriptid>`
- CRM custom fields: `custevent_<scriptid>`

### K8. Formula and Computed Fields

Some fields are read-only computed values:
- Formula fields (e.g., `formulatext`, `formulanumeric`)
- System-calculated fields (e.g., `total` on transactions)

**Best practice**:
- Do not include in outbound change events
- Mark as read-only in field mappings
- SuiteX should never attempt to write these fields

### K9. Field-Level Access Control

NetSuite fields have permission levels:
- View
- Edit
- None (hidden)

**Considerations**:
- UE script runs with script's role permissions
- May not have access to all fields user can see
- Must handle `SSS_INVALID_FIELD_ID` errors gracefully

---

## Appendix L — NetSuite Governance Limits & Operational Constraints

### L1. API Request Limits

NetSuite enforces strict governance on API usage:

**Concurrent Request Limits (RESTlet/SOAP/REST)**:
| Tier | Concurrent Requests | Requests/Hour |
|------|---------------------|---------------|
| Standard | 1 (can burst to 5) | ~1,000 |
| Premium | 5 (can burst to 10) | ~5,000 |
| Premium Plus | 10 (can burst to 25) | ~10,000 |

**Rate limiting**:
- NetSuite returns `429 Too Many Requests` when limit exceeded
- Retry-After header may be present (often not)
- Recommended backoff: 60 seconds, then exponential

**Governance units**:
- Each RESTlet invocation consumes governance units
- Units consumed depend on operations performed inside RESTlet
- No external visibility into unit consumption mid-request

**Best practices**:
1. Never exceed 50% of concurrent limit sustained
2. Batch operations where possible (10-50 records per call)
3. Use queuing and adaptive throttling (see Appendix H)
4. Monitor 429 rates; adjust concurrency dynamically
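
Practices 1 and 4 can be combined in an additive-increase, multiplicative-decrease rule. This is a minimal sketch with a hypothetical function name; halving on a 429 and growing by one slot per quiet interval are assumed tuning choices, not values taken from NetSuite documentation.

```javascript
// Sketch: adjust worker concurrency from observed 429 responses.
// `accountLimit` is the account's concurrent request limit.
function nextConcurrency(current, saw429, accountLimit) {
    // Never run above 50% of the account limit on a sustained basis
    const ceiling = Math.max(1, Math.floor(accountLimit * 0.5));
    if (saw429) {
        return Math.max(1, Math.floor(current / 2)); // back off quickly
    }
    return Math.min(ceiling, current + 1); // recover slowly
}
```

Calling this once per monitoring interval keeps the writer pool at or below half of the limit while still reacting to throttling.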

### L2. Script Execution Time Limits

NetSuite scripts have hard usage limits that vary by script type:

| Script Type | Usage Limit (units) | Notes |
|-------------|---------------------|-------|
| User Event | 1,000 | Runs inside the record save |
| Client Script | 1,000 | Runs in the browser |
| Suitelet | 1,000 | |
| RESTlet | 5,000 | Also subject to the execution timeout in L5 |
| Scheduled Script | 10,000 per invocation | Must reschedule itself to continue |
| Map/Reduce | Separate limit per stage function invocation | Yields automatically between invocations |

**Exceeding limits**:
- Running out of units throws `SSS_USAGE_LIMIT_EXCEEDED`; running too long throws `SSS_TIME_LIMIT_EXCEEDED`
- Script terminates immediately
- Partial work may be committed (transaction-dependent)

**Mitigation**:
- RESTlets should delegate to queues/scheduled scripts for long operations
- Use Map/Reduce for bulk operations (auto-yields)
- UE scripts must be lightweight (< 1 second typical)

### L3. Search Result Limits

**SuiteScript search**:
- Default: 1,000 results per search
- `Search.runPaged()`: Up to 1,000 pages × 1,000 results = 1,000,000 (but governance-intensive)
- **Polling implication**: Must use cursor-based pagination with `lastModifiedDate` + `internalId` to page through results

**Best practice for polling**:
```javascript
// Cursor-based polling
const search = N.search.create({
    type: 'customer',
    filters: [
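        ['lastmodifieddate', 'after', watermark.timestamp],
        'OR',
        [
            ['lastmodifieddate', 'on', watermark.timestamp],
            'AND',
            // internalidnumber supports numeric comparison operators
            ['internalidnumber', 'greaterthan', watermark.lastInternalId]
        ]
    ],
    // Sort by the same keys the cursor uses so pages arrive in cursor order
    columns: [
        {name: 'lastmodifieddate', sort: N.search.Sort.ASC},
        {name: 'internalid', sort: N.search.Sort.ASC}
    ]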
});

const pagedData = search.runPaged({pageSize: 1000});
// Process page by page
```

### L4. Transaction Throttling

NetSuite may throttle accounts exhibiting:
- High error rates
- Excessive concurrent connections
- Unusual access patterns

**Indicators**:
- Intermittent 429s even below stated limits
- Slow response times (> 10 seconds)
- Support notifications

**Mitigation**:
- Implement circuit breaker pattern
- Back off to 10% of normal rate on sustained errors
- Contact NetSuite support if persistent
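
A circuit breaker for the NetSuite writer can be sketched as a small state machine. The class below is illustrative: the threshold, cooldown, and injected clock are assumptions, and `rateMultiplier()` models the "back off to 10%" rule for any state other than closed.

```javascript
// Sketch: closed -> open after repeated failures -> half-open after a cooldown.
class CircuitBreaker {
    constructor(options) {
        this.failureThreshold = options.failureThreshold;
        this.cooldownMs = options.cooldownMs;
        this.now = options.now || function() { return Date.now(); };
        this.state = 'closed';
        this.failures = 0;
        this.openedAt = 0;
    }

    // True when a request may be sent; moves open -> half-open after cooldown.
    allowRequest() {
        if (this.state === 'open' && this.now() - this.openedAt >= this.cooldownMs) {
            this.state = 'half-open';
        }
        return this.state !== 'open';
    }

    recordSuccess() {
        this.failures = 0;
        this.state = 'closed';
    }

    recordFailure() {
        this.failures += 1;
        if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
            this.state = 'open';
            this.openedAt = this.now();
        }
    }

    // Fraction of the normal request rate to use in the current state.
    rateMultiplier() {
        return this.state === 'closed' ? 1 : 0.1;
    }
}
```

A single failed probe in the half-open state reopens the breaker, so a struggling account is not hit with a full burst of retries.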

### L5. RESTlet-Specific Constraints

**Request size limits**:
- POST body: ~10 MB
- Response: ~10 MB
- **Implication**: Cannot fetch/send extremely large records; use pagination

**Timeouts**:
- NetSuite imposes 5-minute hard timeout on RESTlet execution
- If RESTlet doesn't respond within 5 min, client gets timeout error
- RESTlet may still be running (and consuming governance)

**Concurrency**:
- Multiple requests to same RESTlet deployment count against account-wide concurrent limit
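- Adding RESTlet deployments separates workloads but is not expected to raise the account-wide concurrent limit described above; confirm against the account's concurrency governance before relying on it for extra throughput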

### L6. SOAP API Constraints

**SOAP Web Services**:
- Older API, more stable but verbose
- Operations: `get`, `getList`, `search`, `add`, `update`, `delete`, `upsert`
- Concurrent request limits same as RESTlets
- Better error messages than REST (often)

**SOAP vs RESTlet**:
- SOAP: Better for bulk gets (`getList` up to 1,000 records)
- RESTlet: More flexible, custom logic
- **Recommendation**: Use SOAP for polling (getList); RESTlet for event emission and custom merges

### L7. REST API (SuiteTalk REST) Constraints

**NetSuite REST API** (not RESTlet):
- Native REST endpoints (e.g., `/record/v1/customer`)
- Token-based authentication (OAuth 1.0a) or OAuth 2.0
- Concurrent limits same as SOAP/RESTlet
- More restrictive field access than SuiteScript (some fields not exposed)

**Limitations**:
- No custom logic (vs RESTlet)
- Limited sublist operations
- Some record types not supported

**Use cases**:
- Simpler integration if SuiteScript not needed
- Preferred for SaaS-to-SaaS sync (lower maintenance)

### L8. CSV Import Governance

CSV imports do not count against API limits but:
- Queued for processing (can take minutes to hours)
- No real-time feedback
- Error handling requires polling job status
- **Not suitable for real-time sync**

---

## Appendix M — NetSuite Polling Implementation Details

### M1. Polling Strategy

**Objective**: Detect all changes missed by UE events with minimal governance cost.

**Approach**: Two-phase polling:
1. **Discovery phase**: Query for changed record IDs and timestamps
2. **Detail phase**: Fetch full field values for changed records (batched)

### M2. Discovery Query Pattern

**Saved search or dynamic search**:
```javascript
const search = N.search.create({
    type: recordType,
    filters: [
        ['lastmodifieddate', 'onorafter', watermark.timestamp],
        'AND',
        ['isinactive', 'is', 'F'] // Exclude inactive unless monitoring deletes
    ],
    columns: [
        {name: 'internalid'},
        {name: 'lastmodifieddate', sort: N.search.Sort.ASC}
    ]
});

const results = [];
search.run().each(function(result) {
    results.push({
        id: result.getValue('internalid'),
        lastModified: result.getValue('lastmodifieddate')
    });
    return results.length < 1000; // Stop at 1k per batch
});
```

**Cursor persistence**:
```javascript
const watermark = {
    recordType: 'customer',
    timestamp: '2025-11-14T10:00:00Z',
    lastInternalId: '12345'
};
// Store in Cloud SQL sync_watermark table
```

**Next iteration**:
```javascript
['lastmodifieddate', 'after', watermark.timestamp],
'OR',
[
    ['lastmodifieddate', 'on', watermark.timestamp],
    'AND',
    // internalidnumber supports numeric comparison operators
    ['internalidnumber', 'greaterthan', watermark.lastInternalId]
]
```
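
The cursor logic can also be checked in memory on the poller side. The sketch below mirrors the filter above and shows how the watermark advances after a sorted batch; the function names are hypothetical, and internal IDs are compared numerically because string comparison would rank `'9'` above `'10'`.

```javascript
// Sketch: in-memory equivalent of the cursor filter, plus watermark advance.
function isAfterWatermark(row, watermark) {
    const rowTime = Date.parse(row.lastModified);
    const markTime = Date.parse(watermark.timestamp);
    if (rowTime !== markTime) return rowTime > markTime;
    // Same timestamp: fall back to the internal ID tie-breaker
    return Number(row.id) > Number(watermark.lastInternalId);
}

// `sortedRows` must be ordered by lastModified ASC, then internal ID ASC.
function advanceWatermark(watermark, sortedRows) {
    if (sortedRows.length === 0) return watermark;
    const last = sortedRows[sortedRows.length - 1];
    return {
        recordType: watermark.recordType,
        timestamp: last.lastModified,
        lastInternalId: last.id
    };
}
```

Persisting the new watermark only after the whole batch has been published keeps a crashed poll cycle replayable from the previous checkpoint.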

### M3. Detail Fetch Pattern

**Option A: SOAP getList (preferred for efficiency)**
```javascript
const ids = ['123', '456', '789']; // Up to 1,000
const records = N.https.post({
    url: 'https://account.suitetalk.api.netsuite.com/services/NetSuitePort',
    headers: {
        'Content-Type': 'text/xml',
        'SOAPAction': 'getList'
    },
    body: buildSOAPGetListRequest(recordType, ids)
});
// Parse SOAP response
```

**Option B: Individual N/record.load calls** (higher governance cost)
```javascript
ids.forEach(function(id) {
    try {
        const record = N.record.load({
            type: recordType,
            id: id,
            isDynamic: false
        });
        const snapshot = captureAllFields(record);
        emitEvent(snapshot);
    } catch (e) {
        log.error('Failed to load record', {id: id, error: e.message});
    }
});
```

### M4. Polling Cadence

**Recommended cadence by record type**:

| Record Type | Polling Interval | Rationale |
|-------------|------------------|-----------|
| Customer, Contact | 5 minutes | High change rate, low volume |
| Item | 10 minutes | Medium change rate |
| Sales Order, Invoice | 2 minutes | High change rate, high sync priority |
| Purchase Order | 5 minutes | Medium change rate |
| Vendor | 10 minutes | Low change rate |
| Employee | 15 minutes | Low change rate, compliance |
| Project, Task | 5 minutes | High collaboration, medium volume |
| Custom Records | 5-15 minutes | Configurable per record |

**Dynamic adjustment**:
- If discovery query returns 0 results for N consecutive polls, increase interval by 50%
- If discovery query consistently returns max results (1,000), decrease interval
- Cap minimum at 1 minute, maximum at 1 hour
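
These rules amount to a small function over recent poll results. In this sketch the function name and the `consecutivePolls` parameter (the N above) are hypothetical, and halving the interval when results are saturated is an assumption, since the rule above only says to decrease it.

```javascript
// Sketch: compute the next polling interval from recent discovery result counts.
const MIN_INTERVAL_SECONDS = 60;    // 1 minute
const MAX_INTERVAL_SECONDS = 3600;  // 1 hour
const MAX_RESULTS_PER_POLL = 1000;

function nextPollInterval(currentSeconds, recentResultCounts, consecutivePolls) {
    const recent = recentResultCounts.slice(-consecutivePolls);
    const enoughHistory = recent.length === consecutivePolls;
    const allEmpty = enoughHistory && recent.every(function(n) { return n === 0; });
    const allFull = enoughHistory &&
        recent.every(function(n) { return n >= MAX_RESULTS_PER_POLL; });

    let next = currentSeconds;
    if (allEmpty) next = currentSeconds * 1.5;     // quiet: poll less often
    else if (allFull) next = currentSeconds / 2;   // saturated: poll more often

    return Math.min(MAX_INTERVAL_SECONDS, Math.max(MIN_INTERVAL_SECONDS, Math.round(next)));
}
```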

### M5. Polling Failure Handling

**Failure scenarios**:
1. Search fails (SSS_TIME_LIMIT_EXCEEDED): Reduce batch size, retry
2. Detail fetch fails for specific ID: Log, skip, retry next cycle
3. NetSuite returns 429: Exponential backoff, 60s → 5m → 15m
4. Network timeout: Retry immediately (transient)

**Fallback**:
- UE events provide early detection
- Polling provides authoritative confirmation
- If polling is down > 15 minutes, alert on-call

### M6. Deleted Record Detection

**Challenge**: `lastmodifieddate` does not update on delete.

**Options**:
1. **getDeleted API** (SOAP only):
   ```xml
   <getDeleted>
       <getDeletedFilter>
           <type>customer</type>
           <deletedDate operator="after">2025-11-14T10:00:00</deletedDate>
       </getDeletedFilter>
   </getDeleted>
   ```
   Returns list of deleted record IDs and deletion timestamps.

2. **Reconciliation scans**:
   - Nightly: Fetch all active record IDs
   - Diff against `current_state` projection
   - Missing records presumed deleted
   - Emit delete events

**Recommendation**: Use `getDeleted` for near-real-time; reconciliation as fallback.
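
The reconciliation fallback is a set difference followed by event construction. The sketch below uses hypothetical function names, assumes both ID lists fit in memory, and emits events in the ChangeEvent shape from Appendix C1 with an empty payload.

```javascript
// Sketch: find records present in the projection but absent from NetSuite.
function findPresumedDeleted(activeNetSuiteIds, projectedIds) {
    const active = new Set(activeNetSuiteIds.map(String));
    return projectedIds.map(String).filter(function(id) { return !active.has(id); });
}

// Build delete events for the presumed-deleted IDs.
function toDeleteEvents(recordType, deletedIds, detectedAt) {
    return deletedIds.map(function(id) {
        return {
            recordType: recordType,
            recordId: id,
            source: 'netsuite-poll',
            operation: 'delete',
            timestamp: detectedAt,
            orderingKey: recordType + ':' + id,
            payload: {}
        };
    });
}
```

Because the scan only presumes deletion, a guard against an empty or truncated active-ID list is worth adding before any delete events are published.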

### M7. Polling Deduplication

**Problem**: Polling may detect the same change more than once: the `onorafter` watermark filter re-reads records at the boundary timestamp, and a change already reported by a UE event reappears in the next poll.

**Solution**:
- Track `(recordType, recordId, lastModifiedDate)` hash in `event_audit_log`
- Before emitting event, check if already emitted
- If UE event already emitted for same timestamp, skip polling event

```javascript
const crypto = require('crypto'); // Node.js built-in; assumes the poller runs outside NetSuite

const eventFingerprint = crypto.createHash('sha256')
    .update(`${recordType}:${recordId}:${lastModified}`)
    .digest('hex');

if (alreadyProcessed(eventFingerprint)) {
    log.debug('Skipping duplicate', {recordId, lastModified});
    return;
}

emitEvent(event);
recordFingerprint(eventFingerprint);
```

### M8. Polling Governance Budget

**Estimate governance cost per poll cycle**:
- Discovery search: ~100 units per 1,000 results
- Detail fetch (N/record.load): ~10 units per record
- For 100 changed records: ~100 + (100 × 10) = ~1,100 units

**Daily budget**:
- Scheduled script: 10,000 units per invocation, can reschedule
- For 100 records/cycle at a 1-minute cadence: 1,440 cycles/day × ~1,100 units ≈ 1.58M units/day
- **Requires Map/Reduce** (auto-yields) or multiple scheduled script deployments

**Alternative**: Run discovery every minute; detail fetch in separate scheduled script with queue
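
The estimate can be parameterized so it is easy to rerun for other volumes and cadences. This sketch takes the unit costs from the figures above as rough planning numbers; the function names are hypothetical.

```javascript
// Sketch: estimate polling governance cost from the per-operation figures above.
const DISCOVERY_UNITS_PER_PAGE = 100;  // ~100 units per 1,000 results
const RESULTS_PER_PAGE = 1000;
const LOAD_UNITS_PER_RECORD = 10;      // N/record.load
const UNITS_PER_INVOCATION = 10000;    // scheduled script budget

function unitsPerCycle(changedRecords) {
    // The discovery search runs even when nothing changed, so charge one page minimum
    const pages = Math.max(1, Math.ceil(changedRecords / RESULTS_PER_PAGE));
    return pages * DISCOVERY_UNITS_PER_PAGE + changedRecords * LOAD_UNITS_PER_RECORD;
}

function dailyBudget(changedRecordsPerCycle, pollIntervalSeconds) {
    const cyclesPerDay = Math.floor(86400 / pollIntervalSeconds);
    const unitsPerDay = cyclesPerDay * unitsPerCycle(changedRecordsPerCycle);
    return {
        cyclesPerDay: cyclesPerDay,
        unitsPerDay: unitsPerDay,
        invocationsPerDay: Math.ceil(unitsPerDay / UNITS_PER_INVOCATION)
    };
}
```

For 100 records per cycle at a 60-second cadence this gives 1,440 cycles and 1,584,000 units per day, or at least 159 scheduled-script invocations.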

---

## Appendix N — NetSuite Event Ordering Constraints & Guarantees

### N1. NetSuite Internal Ordering

**What NetSuite guarantees**:
- Within a single transaction, field updates are atomic
- `lastmodifieddate` reflects commit timestamp (second precision)
- `internalid` is monotonically increasing per record type (with gaps)

**What NetSuite does NOT guarantee**:
- Order of UE script execution if multiple UE scripts deployed
- Order of workflow actions vs UE scripts
- Consistent ordering across concurrent user sessions
- Sub-second timestamp resolution

### N2. UE Event Ordering Challenges

**Scenario**: User edits Customer #123 twice within 1 second:
- Edit 1: Change email to `a@example.com`
- Edit 2: Change email to `b@example.com`

**Problem**:
- Both afterSubmit UE events may have same `lastmodifieddate`
- If events published to unordered queue, may arrive out of order
- SuiteX may apply Edit 2, then Edit 1 (incorrect final state)

**Mitigation**:
1. **Pub/Sub ordering keys**: Use `recordType:recordId` as ordering key (see Appendix A4)
2. **Event sequence numbers**: UE script reads current projection version, includes in event
3. **Merge service coalescing**: Buffer events for 5-10 seconds, apply latest only
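
Mitigations 2 and 3 combine naturally: buffer events for the window, then keep only the highest sequence number per ordering key. The sketch assumes each event carries a numeric `sequence` field (a hypothetical name for the projection version read by the UE script); the function name is also hypothetical.

```javascript
// Sketch: from a buffered window, keep the latest event per record.
function latestPerRecord(bufferedEvents) {
    const latest = new Map(); // orderingKey -> event with highest sequence
    for (const event of bufferedEvents) {
        const current = latest.get(event.orderingKey);
        if (!current || event.sequence > current.sequence) {
            latest.set(event.orderingKey, event);
        }
    }
    return Array.from(latest.values());
}
```

Applied to the scenario above, an out-of-order arrival of Edit 2 followed by Edit 1 still resolves to Edit 2, because the comparison uses the sequence number rather than arrival order.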

### N3. Polling Ordering Guarantees

**Saved search ordering**:
- Results sorted by `lastmodifieddate ASC, internalid ASC`
- Guarantees processing in modification order (second precision)
- Multiple changes within same second processed in ID order (arbitrary but consistent)

**Cursor-based pagination**:
- Ensures no records skipped
- Handles records modified during polling cycle (may appear in next cycle)
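
The composite ordering above can be illustrated with a comparator and cursor filter. This is a sketch that assumes rows are already fetched into memory with hypothetical `lastModified` (ISO 8601) and `internalId` fields; in practice the same predicate would be expressed in the search filter.

```javascript
// Hypothetical row shape: { lastModified: ISO-8601 string, internalId: string }
function compareRows(a, b) {
    const timeDiff = Date.parse(a.lastModified) - Date.parse(b.lastModified);
    if (timeDiff !== 0) return timeDiff;
    // Compare IDs numerically; as strings, "10" would sort before "9".
    return Number(a.internalId) - Number(b.internalId);
}

// Rows strictly after the cursor, in processing order.
function rowsAfterCursor(rows, cursor) {
    return rows.filter((row) => compareRows(row, cursor) > 0).sort(compareRows);
}

// The cursor moves to the last processed row; an empty page leaves it unchanged.
function advanceCursor(cursor, processedRows) {
    if (processedRows.length === 0) return cursor;
    const last = processedRows[processedRows.length - 1];
    return { lastModified: last.lastModified, internalId: last.internalId };
}
```

Because the timestamp has only second precision, the `internalId` tiebreaker is what keeps records modified in the same second from being skipped or reprocessed.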

### N4. Cross-Record Ordering

**Challenge**: Transaction creates Customer and Sales Order simultaneously:
- Customer created (ID 5001, timestamp 10:00:00)
- Sales Order created (ID 3001, timestamp 10:00:00, references Customer 5001)

**Problem**:
- Polling may return Sales Order before Customer (processed in parallel)
- SuiteX attempts to create Sales Order referencing non-existent Customer (fails)

**Mitigation**:
1. **Dependency graph**: SuiteX knows Customer must exist before Sales Order
2. **Retry queue**: If foreign key constraint fails, queue Sales Order event for retry
3. **Reconciliation**: Nightly pass resolves orphaned references
4. **Ordered record types**: Process Customers before Transactions in polling schedule
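
Mitigations 2 and 4 could look roughly like the following. The priority map, the `references` array, and the `type:id` reference format are all assumptions made for this sketch.

```javascript
// Hypothetical priority map: lower numbers are applied first.
const RECORD_TYPE_PRIORITY = { customer: 0, item: 0, salesorder: 1 };

// Applies events parents-first; events with unresolved references are
// returned for the retry queue instead of being applied.
function applyInDependencyOrder(events, existingIds, applyFn) {
    const ordered = [...events].sort(
        (a, b) =>
            (RECORD_TYPE_PRIORITY[a.recordType] ?? 99) -
            (RECORD_TYPE_PRIORITY[b.recordType] ?? 99)
    );
    const retryQueue = [];
    for (const event of ordered) {
        const missing = (event.references || []).filter((ref) => !existingIds.has(ref));
        if (missing.length > 0) {
            retryQueue.push({ event, missing });
            continue;
        }
        applyFn(event);
        existingIds.add(`${event.recordType}:${event.recordId}`);
    }
    return retryQueue;
}
```

Events left in the returned queue would be retried later, with the nightly reconciliation as the final safety net.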

### N5. SuiteX → NetSuite Write Ordering

**Challenge**: SuiteX user updates Customer #123 email and phone simultaneously:
- Two events emitted to Pub/Sub
- NetSuite writer consumes both concurrently
- Both attempt to update same Customer

**Problem**:
- Race condition: Last writer wins
- May lose one change

**Mitigation**:
1. **Record-level locks in SuiteX**: Serialize writes per record
2. **Merge before write**: Coalesce both changes into single NetSuite update
3. **Optimistic concurrency**: Include version; if mismatch, retry with latest

### N6. Ordering Guarantees Summary

| Layer | Guarantee | Mechanism |
|-------|-----------|-----------|
| NetSuite storage | Atomic per transaction | Database ACID |
| NetSuite API | No cross-record ordering | N/A |
| UE event emission | No ordering | Async RESTlet calls |
| Pub/Sub delivery | Per-key ordering | Ordering key |
| SuiteX merge | In-order per record | Stateful consumer |
| NetSuite write | Serialized per record | SuiteX lock |

**Overall**: System guarantees correct final state per record but not global ordering across records.

---

## Appendix O — NetSuite Retry Behavior & Limitations

### O1. RESTlet Retry Semantics

**NetSuite behavior**:
- RESTlet execution is **not idempotent by default**
- If RESTlet times out or returns 5xx, NetSuite may retry internally (undocumented)
- Client retries are **not deduplicated** by NetSuite

**Implication**:
- Must implement idempotency keys in RESTlet logic
- Store `(idempotencyKey, result)` in custom record
- On retry, check if already processed; return cached result

```javascript
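/**
 * RESTlet POST handler with idempotency-key deduplication.
 * checkIdempotencyKey, processRequest, and storeIdempotencyKey are helpers
 * defined elsewhere, backed by the custom record described above.
 */
function post(context) {
    const idempotencyKey = context.idempotencyKey;
    if (!idempotencyKey) {
        // Without a key, a retried request cannot be deduplicated
        throw new Error('Missing idempotencyKey in request body');
    }

    // Check if already processed; on retry, return the cached result
    const existing = checkIdempotencyKey(idempotencyKey);
    if (existing) {
        return existing.result;
    }

    // Process request
    const result = processRequest(context);

    // Store the key with its result so later retries are answered from cache
    storeIdempotencyKey(idempotencyKey, result);

    return result;
}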
```

### O2. SOAP Retry Behavior

**SOAP API**:
- Stateless
- No built-in idempotency
- Client must implement retry logic with exponential backoff

**Recommended backoff**:
```javascript
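const backoff = [1000, 2000, 5000, 15000, 30000, 60000]; // ms

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// callSOAP and isTransientError are assumed to be defined elsewhere.
// Makes one initial attempt, then one retry after each backoff delay.
async function callSOAPWithRetry(request) {
    let lastError;
    for (let attempt = 0; attempt <= backoff.length; attempt++) {
        try {
            return await callSOAP(request);
        } catch (e) {
            if (!isTransientError(e)) {
                throw e; // Permanent error
            }
            lastError = e;
            if (attempt < backoff.length) {
                await sleep(backoff[attempt]);
            }
        }
    }
    throw lastError; // Retries exhausted; surface the last transient error
}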
```

### O3. Transient vs Permanent Errors

**Transient** (retry):
- `429 Too Many Requests`
- `500 Internal Server Error`
- `503 Service Unavailable`
- Network timeouts
- `SSS_REQUEST_LIMIT_EXCEEDED`

**Permanent** (do not retry):
- `400 Bad Request`
- `401 Unauthorized`
- `403 Forbidden`
- `404 Not Found`
- `SSS_INVALID_RECORD_TYPE`
- `INSUFFICIENT_PERMISSION`
- `INVALID_KEY_OR_REF` (bad foreign key)

**Ambiguous** (retry with caution):
- `504 Gateway Timeout`: May have succeeded; check idempotency
- `SSS_TIME_LIMIT_EXCEEDED`: Script may have partially completed
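
The three classes above map naturally onto a lookup. The following is a sketch only: the error object shape (`status`, `code`, `timeout`) is an assumption, and anything not listed in this section is reported as `unknown` rather than guessed.

```javascript
const TRANSIENT_HTTP = new Set([429, 500, 503]);
const PERMANENT_HTTP = new Set([400, 401, 403, 404]);
const AMBIGUOUS_HTTP = new Set([504]);
const TRANSIENT_CODES = new Set(['SSS_REQUEST_LIMIT_EXCEEDED']);
const PERMANENT_CODES = new Set([
    'SSS_INVALID_RECORD_TYPE',
    'INSUFFICIENT_PERMISSION',
    'INVALID_KEY_OR_REF',
]);
const AMBIGUOUS_CODES = new Set(['SSS_TIME_LIMIT_EXCEEDED']);

// Hypothetical error shape: { status?: number, code?: string, timeout?: boolean }
function classifyError(err) {
    // Ambiguous is checked first: the write may already have been applied.
    if (AMBIGUOUS_HTTP.has(err.status) || AMBIGUOUS_CODES.has(err.code)) return 'ambiguous';
    if (PERMANENT_HTTP.has(err.status) || PERMANENT_CODES.has(err.code)) return 'permanent';
    if (err.timeout || TRANSIENT_HTTP.has(err.status) || TRANSIENT_CODES.has(err.code)) {
        return 'transient';
    }
    return 'unknown'; // not listed above; the caller decides how to handle it
}
```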

### O4. Governance-Induced Failures

**SSS_USAGE_LIMIT_EXCEEDED**:
- Account has exceeded daily governance budget
- **Cannot retry until next day** (or governance reset)
- Must queue operations and process in next window

**SSS_REQUEST_LIMIT_EXCEEDED**:
- Concurrent request limit exceeded
- Retry after 60 seconds
- Adaptive throttling required (see Appendix H)

### O5. Partial Success Handling

**Scenario**: Batch update of 50 records via RESTlet:
- Records 1-30 succeed
- Record 31 fails (validation error)
- Records 32-50 not processed (script terminated)

**Challenge**: Partial batch processed; need to resume.

**Solutions**:
1. **Idempotent records**: Each record update includes idempotency key; retry entire batch (successful records skip)
2. **Checkpoint-based**: RESTlet persists progress; returns `{processed: 30, failed: [31], remaining: 19}`; client routes record 31 to the error queue (validation errors are permanent) and resumes from record 32
3. **Avoid batching**: Update records individually (higher governance cost but simpler error handling)

**Recommendation**: Individual updates for critical records; batching for bulk background sync.
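
Solution 1 can be sketched as a batch loop that skips already-applied records. Here `processedKeys` is a plain `Set` standing in for the idempotency store, and `applyFn` is a hypothetical per-record writer that throws on failure. Unlike the scenario above, this variant catches per-record errors so one bad record does not stop the rest of the batch.

```javascript
// records: [{ id, idempotencyKey, ... }]; applyFn(record) throws on failure.
function processBatch(records, applyFn, processedKeys) {
    const summary = { processed: [], skipped: [], failed: [] };
    for (const record of records) {
        if (processedKeys.has(record.idempotencyKey)) {
            summary.skipped.push(record.id); // applied by an earlier attempt
            continue;
        }
        try {
            applyFn(record);
            processedKeys.add(record.idempotencyKey);
            summary.processed.push(record.id);
        } catch (e) {
            summary.failed.push({ id: record.id, error: e.message });
        }
    }
    return summary;
}
```

Retrying the whole batch is then safe: records that succeeded earlier are reported under `skipped` and only the failures are attempted again.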

### O6. SuiteX Retry Policy

**SuiteX orchestrator retry policy** (for NetSuite writes):

| Error Class | Max Retries | Backoff | Queue |
|-------------|-------------|---------|-------|
| Transient | 10 | Exponential (1s to 5m) | Retry queue |
| 429 Rate Limit | 5 | 60s, 5m, 15m, 1h, 4h | Throttle queue |
| Governance Exceeded | 1 | Wait until next day | Governance queue |
| Validation Error | 0 | None | Error queue (human review) |
| Auth Error | 3 | 5m (allow token refresh) | Auth error queue |
| Conflict | 0 | None | Conflict queue (human review) |

**DLQ threshold**:
- After max retries, move to dead-letter queue
- Alert on-call
- Human review required
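
One way to encode the table above is a policy map plus a small decision function. The class names, queue identifiers, and the `park` / `dead-letter` action labels are hypothetical; a `null` delay for the governance class means "hold until the next governance window" rather than a fixed wait.

```javascript
const MINUTE = 60 * 1000;
const RETRY_POLICY = {
    transient:  { maxRetries: 10, queue: 'retry',
                  delayMs: (n) => Math.min(1000 * 2 ** n, 5 * MINUTE) },
    rateLimit:  { maxRetries: 5, queue: 'throttle',
                  delayMs: (n) => [1, 5, 15, 60, 240][n] * MINUTE },
    governance: { maxRetries: 1, queue: 'governance', delayMs: () => null },
    validation: { maxRetries: 0, queue: 'error', delayMs: () => null },
    auth:       { maxRetries: 3, queue: 'auth-error', delayMs: () => 5 * MINUTE },
    conflict:   { maxRetries: 0, queue: 'conflict', delayMs: () => null },
};

// retriesSoFar counts retries already made (0 on the first failure).
function nextAction(errorClass, retriesSoFar) {
    const policy = RETRY_POLICY[errorClass];
    if (!policy) throw new Error(`Unknown error class: ${errorClass}`);
    if (policy.maxRetries === 0) {
        return { action: 'park', queue: policy.queue }; // human review, no retry
    }
    if (retriesSoFar >= policy.maxRetries) {
        return { action: 'dead-letter', queue: 'dlq' };
    }
    return { action: 'retry', queue: policy.queue, delayMs: policy.delayMs(retriesSoFar) };
}
```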

### O7. Concurrency and Retry Amplification

**Problem**: High retry rate can amplify load:
- 10 workers, each retrying failed requests
- Exponential backoff not coordinated
- Thundering herd on NetSuite after backoff expires

**Solution**:
- Centralized retry scheduler in SuiteX
- Exponential backoff with jitter
- Circuit breaker: If error rate > 50%, pause all writes for 5 minutes
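
A minimal sketch of the last two points follows. The "full jitter" variant and the `minSamples` guard are assumptions added for illustration, and the breaker counts outcomes cumulatively since its last trip rather than over a sliding window.

```javascript
// Full jitter: a random delay in [0, min(cap, base * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffWithJitter(attempt, baseMs = 1000, capMs = 300000, random = Math.random) {
    const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
    return Math.floor(random() * ceiling);
}

class CircuitBreaker {
    constructor({ threshold = 0.5, pauseMs = 5 * 60 * 1000, minSamples = 20 } = {}) {
        this.threshold = threshold;   // error rate that trips the breaker
        this.pauseMs = pauseMs;       // how long writes stay paused
        this.minSamples = minSamples; // avoid tripping on a handful of calls
        this.total = 0;
        this.errors = 0;
        this.openUntil = 0;
    }

    // Record one request outcome; trip when the error rate exceeds the threshold.
    record(success, now = Date.now()) {
        this.total += 1;
        if (!success) this.errors += 1;
        if (this.total >= this.minSamples && this.errors / this.total > this.threshold) {
            this.openUntil = now + this.pauseMs;
            this.total = 0;
            this.errors = 0;
        }
    }

    allowsWrites(now = Date.now()) {
        return now >= this.openUntil;
    }
}
```

Spreading retries randomly across the backoff interval is what prevents workers that failed together from retrying together.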

---

## Appendix P — NetSuite-Specific Risks & SuiteX Requirements Gap Analysis

### P1. Identified Risks from NetSuite Constraints

| Risk | NetSuite Constraint | Impact on Sync | Mitigation |
|------|---------------------|----------------|------------|
| **UE Event Loss** | UE scripts can fail silently (afterSubmit errors logged but not surfaced) | SuiteX never receives event | Polling as authoritative fallback |
| **Event Duplication** | UE + Polling may emit duplicate events | Duplicate writes to SuiteX | Idempotency keys + deduplication |
| **Out-of-Order Events** | No ordering guarantee for UE events | Incorrect final state | Pub/Sub ordering keys + merge coalescing |
| **Governance Throttling** | Hard limits on API calls, unpredictable | Sync latency spikes | Adaptive throttling + queue management |
| **Field Type Inconsistencies** | Checkbox, multiselect, date formats vary | Comparison failures | Normalization layer (Appendix K) |
| **Partial Batch Failures** | RESTlet may timeout mid-batch | Partial apply, unclear state | Individual writes for critical records |
| **Deleted Record Blind Spots** | Deleted records no longer appear in `lastmodifieddate` polling, and UE delete events can be lost; `getDeleted` is SOAP-only | SuiteX doesn't know record deleted | Nightly reconciliation + `getDeleted` polling |
| **Concurrent Write Conflicts** | No built-in optimistic concurrency | Last write wins, data loss | Three-way merge + conflict detection |
| **Sublist Line Ordering** | Line order may change without field change | False positives for change detection | Normalize line order or use line keys |
| **Custom Field Permissions** | Script may not have access to all fields user can see | Incomplete event payloads | Validate field access in UE; log missing fields |

### P2. Missing Requirements from SuiteX (V1 → V2 Gap Analysis)

Based on NetSuite constraints, the following requirements were implicit but not explicitly stated in V1:

#### P2.1. Idempotency Key Schema

**Gap**: V1 mentions idempotency but doesn't define key structure.

**Requirement**:
- Idempotency keys must be deterministic and globally unique
- Format: `<source>:<recordType>:<recordId>:<eventTimestamp>:<operation>`
- Example: `netsuite-ue:customer:12345:2025-11-14T10:30:15Z:update`
- SuiteX must persist idempotency keys in `idempotency_keys` table for 30 days minimum
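
A key builder following the format above might look like this. The function name and the choice to normalize timestamps to UTC at second precision are assumptions made so that the same event always yields the same key.

```javascript
// Builds <source>:<recordType>:<recordId>:<eventTimestamp>:<operation>
function buildIdempotencyKey({ source, recordType, recordId, eventTimestamp, operation }) {
    const ts = new Date(eventTimestamp);
    if (Number.isNaN(ts.getTime())) {
        throw new Error(`Invalid eventTimestamp: ${eventTimestamp}`);
    }
    // Normalize to UTC and drop milliseconds to match the example format.
    const iso = ts.toISOString().replace(/\.\d{3}Z$/, 'Z');
    return [source, recordType.toLowerCase(), String(recordId), iso, operation].join(':');
}
```

Normalizing the timestamp means a UE event and a polled event for the same change map to one key even if they report the time with different offsets.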

#### P2.2. Field-Level Metadata Schema

**Gap**: V1 assumes field values are directly comparable; NetSuite field types require normalization.

**Requirement**:
- SuiteX must maintain field metadata table:
  ```sql
  CREATE TABLE field_metadata (
      record_type text,
      field_id text,
      field_type text,  -- NetSuite field type
      is_synced boolean,
      is_readonly boolean,
      normalization_rule text,  -- JSON config
      conflict_policy text,  -- 'netsuite-wins', 'suitex-wins', 'last-write-wins', 'manual'
      PRIMARY KEY (record_type, field_id)
  );
  ```
- SuiteX merge logic must apply `normalization_rule` before comparison
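
As an illustration of normalize-then-compare, the sketch below hardcodes a few rules by `field_type`. In the real system the rules would be read from `normalization_rule`; the type names and the specific rules here are assumptions.

```javascript
// Returns a canonical form of a field value for comparison purposes.
function normalizeValue(fieldType, value) {
    const isEmptyArray = Array.isArray(value) && value.length === 0;
    if (value === null || value === undefined || value === '' || isEmptyArray) {
        return null; // treat all "empty" representations alike
    }
    switch (fieldType) {
        case 'checkbox':
            return value === true || value === 'T' || value === 'true';
        case 'multiselect':
            return [...value].map(String).sort(); // selection order is not meaningful
        case 'date':
        case 'datetime':
            return new Date(value).toISOString(); // ISO 8601 in UTC
        default:
            return value;
    }
}

function fieldsEqual(fieldType, a, b) {
    return JSON.stringify(normalizeValue(fieldType, a)) ===
        JSON.stringify(normalizeValue(fieldType, b));
}
```

For example, `fieldsEqual('checkbox', 'T', true)` and `fieldsEqual('multiselect', ['3', '1'], [1, 3])` both report equality, so neither would be flagged as a change.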

#### P2.3. Polling Watermark Persistence

**Gap**: V1 mentions cursor persistence but not schema or recovery semantics.

**Requirement**:
- Watermark table must be transactional with polling logic:
  ```sql
  CREATE TABLE sync_watermark (
      record_type text PRIMARY KEY,
      last_polled_timestamp timestamptz,
      last_polled_internal_id text,
      last_successful_poll timestamptz,
      consecutive_failures int DEFAULT 0
  );
  ```
- If polling fails, **do not update watermark**; retry from last successful position
- If `consecutive_failures > 5`, alert and increase polling interval
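
The recovery semantics can be expressed as a pure function over a `sync_watermark` row. This is a sketch: the `result` shape (`ok`, `lastRow`) is hypothetical, and persisting the returned row in the same transaction as the poll output is left to the caller.

```javascript
const FAILURE_ALERT_THRESHOLD = 5;

// Returns the next watermark row plus whether to alert and slow down polling.
function applyPollResult(watermark, result, now = new Date().toISOString()) {
    if (!result.ok) {
        // Failure: keep the cursor where it was, only count the failure.
        const failures = watermark.consecutive_failures + 1;
        return {
            watermark: { ...watermark, consecutive_failures: failures },
            alert: failures > FAILURE_ALERT_THRESHOLD,
        };
    }
    const last = result.lastRow; // undefined when the poll returned no rows
    return {
        watermark: {
            ...watermark,
            last_polled_timestamp: last ? last.lastModified : watermark.last_polled_timestamp,
            last_polled_internal_id: last ? last.internalId : watermark.last_polled_internal_id,
            last_successful_poll: now,
            consecutive_failures: 0,
        },
        alert: false,
    };
}
```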

#### P2.4. UE Script Deployment Configuration

**Gap**: V1 assumes UE events will be emitted; doesn't address deployment constraints.

**Requirement**:
- SuiteX must provide deployment checklist for NetSuite admins:
  - [ ] UE script deployed for each record type
  - [ ] All execution contexts enabled (except SuiteX's own sync script ID)
  - [ ] RESTlet endpoint deployed and accessible
  - [ ] Custom record for event queue created (if not using direct Pub/Sub)
  - [ ] Test event emission in sandbox before production
- SuiteX monitoring must alert if UE events drop to zero for > 15 minutes (assuming normal activity)

#### P2.5. Sublist Change Detection Strategy

**Gap**: V1 doesn't specify how to handle sublist/line-level changes.

**Requirement**:
- SuiteX must support two modes (configurable per record type):
  1. **Full sublist snapshot**: Emit entire sublist on any line change
  2. **Line-level deltas**: Emit `{operation: 'add'|'update'|'delete', line: {...}}`
- Default: Full sublist snapshot (simpler, governance-acceptable for most records)
- Line-level deltas: For high-line-count records (e.g., >100 lines per transaction)
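
The line-level delta mode could be computed along these lines, assuming every line carries a stable identifier (called `lineKey` here, a hypothetical name). Matching by key rather than by position keeps a pure reordering from being reported as a change.

```javascript
// Produces { operation, line } deltas between two sublist snapshots.
// Lines are compared via JSON.stringify, so both snapshots should
// serialize their fields in the same property order.
function diffSublist(previousLines, currentLines) {
    const prev = new Map(previousLines.map((l) => [l.lineKey, l]));
    const curr = new Map(currentLines.map((l) => [l.lineKey, l]));
    const deltas = [];
    for (const [key, line] of curr) {
        if (!prev.has(key)) {
            deltas.push({ operation: 'add', line });
        } else if (JSON.stringify(prev.get(key)) !== JSON.stringify(line)) {
            deltas.push({ operation: 'update', line });
        }
    }
    for (const [key, line] of prev) {
        if (!curr.has(key)) deltas.push({ operation: 'delete', line });
    }
    return deltas;
}
```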

#### P2.6. Deleted Record Reconciliation

**Gap**: V1 mentions `getDeleted` but not frequency or fallback.

**Requirement**:
- SuiteX must poll `getDeleted` SOAP API every 15 minutes
- Nightly reconciliation job (see V1) must also detect orphaned records
- Deleted record events must be processed synchronously (do not queue) to avoid foreign key violations

#### P2.7. Governance Budget Monitoring

**Gap**: V1 mentions adaptive throttling but not how to detect budget exhaustion.

**Requirement**:
- SuiteX cannot query NetSuite governance usage directly (no API)
- Must infer from error rates:
  - Track `SSS_REQUEST_LIMIT_EXCEEDED` and `429` rates per account
  - If rate > 10% of requests, reduce concurrency by 50%
  - If rate > 50%, pause writes for 5 minutes (circuit breaker)
- SuiteX dashboard must show:
  - Estimated API calls/hour per account
  - 429 rate (last hour, last day)
  - Current concurrency limit per account
  - Backlog size and estimated time to clear
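
The inference rules above reduce to a small decision function. This sketch assumes the caller supplies request and throttle counts for a recent window per account; the return shape is hypothetical.

```javascript
const PAUSE_MS = 5 * 60 * 1000;

// throttledRequests counts SSS_REQUEST_LIMIT_EXCEEDED and 429 responses.
function adjustConcurrency({ totalRequests, throttledRequests, currentConcurrency }) {
    const rate = totalRequests === 0 ? 0 : throttledRequests / totalRequests;
    if (rate > 0.5) {
        // Circuit breaker: stop writing for five minutes.
        return { action: 'pause', pauseMs: PAUSE_MS, concurrency: currentConcurrency, rate };
    }
    if (rate > 0.1) {
        // Halve concurrency, but never drop below one worker.
        const reduced = Math.max(1, Math.floor(currentConcurrency / 2));
        return { action: 'reduce', pauseMs: 0, concurrency: reduced, rate };
    }
    return { action: 'keep', pauseMs: 0, concurrency: currentConcurrency, rate };
}
```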

#### P2.8. Field-Level Conflict Policy Configuration

**Gap**: V1 mentions per-field conflict policies but not how to configure.

**Requirement**:
- SuiteX admin UI must allow per-field conflict policy:
  - **NetSuite Wins**: Always accept NetSuite value (e.g., `lastmodifieddate`, system fields)
  - **SuiteX Wins**: Always accept SuiteX value (e.g., SuiteX-managed custom fields)
  - **Last Write Wins**: Use timestamp (risky; document)
  - **Manual Review**: Queue conflict for human (default for critical fields)
- Policies stored in `field_metadata` table (see P2.2)
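
Applying a stored policy to one conflicting field might look like the following. The policy identifiers, the `{ value, modifiedAt }` input shape, and the tie-break in favor of NetSuite for equal timestamps are all assumptions of this sketch.

```javascript
// netsuite / suitex: { value, modifiedAt } for the conflicting field.
function resolveField(policy, netsuite, suitex) {
    switch (policy) {
        case 'netsuite-wins':
            return { resolved: true, value: netsuite.value };
        case 'suitex-wins':
            return { resolved: true, value: suitex.value };
        case 'last-write-wins': {
            // Risky with second-precision timestamps; ties go to NetSuite here.
            const netsuiteIsNewer =
                Date.parse(netsuite.modifiedAt) >= Date.parse(suitex.modifiedAt);
            return { resolved: true, value: netsuiteIsNewer ? netsuite.value : suitex.value };
        }
        case 'manual':
        default:
            // Unknown policies fall back to human review.
            return { resolved: false, queue: 'conflict' };
    }
}
```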

#### P2.9. Event Schema Versioning

**Gap**: V1 defines event schema but not versioning strategy.

**Requirement**:
- All events must include `schemaVersion` field
- SuiteX merge service must support multiple schema versions simultaneously (for migration)
- When schema changes:
  - Increment version (e.g., `v1` → `v2`)
  - Deploy consumer supporting both versions
  - Migrate producers gradually
  - Deprecate old version after 30 days
- Example:
  ```json
  {
    "schemaVersion": "v2",
    "recordType": "customer",
    ...
  }
  ```

#### P2.10. NetSuite Sandbox vs Production Isolation

**Gap**: V1 doesn't address multi-environment strategy.

**Requirement**:
- SuiteX must maintain separate event streams per NetSuite environment:
  - `events.raw.production`
  - `events.raw.sandbox`
- Prevent cross-environment pollution (sandbox events must not update production SuiteX)
- SuiteX config must map NetSuite account ID to environment:
  ```json
  {
    "accounts": {
      "1234567": {"environment": "production", "tier": "premium"},
      "1234567_SB1": {"environment": "sandbox", "tier": "standard"}
    }
  }
  ```
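
A sketch of how a producer could pick the topic from that mapping; `topicForAccount` is a hypothetical helper. Rejecting unknown account IDs outright, rather than defaulting to an environment, is the design choice that prevents cross-environment pollution.

```javascript
const config = {
    accounts: {
        '1234567': { environment: 'production', tier: 'premium' },
        '1234567_SB1': { environment: 'sandbox', tier: 'standard' },
    },
};

// Resolves the raw event topic for a NetSuite account ID.
function topicForAccount(cfg, accountId) {
    if (!Object.prototype.hasOwnProperty.call(cfg.accounts, accountId)) {
        throw new Error(`Unknown NetSuite account: ${accountId}`);
    }
    return `events.raw.${cfg.accounts[accountId].environment}`;
}
```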

---

## Appendix Q — NetSuite-SuiteX Integration Checklist

### Q1. Pre-Implementation Checklist

#### NetSuite Environment
- [ ] NetSuite account tier confirmed (Standard/Premium/Premium Plus)
- [ ] API governance limits documented for account
- [ ] Custom fields created for sync metadata:
  - [ ] `custbody_suitex_update_id` (Body-level)
  - [ ] `custcol_suitex_update_id` (Column-level)
  - [ ] `custentity_suitex_update_id` (Entity-level)
  - [ ] `custitem_suitex_update_id` (Item-level)
  - [ ] `custevent_suitex_update_id` (CRM-level)
- [ ] Custom record type for event queue created (if needed)
- [ ] RESTlet script uploaded and deployed
- [ ] UE scripts uploaded and deployed for target record types
- [ ] SOAP/REST API credentials provisioned (Token-based auth)
- [ ] Test account created with restricted permissions for testing

#### SuiteX Environment
- [ ] Pub/Sub topics created (`events.raw`, `events.merged`, `events.error`, `events.dlq`)
- [ ] Cloud SQL tables created:
  - [ ] `events`
  - [ ] `current_state`
  - [ ] `conflicts`
  - [ ] `event_audit_log`
  - [ ] `sync_watermark`
  - [ ] `field_metadata`
  - [ ] `idempotency_keys`
  - [ ] `record_lock`
- [ ] SuiteX merge service deployed
- [ ] NetSuite writer consumer deployed
- [ ] SuiteX writer consumer deployed
- [ ] Polling service deployed (Scheduled Script or SuiteX cron)
- [ ] Conflict UI deployed
- [ ] Monitoring dashboards configured (Grafana/GCP Monitoring)
- [ ] Alerts configured for key metrics

### Q2. Testing Checklist

#### Unit Tests
- [ ] Event normalization (all field types)
- [ ] Three-way merge logic (no conflict, conflict, auto-resolve)
- [ ] Idempotency key generation and lookup
- [ ] Watermark cursor advancement
- [ ] Error classification (transient vs permanent)

#### Integration Tests
- [ ] UE event emission (create, update, delete)
- [ ] Polling discovery and detail fetch
- [ ] SuiteX → NetSuite write (via RESTlet)
- [ ] NetSuite → SuiteX write (via merge service)
- [ ] Conflict queue population
- [ ] Human resolution flow

#### End-to-End Tests
- [ ] Concurrent update from both systems (conflict detected and resolved)
- [ ] UE event + Polling event deduplication
- [ ] Deleted record detection (`getDeleted` + reconciliation)
- [ ] Governance throttling (simulate 429 responses)
- [ ] Retry behavior (transient error recovery)
- [ ] Backlog processing (queue 1000 events, verify drain)

### Q3. Deployment Checklist

#### Canary Deployment
- [ ] Select low-risk account/record type (e.g., 1 custom record with <100 records)
- [ ] Enable event emission for canary
- [ ] Monitor for 48 hours:
  - [ ] Event rate matches expected
  - [ ] No duplicate writes
  - [ ] Sync latency < 60s (p95)
  - [ ] No 429 errors
  - [ ] Conflict rate < 1%
- [ ] Compare `current_state` projection with NetSuite saved search (100% match)

#### Gradual Rollout
- [ ] Week 1: 1 account, 1 record type
- [ ] Week 2: 1 account, 3 record types (Customer, Item, Sales Order)
- [ ] Week 3: 3 accounts, 3 record types
- [ ] Week 4: 10 accounts, all record types
- [ ] Week 5+: All accounts

#### Rollback Plan
- [ ] Disable UE scripts (stop new events)
- [ ] Pause SuiteX merge service
- [ ] Preserve event queue for replay
- [ ] Document rollback steps in runbook

---

*End of design document.*

## Change Log – V1

- **Version tag**: Updated the document metadata to `Version: 1` while preserving the original title and goals.
- **High-level architecture**: Expanded the SuiteX-specific change capture and snapshot fallback bullets, and clarified that SuiteX uses Pub/Sub plus Cloud SQL for durable event storage while keeping NetSuite producers and constraints unchanged.
- **SuiteX outbound & idempotency**: Elaborated the `SuiteX outbound` section to describe a transactional outbox pattern, SuiteX payload compatibility, and concrete idempotency primitives (idempotency key tracking, audit logs, and projection-aware writes) without altering any NetSuite behavior.
- **Workflow engine & sync pipeline**: Added a new section `SuiteX workflow engine & sync pipeline (implementation view)` detailing SuiteX's event ingestion, normalization, merge, conflict handling, and bidirectional sync pipelines, explicitly mapping them onto SuiteX's multi-tenant Cloud SQL, Redis, Pub/Sub, and worker processes.
- **Conflict handling & migration narrative**: Clarified SuiteX's responsibility for conflict UI, record locking, and coexistence with the existing snapshot pipeline during migration, ensuring the overall pipeline remains technically implementable inside SuiteX and fully compatible with all existing NetSuite-specific sections and constraints.

---

## Change Log – V2

**Version:** 2
**Date:** 2025-11-14
**Objective:** Add comprehensive NetSuite-specific implementation details while preserving all SuiteX content from V1.

### Summary of V2 Changes

V2 builds upon the SuiteX-focused architecture in V1 by adding detailed NetSuite implementation specifications, constraints, and operational guidance. All V1 SuiteX content remains unchanged. The additions provide the technical depth required to implement the NetSuite-side components (UE scripts, RESTlets, polling services) and to ensure SuiteX properly handles NetSuite's unique constraints.

### Major Additions

#### 1. **Appendix J — NetSuite User Event (UE) Script Implementation Details** (NEW)

**What was added:**
- Complete UE trigger semantics (beforeLoad, beforeSubmit, afterSubmit) with sync recommendations
- Comprehensive table of UE context types (CREATE, EDIT, DELETE, XEDIT, APPROVE, etc.) with sync relevance
- Production-ready code examples for:
  - Accessing changed fields in beforeSubmit
  - Emitting events in afterSubmit
  - Error handling and fallback to polling
- UE deployment constraints and execution context filtering guidance
- Detailed governance limits table with critical HTTP request constraints
- Idempotency and race condition mitigation strategies using custom flag fields

**Why this matters:**
- V1 mentioned UE events but didn't specify how to capture field deltas or handle NetSuite's governance restrictions
- UE scripts have only 10 HTTP requests available; this appendix provides three implementation options
- Explains how to prevent infinite loops when SuiteX writes back to NetSuite

**SuiteX requirements identified:**
- SuiteX must set `custbody_suitex_update_id` flag on writes to prevent UE re-emission
- Polling must be authoritative fallback when UE fails silently

#### 2. **Appendix K — NetSuite Field-Level Considerations** (NEW)

**What was added:**
- Comprehensive field type taxonomy table covering 20+ NetSuite field types
- Detailed handling for complex field types:
  - **Multiselect**: Order normalization, empty value handling
  - **Checkbox**: Inconsistent representation across APIs (true/false vs T/F)
  - **Date/Datetime**: Timezone normalization to ISO8601 UTC
  - **Record references**: Circular reference handling
  - **Sublist/line-level fields**: Change detection strategies
- Code examples for normalization functions
- Body vs line field distinctions and custom field namespaces
- Formula/computed field exclusion from sync
- Field-level access control considerations

**Why this matters:**
- V1 assumed field values are directly comparable; NetSuite field types require normalization
- Without normalization, false positives (e.g., multiselect order change) and false negatives (e.g., checkbox T/F vs true/false) will cause sync failures
- Explains why SuiteX needs a `field_metadata` table with normalization rules

**SuiteX requirements identified:**
- SuiteX must maintain `field_metadata` table with `field_type` and `normalization_rule` columns
- SuiteX merge logic must apply normalization before comparison
- Sublist sync strategy must be configurable (full snapshot vs line-level deltas)

#### 3. **Appendix L — NetSuite Governance Limits & Operational Constraints** (NEW)

**What was added:**
- Detailed API request limits table by NetSuite tier (Standard/Premium/Premium Plus)
- Concurrent request limits: 1-10 concurrent, 1k-10k requests/hour depending on tier
- Script execution time limits table for all script types
- Search result pagination constraints (1,000 results default, cursor-based pagination required)
- RESTlet-specific constraints: 10 MB request/response size, 5-minute timeout
- SOAP vs RESTlet vs REST API comparison with recommendations
- Transaction throttling indicators and circuit breaker guidance
- CSV import governance notes

**Why this matters:**
- V1 mentioned governance abstractly; this provides concrete numbers for capacity planning
- SuiteX must keep sustained usage below 50% of the concurrent request limit to avoid 429 storms
- Polling implementation must use Map/Reduce or multiple scheduled script deployments for high-frequency sync

**SuiteX requirements identified:**
- SuiteX must track and enforce per-account concurrency limits based on NetSuite tier
- SuiteX must implement circuit breaker when error rate > 50%
- Use SOAP `getList` for polling (more efficient than individual record loads)

#### 4. **Appendix M — NetSuite Polling Implementation Details** (NEW)

**What was added:**
- Two-phase polling strategy (discovery + detail fetch) with code examples
- Production-ready cursor-based pagination logic using `lastmodifieddate` + `internalid`
- Watermark persistence schema and cursor advancement logic
- Recommended polling cadence table by record type (2-15 minute intervals)
- Dynamic polling adjustment based on result volume
- Deleted record detection using SOAP `getDeleted` API with XML example
- Polling deduplication logic using event fingerprints
- Governance budget estimation (1,100 units per 100 records)

**Why this matters:**
- V1 mentioned polling but didn't specify how to implement cursor-based pagination to avoid missing records
- NetSuite's 1,000-result search limit requires careful pagination logic
- Deleted records don't update `lastmodifieddate`, so special handling is required

**SuiteX requirements identified:**
- SuiteX must persist `sync_watermark` with `last_polled_timestamp` and `last_polled_internal_id`
- SuiteX must poll `getDeleted` SOAP API every 15 minutes
- SuiteX must track event fingerprints in `event_audit_log` to deduplicate UE + polling events
- If polling fails, watermark must NOT advance (prevent data loss)

#### 5. **Appendix N — NetSuite Event Ordering Constraints & Guarantees** (NEW)

**What was added:**
- What NetSuite guarantees (atomic transactions, second-precision timestamps)
- What NetSuite does NOT guarantee (UE script ordering, sub-second resolution, cross-record ordering)
- Detailed scenario analysis:
  - UE event ordering challenges (two edits within 1 second)
  - Polling ordering guarantees (sorted by lastmodifieddate + internalid)
  - Cross-record ordering problems (Sales Order created before Customer sync)
  - SuiteX → NetSuite write ordering (concurrent writes to same record)
- Mitigation strategies for each scenario
- Ordering guarantees summary table by layer

**Why this matters:**
- V1 mentioned ordering keys but didn't explain NetSuite's lack of ordering guarantees
- Without Pub/Sub ordering keys, events may arrive out of order and corrupt final state
- Cross-record dependencies (foreign keys) can cause sync failures if not handled

**SuiteX requirements identified:**
- SuiteX must use Pub/Sub ordering keys (`recordType:recordId`) for per-record ordering
- SuiteX merge service must coalesce events for 5-10 seconds before applying
- SuiteX must implement record-level locks to serialize writes per record
- SuiteX must implement dependency graph and retry queue for foreign key failures

#### 6. **Appendix O — NetSuite Retry Behavior & Limitations** (NEW)

**What was added:**
- RESTlet retry semantics: not idempotent by default, code example for idempotency key checking
- SOAP retry behavior with exponential backoff code example
- Comprehensive error classification:
  - **Transient errors**: 429, 500, 503, timeouts (retry)
  - **Permanent errors**: 400, 401, 403, 404, INSUFFICIENT_PERMISSION (do not retry)
  - **Ambiguous errors**: 504, SSS_TIME_LIMIT_EXCEEDED (retry with caution)
- Governance-induced failures: SSS_USAGE_LIMIT_EXCEEDED (wait until next day)
- Partial success handling strategies (idempotent records, checkpoint-based, individual updates)
- SuiteX retry policy table by error class with max retries, backoff schedules, and queue assignments
- Retry amplification problem and centralized retry scheduler solution

**Why this matters:**
- V1 mentioned retries abstractly; this provides concrete retry policies and backoff schedules
- NetSuite RESTlets are not idempotent, so duplicate retries can create duplicate records
- Governance exhaustion requires special handling (queue for next day)

**SuiteX requirements identified:**
- SuiteX RESTlets must implement idempotency key storage in NetSuite custom records
- SuiteX must classify errors into transient/permanent/ambiguous categories
- SuiteX must implement separate queues: retry, throttle, governance, error, auth error, conflict
- SuiteX must implement centralized retry scheduler with jitter to prevent thundering herd

#### 7. **Appendix P — NetSuite-Specific Risks & SuiteX Requirements Gap Analysis** (NEW)

**What was added:**
- Comprehensive risk table with 10 identified risks:
  - UE event loss, event duplication, out-of-order events
  - Governance throttling, field type inconsistencies, partial batch failures
  - Deleted record blind spots, concurrent write conflicts, sublist line ordering
  - Custom field permissions
- For each risk: NetSuite constraint, impact on sync, mitigation strategy
- **Critical section**: Missing Requirements from SuiteX (V1 → V2 Gap Analysis)
  - **P2.1**: Idempotency key schema (format specification)
  - **P2.2**: Field-level metadata schema (SQL DDL for `field_metadata` table)
  - **P2.3**: Polling watermark persistence (SQL DDL with failure recovery semantics)
  - **P2.4**: UE script deployment configuration checklist
  - **P2.5**: Sublist change detection strategy (two modes)
  - **P2.6**: Deleted record reconciliation (15-minute `getDeleted` polling)
  - **P2.7**: Governance budget monitoring (dashboard requirements)
  - **P2.8**: Field-level conflict policy configuration (four policy types)
  - **P2.9**: Event schema versioning (migration strategy)
  - **P2.10**: NetSuite sandbox vs production isolation (separate event streams)

**Why this matters:**
- This is the most critical addition: it explicitly identifies what was implicit in V1
- Provides concrete SuiteX implementation requirements derived from NetSuite constraints
- Each gap includes SQL schemas, configuration examples, or policy specifications
- Prevents mismatches between SuiteX implementation and NetSuite capabilities

**SuiteX requirements identified:**
- 10 specific gaps with concrete requirements, schemas, and configurations
- These are actionable requirements for SuiteX implementation teams

#### 8. **Appendix Q — NetSuite-SuiteX Integration Checklist** (NEW)

**What was added:**
- **Q1 Pre-Implementation Checklist**: Two sections
  - NetSuite Environment: 8 top-level checkboxes plus 5 custom-field sub-items (account tier, custom fields, scripts, credentials)
  - SuiteX Environment: 9 top-level checkboxes plus 8 table sub-items (Pub/Sub topics, Cloud SQL tables, services, UI, monitoring)
- **Q2 Testing Checklist**: Three categories
  - Unit tests: 5 items (normalization, merge, idempotency, watermark, error classification)
  - Integration tests: 6 items (UE events, polling, bidirectional writes, conflict queue)
  - End-to-end tests: 6 items (concurrency, deduplication, deletes, throttling, retries, backlog)
- **Q3 Deployment Checklist**:
  - Canary deployment: 6 validation criteria (event rate, latency, errors, conflicts, data parity)
  - Gradual rollout: 5-week phased plan
  - Rollback plan: 4 steps

**Why this matters:**
- Provides operational readiness framework for implementation teams
- Ensures both NetSuite and SuiteX environments are properly configured before go-live
- Defines acceptance criteria for each deployment stage

### What Was Preserved from V1

- **All SuiteX architecture sections**: Event model, projections, orchestrator, conflict queue, monitoring, operational controls
- **All SuiteX workflow engine details**: Transactional outbox, multi-tenant isolation, normalized events, merge logic
- **All diagrams and schemas**: Mermaid sequence diagrams (B1-B4), ERDs (Appendix D), canonical data models (C1-C3)
- **All pseudocode**: Merge orchestrator (F1), NetSuite writer (F2), human-intervention flow (F3)
- **All SuiteX infrastructure**: Pub/Sub topics, Cloud SQL tables, conflict UI, record locking
- **All V1 appendices**: A (Durable Event Backbone), B (Mermaid Diagrams), C (Data Models), D (ERD), E (Topic Map), F (Pseudocode), G (Error Taxonomy), H (Governance Policies), I (Config Spec)

### Cross-References Added

V2 additions reference V1 content extensively:
- Appendix J references "Appendix H" (Governance Policies from V1)
- Appendix N references "Appendix A4" (Ordering keys from V1)
- Appendix P references V1's conflict detection, projection version, and merge logic
- Appendix Q integrates V1's event schema, topic structure, and table schemas into checklists

### Validation

**No conflicts introduced:**
- NetSuite constraints inform SuiteX requirements but do not alter SuiteX design
- SuiteX remains responsible for orchestration, conflict resolution, and event backbone
- NetSuite remains responsible for UE scripts, RESTlets, and SOAP/REST API access

**Requirements are bidirectional:**
- SuiteX must accommodate NetSuite's field normalization needs (Appendix K → P2.2)
- NetSuite must implement idempotency keys in RESTlets (Appendix O → SuiteX can safely retry)

**Completeness:**
- V2 addresses all seven original objectives:
  1. ✅ UE trigger semantics (Appendix J)
  2. ✅ Field-level considerations (Appendix K)
  3. ✅ Governance limits (Appendix L)
  4. ✅ REST/SOAP API notes (Appendix L6-L7)
  5. ✅ Polling cadence and risks (Appendix M)
  6. ✅ Event ordering constraints (Appendix N)
  7. ✅ Retry behavior limitations (Appendix O)

**Gap analysis included:**
- Appendix P explicitly identifies 10 missing requirements from V1
- Each gap includes concrete SuiteX implementation requirement

### Document Status

- **Total length**: ~2,165 lines (original 962 lines + 1,203 new lines)
- **New appendices**: 8 (J, K, L, M, N, O, P, Q)
- **V1 appendices preserved**: 9 (A-I)
- **Change logs**: 2 (V1 preserved, V2 added)
- **SuiteX content changes**: 0 (fully preserved)
- **NetSuite content changes**: 100% new technical detail

### Next Steps for Implementation Teams

1. **Review Appendix P (Gap Analysis)** first — identifies 10 concrete SuiteX requirements
2. **Validate field metadata schema (P2.2)** — ensure SuiteX can store NetSuite field types and normalization rules
3. **Implement idempotency key handling (P2.1, O1)** — both SuiteX and NetSuite RESTlets
4. **Deploy UE scripts per Appendix J** — use provided code examples
5. **Configure polling per Appendix M** — use recommended cadences and cursor logic
6. **Execute Appendix Q checklists** — validate environment readiness before go-live

---

**End of Change Log – V2**

