## Import Fixes: Wave Coordination + Parent-First Transaction Lines (Derived Waves)

### Purpose

Document concrete fixes to ensure derived TransactionLine waves reliably initialize and dispatch immediately after parent (e.g., Sales Orders) completion, without being blocked by premature job completion or cache brittleness.

### Background

- Parent-first strategy: `TransactionLine` (`record_type_id = -13`) must process only after parent transactions complete.
- Derived waves are created by a listener after parents reach the completion threshold and then dispatched by the `WaveCoordinator`.
- Production observations: After the final parent wave completes, derived waves/batches sometimes never dispatch; job monitoring appears to shut down.

### Root Cause Analysis

1) Listener race on last parent completion
- In `BatchJobCompletedListener`, derived initialization checks parent completion percentage BEFORE the DB transaction that increments the final parent `completed_batches`. On the last batch, this may evaluate below the threshold (100%), skipping derived creation.
- Immediately afterward, the coordinator may mark the job complete and clear monitoring, preventing a second chance to initialize/dispatch derived waves.

2) Coordinator premature completion and monitoring shutdown
- In `WaveCoordinator->checkAndTriggerNextWave`, when the final main-type wave completes, the coordinator marks the job "completed" and clears `wave_monitoring_active_{jobId}` without verifying whether derived `-13` batches are pending.
- With monitoring disabled, the listener explicitly skips progression checks, so no further dispatch is attempted even if derived batches exist.

3) Monitoring/caching gating blocks progression
- `BatchJobCompletedListener->checkAndTriggerNextWave` early-returns when `wave_monitoring_active_{jobId}` is false. If the coordinator cleared this, derived progression stops.
- The coordinator supports DB-driven discovery of derived waves (dependency_level = -2) and a DB fallback for the `derived_lines_ready_{jobId}` flag, but neither path runs unless the coordinator is invoked again, and nothing invokes it once monitoring is disabled.
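The ordering problem in item 1 can be reproduced in isolation. The following is a minimal sketch, not the listener's actual code: the array is a hypothetical in-memory stand-in for the parent wave counters, and `total_batches` and the percentage formula are assumptions (only `completed_batches` is named above).

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the parent wave counters; the real schema may differ.
$wave = ['total_batches' => 4, 'completed_batches' => 3];
$threshold = 100.0;

// Completion percentage as the gate is assumed to compute it.
$percent = static function (array $w): float {
    return $w['total_batches'] > 0
        ? 100.0 * $w['completed_batches'] / $w['total_batches']
        : 0.0;
};

// Problem order: the gate runs before the final increment is recorded.
$gateBeforeUpdate = $percent($wave) >= $threshold; // sees 3 of 4 batches

// The transaction then records the last parent batch.
$wave['completed_batches']++;

// Post-commit order: the same gate now sees the final batch.
$gateAfterUpdate = $percent($wave) >= $threshold; // sees 4 of 4 batches

var_dump($gateBeforeUpdate); // bool(false): derived creation skipped
var_dump($gateAfterUpdate);  // bool(true): derived creation proceeds
```

The gate itself is unchanged between the two evaluations; only the point at which it reads the counter differs.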

### Conflicting/Blocking Logic (Summary)

- Early derived-init gating in listener (computed using stale completion percent on last parent batch).
- Coordinator marks job completed and clears monitoring before checking for pending derived `-13` batches.
- Listener skips progression checks if monitoring is off.

### Design Changes

#### A. Listener Ordering Fix (Post-Commit Derived Initialization)

- After a parent batch completes:
  - Update wave/batch completion inside a DB transaction, as the listener does today.
  - After the transaction commits, recompute the parent completion percent from the DB.
  - If parent completion ≥ `config('waves.completion_threshold', 100)`, then:
    - Call `progressivelyCreateDerivedLineBatches($jobId, $parentIds)` (with DB fallback for parent IDs when cache is empty).
    - Set `Cache::put("derived_lines_ready_{$jobId}", true, 7200)`.
    - Keep `wave_monitoring_active_{jobId}` true by calling `Cache::put("wave_monitoring_active_{$jobId}", true, 7200)`.
    - Nudge the coordinator via `WaveCoordinator->checkAndTriggerNextWave($jobId)`.

Rationale: Guarantees the last parent completion is visible before gating, so derived waves initialize deterministically on the final parent batch.

Notes:
- Keep progressive creation during parent progress (pending-only derived wave/batch upserts) intact, but do not rely on pre-commit completion checks for final initialization.
- Continue storing chunk ids in `wave_batches.chunk_ids_json` when the column exists; cache remains a fast path.
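The post-commit sequence above can be sketched without the framework. This is an illustration under stated assumptions, not the listener's implementation: `finalizeDerivedAfterCommit` is a hypothetical name, and the collaborators are passed in as callables that stand in for the DB read, `progressivelyCreateDerivedLineBatches`, `Cache::put`, and `WaveCoordinator->checkAndTriggerNextWave`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the step that runs after the completion
 * transaction has committed. Returns true when derived init ran.
 */
function finalizeDerivedAfterCommit(
    int $jobId,
    callable $parentCompletionPercent, // fn(int): float, read from the DB post-commit
    callable $createDerivedBatches,    // stand-in for progressivelyCreateDerivedLineBatches()
    callable $cachePut,                // fn(string $key, $value, int $ttl): void
    callable $nudgeCoordinator,        // stand-in for checkAndTriggerNextWave()
    float $threshold = 100.0
): bool {
    if ($parentCompletionPercent($jobId) < $threshold) {
        return false; // parents not finished; progressive creation continues elsewhere
    }
    $createDerivedBatches($jobId);
    $cachePut("derived_lines_ready_{$jobId}", true, 7200);
    $cachePut("wave_monitoring_active_{$jobId}", true, 7200);
    $nudgeCoordinator($jobId);
    return true;
}

// Usage with in-memory fakes that record the order of side effects.
$log = [];
$create = function (int $id) use (&$log): void { $log[] = 'create'; };
$put = function (string $key, $value, int $ttl) use (&$log): void { $log[] = "put:{$key}"; };
$nudge = function (int $id) use (&$log): void { $log[] = 'nudge'; };

$ranAtFull = finalizeDerivedAfterCommit(42, fn (int $id): float => 100.0, $create, $put, $nudge);
$ranAtPartial = finalizeDerivedAfterCommit(42, fn (int $id): float => 75.0, $create, $put, $nudge);

var_dump($ranAtFull);    // bool(true)
var_dump($ranAtPartial); // bool(false)
print_r($log);
```

Passing the collaborators in keeps the ordering explicit: both flags are set before the coordinator is nudged, so the nudge cannot be skipped by the monitoring gate.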

#### B. Coordinator Completion Guard (Do Not Finish If Derived Pending)

- Before marking the job as completed and clearing `wave_monitoring_active_{jobId}`, the coordinator must first check for any pending/active derived `-13` batches.

Implementation outline in `WaveCoordinator->checkAndTriggerNextWave($jobId)`:
- When main-type waves finish (dependency_level = -1) and the coordinator is about to mark the job complete:
  - Query tenant DB for `wave_batches` rows where `job_id = $jobId`, `record_type_id = -13`, `status IN ('pending','dispatched','processing')`.
  - If any exist:
    - Do NOT mark complete and do NOT clear `wave_monitoring_active_{jobId}`.
    - Attempt the derived progression path (the code already finds `nextDerivedWave` by joining `wave_coordination` with `dependency_level = -2`).
    - Return a status such as `derived_pending` or `derived_wave_triggered`.
  - Else (no derived present): proceed to mark job complete and clear monitoring.

Also apply the same guard to the branch that completes when "no main types exist" so monitoring is not cleared if derived exist.

Rationale: Prevents premature job completion that shuts down progression before derived waves are dispatched.
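A minimal sketch of the guard follows, assuming `wave_batches` rows are available as plain arrays. The function names and the `job_completed` status are hypothetical; `derived_pending`, the `-13` record type, and the three open statuses come from the outline above. In the application the loop would presumably be an existence query against the tenant DB.

```php
<?php
declare(strict_types=1);

const DERIVED_RECORD_TYPE = -13;
const OPEN_STATUSES = ['pending', 'dispatched', 'processing'];

/** True if the job still has derived batches that are not finished. */
function hasOpenDerivedBatches(array $waveBatches, int $jobId): bool
{
    foreach ($waveBatches as $row) {
        if ($row['job_id'] === $jobId
            && $row['record_type_id'] === DERIVED_RECORD_TYPE
            && in_array($row['status'], OPEN_STATUSES, true)
        ) {
            return true;
        }
    }
    return false;
}

/** Decision taken when the final main-type wave finishes. */
function decideOnMainWavesFinished(array $waveBatches, int $jobId): string
{
    // Guard: never complete the job (or clear monitoring) while derived work is open.
    return hasOpenDerivedBatches($waveBatches, $jobId) ? 'derived_pending' : 'job_completed';
}

$rows = [
    ['job_id' => 7, 'record_type_id' => -13, 'status' => 'pending'],
    ['job_id' => 7, 'record_type_id' => 5, 'status' => 'completed'],
];

$whileDerivedOpen = decideOnMainWavesFinished($rows, 7);
$rows[0]['status'] = 'completed';
$afterDerivedDone = decideOnMainWavesFinished($rows, 7);

echo $whileDerivedOpen, PHP_EOL; // derived_pending
echo $afterDerivedDone, PHP_EOL; // job_completed
```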

#### C. Keep Monitoring Alive Until Derived Completion

- Whenever derived waves are created or dispatched, refresh `wave_monitoring_active_{jobId}` TTL.
- The listener already sets the flag when initializing derived; the coordinator should also refresh it when dispatching derived waves or when derived are discovered pending.

Rationale: Ensures the listener’s progression nudges do not get skipped.
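The effect of refreshing the TTL can be illustrated with explicit timestamps. `TtlFlags` below is a hypothetical in-memory stand-in for the cache, and the timings (dispatch at 6000 s, completion at 9000 s) are made-up values chosen to exceed the 7200 s TTL used elsewhere in this document.

```php
<?php
declare(strict_types=1);

/** Hypothetical flag store with caller-supplied clock; stands in for the cache. */
final class TtlFlags
{
    /** @var array<string,int> key => expiry timestamp in seconds */
    private array $expiresAt = [];

    public function put(string $key, int $ttlSeconds, int $now): void
    {
        $this->expiresAt[$key] = $now + $ttlSeconds;
    }

    public function isActive(string $key, int $now): bool
    {
        return isset($this->expiresAt[$key]) && $this->expiresAt[$key] > $now;
    }
}

$key = 'wave_monitoring_active_42';

// Without refresh: the flag is set once, at derived creation (t = 0).
$noRefresh = new TtlFlags();
$noRefresh->put($key, 7200, 0);

// With refresh: the flag is set again when a derived wave dispatches (t = 6000).
$withRefresh = new TtlFlags();
$withRefresh->put($key, 7200, 0);
$withRefresh->put($key, 7200, 6000);

// A derived batch completes at t = 9000 and the listener checks the flag.
var_dump($noRefresh->isActive($key, 9000));   // bool(false): progression check skipped
var_dump($withRefresh->isActive($key, 9000)); // bool(true): progression check runs
```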

#### D. Event Wiring Validation (Immediate Nudge After Creation)

- Ensure `DerivedWavesCreated` is emitted after derived upserts and that `DerivedWavesCreatedListener` is registered in `EventServiceProvider` to immediately call the coordinator’s progression.
- This decouples creation from dispatch and reduces reliance on cache/ticks.

#### E. Derived-Ready Guard Robustness

- Continue using the coordinator’s DB fallback for `derived_lines_ready_{jobId}`:
  - If the cache key is missing but any `-13` `wave_batches` exist for the `job_id`, set the flag and proceed.
- This prevents TTL expiration from blocking dispatch.
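The fallback can be sketched as a cache check followed by a DB existence check. `isDerivedReady` is a hypothetical name, the array stands in for the cache, and the callable stands in for a query that reports whether any `-13` `wave_batches` rows exist for the job.

```php
<?php
declare(strict_types=1);

/**
 * Returns true when derived lines are ready, restoring the cache flag
 * from the DB when the key is missing (e.g. after TTL expiry).
 */
function isDerivedReady(array &$cache, callable $derivedBatchesExist, int $jobId): bool
{
    $key = "derived_lines_ready_{$jobId}";
    if (($cache[$key] ?? false) === true) {
        return true; // fast path: flag still cached
    }
    if ($derivedBatchesExist($jobId)) {
        $cache[$key] = true; // restore the flag so later checks take the fast path
        return true;
    }
    return false;
}

$cache = []; // flag expired or never set
$existsInDb = static fn (int $jobId): bool => $jobId === 42; // fake: only job 42 has -13 rows

$readyFor42 = isDerivedReady($cache, $existsInDb, 42);
$readyFor99 = isDerivedReady($cache, $existsInDb, 99);

var_dump($readyFor42); // bool(true)
var_dump($readyFor99); // bool(false)
var_dump(array_keys($cache)); // only derived_lines_ready_42 was restored
```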

### Implementation Details (Per File)

1) `src/App/Listeners/ImportJobs/BatchJobCompletedListener.php`
- In `handle(...)` (parent batch completion path):
  - Keep the completion update transaction as-is.
  - After commit, recompute parent completion percent from DB and then run derived final initialization if threshold reached:
    - `progressivelyCreateDerivedLineBatches($jobId, $parentIdsOrDbFallback)`
    - `Cache::put("derived_lines_ready_{$jobId}", true, 7200)`
    - `Cache::put("wave_monitoring_active_{$jobId}", true, 7200)`
    - `app(WaveCoordinator::class)->checkAndTriggerNextWave($jobId)`
- Leave progressive (pending-only) derived creation during parent progress in place.
- Do not return early from the progression check when derived batches are pending and monitoring is active.

2) `src/App/Services/ImportJobs/WaveCoordinator.php`
- In the block that marks completion of the final main wave:
  - Add a DB check for pending/active derived `-13` `wave_batches`.
  - If any exist, skip job completion and attempt the derived progression branch. Keep monitoring active.
- In the path where no main type waves exist and the code considers completing:
  - Similarly check for pending/active derived before clearing monitoring.
- When a derived wave is about to dispatch, refresh `wave_monitoring_active_{jobId}` TTL.

3) `src/App/Providers/EventServiceProvider.php`
- Ensure the following mapping is present:
  - `DerivedWavesCreated::class => [DerivedWavesCreatedListener::class]`
- Verify the listener calls `WaveCoordinator->checkAndTriggerNextWave($jobId)`.

4) `config/waves.php`
- Confirm `completion_threshold` is 100 (or desired value). No change required.

### Acceptance Criteria

- Parent-first gating: Derived `TransactionLine` waves are created (idempotently) only after parents reach the completion threshold.
- Derived dispatch reliability: The first derived wave is dispatched immediately after derived creation or on the next coordinator run.
- No premature completion: Coordinator neither marks the job complete nor clears `wave_monitoring_active_{jobId}` while any derived `-13` batches are pending or active.
- DB-first truth: Coordinator derived progression works even if cache flags are missing (DB fallback operational).
- Monitoring continuity: `wave_monitoring_active_{jobId}` remains true until all derived waves finish.
- Idempotency: Re-running initialization/dispatch does not create duplicate batches or queue duplicate jobs; the unique index on `(job_id, batch_id)` is enforced.

### Migration/Schema Notes

- `wave_batches`: Unique index on `(job_id, batch_id)` to guarantee idempotent derived batch creation.
- Optional `chunk_ids_json` long-text column to persist the parent IDs used in `IN()` clauses (safer than cache-only storage). The coordinator/job will prefer DB-stored IDs and fall back to the cache.
- Migrations should be under tenant scope (`database/migrations/tenants`) and remain reversible.
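The role of the `(job_id, batch_id)` unique index can be modeled with an array keyed on that pair. This is only an in-memory illustration of the intended behavior: `insertPendingBatchOnce` is a hypothetical name, and treating `batch_id` as a string is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Inserts a pending derived batch unless (job_id, batch_id) already exists.
 * Returns true if a row was created, false if the call was a no-op.
 */
function insertPendingBatchOnce(array &$waveBatches, int $jobId, string $batchId): bool
{
    $uniqueKey = "{$jobId}:{$batchId}"; // models the unique index
    if (isset($waveBatches[$uniqueKey])) {
        return false; // retry or race: leave the existing row untouched
    }
    $waveBatches[$uniqueKey] = [
        'job_id' => $jobId,
        'batch_id' => $batchId,
        'record_type_id' => -13,
        'status' => 'pending',
    ];
    return true;
}

$waveBatches = [];
$first = insertPendingBatchOnce($waveBatches, 42, 'derived-1');
$retry = insertPendingBatchOnce($waveBatches, 42, 'derived-1');

var_dump($first);              // bool(true)
var_dump($retry);              // bool(false)
var_dump(count($waveBatches)); // int(1)
```

In the database the same guarantee comes from the index itself, so concurrent workers cannot both insert the row even if both pass an application-level check.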

### Operational Guidance

- If derived dispatch stalls in production:
  - Run `php artisan waves:reconcile {jobId} --tenant=tenant_connection --db={tenant_db}` to recover stale waves (demote stale processing waves, requeue stuck batches) and force a progression check.
  - Verify that `wave_batches` contains `-13` rows with `status IN ('pending','dispatched','processing')`.
  - Confirm `derived_lines_ready_{jobId}` is set or that `-13` rows exist (DB fallback will restore readiness).
  - Ensure the monitoring flag is present: `wave_monitoring_active_{jobId} = true` (the reconcile command also refreshes it).

### Logging & Monitoring

- Coordinator logs must include:
  - Derived dispatch events: job_id, wave_number, dispatched batch count.
  - Completion guard decisions: whether derived were detected pending when main waves completed.
  - Recovery actions: demotions from `processing` to `pending`, catch-up dispatches.
- Key metrics to watch:
  - `wave.stuck_detection.count`, `queue.active_jobs.count`, `wave.dispatch_lag.seconds`.
  - Derived backlog indicator: count of `-13` `wave_batches` in non-completed states.

### Testing Strategy (No new tests created by default)

- Unit/integration scenarios to validate (described here for future use):
  - Last-parent-batch completion triggers derived init (post-commit) and dispatch.
  - Coordinator does not mark completion if any derived pending/active exist; instead dispatches derived.
  - Monitoring remains active until derived finish; listener does not skip progression when derived are pending.
  - DB fallback for `derived_lines_ready_{jobId}` works when cache is empty.
  - Idempotent derived wave/batch upsert on retries/races.

### Risk & Mitigation

- Risk: Extra DB checks at completion time could introduce slight overhead.
  - Mitigation: Indexed lookups on `wave_batches` (`job_id`, `record_type_id`, `status`) and short-circuiting logic.
- Risk: Incorrect event wiring prevents immediate nudge.
  - Mitigation: Validate `EventServiceProvider` mapping and add logging when handling `DerivedWavesCreated`.
- Risk: Cache TTL expiration during long runs.
  - Mitigation: DB-first discovery and periodic TTL refresh during progression events and dispatch.

### Summary of Changes to Implement

- Move final derived initialization in `BatchJobCompletedListener` to post-commit, recomputing parent completion from DB.
- Add coordinator completion guards to block premature job completion if derived `-13` batches are pending/active.
- Keep `wave_monitoring_active_{jobId}` true until derived waves are fully completed; refresh TTL on derived creation/dispatch.
- Ensure `DerivedWavesCreated` event is emitted and handled to immediately nudge coordinator progression.

This design ensures derived TransactionLine waves initialize deterministically when parents complete and that the coordinator will not terminate progression until all derived processing is finished.


