# MySQL Buffered vs Unbuffered Queries - Best Practices Guide

## 📚 Overview

This guide explains the difference between **buffered** and **unbuffered** MySQL queries and provides best practices to avoid production errors in multi-tenant applications.

## 🔍 Understanding the Problem

### What is the Error?

```
SQLSTATE[HY000]: General error: 2014 Cannot execute queries while other unbuffered queries are active. 
Consider using PDOStatement::fetchAll(). Alternatively, if your code is only ever going to run against mysql, 
you may enable query buffering by setting the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute.
```

### Why Does It Happen?

This error occurs when you try to execute a **new query** while a previous **unbuffered query** still has pending results.

## 📊 Buffered vs Unbuffered Queries

| Aspect | Buffered Queries | Unbuffered Queries |
|--------|------------------|-------------------|
| **Memory** | Loads **all results** into PHP memory | Fetches results **one by one** from MySQL |
| **Concurrent Queries** | ✅ **Can run multiple** queries simultaneously | ❌ **Blocks** other queries until finished |
| **Memory Usage** | ⚠️ **Higher** - Full result set in memory | ✅ **Lower** - Only current row in memory |
| **Performance** | ✅ **Faster** for small-medium result sets | ⚠️ **Slower** but better for huge datasets |
| **Use Case** | General queries, multi-tenant apps | Very large datasets (millions of rows) |

## 🎯 When to Use Buffered Queries

### ✅ Use Buffered Queries When:

1. **Nested Queries**: You need to execute another query before finishing the current one
2. **Multi-tenant Systems**: Multiple tenants with concurrent operations
3. **Observer/Event Patterns**: Methods that call other methods with queries
4. **Default Behavior**: For most application queries (< 10K rows)

### Example: Nested Queries (Problem)

```php
// ❌ BAD: This will throw "unbuffered queries" error
$flowMetric = FlowMetric::where('flow_id', $flowId)->first(); // Query 1 - cursor open
$nodeMetrics = NodeMetric::where('flow_metric_id', $flowMetric->id)->get(); // Query 2 - ERROR!
```

### Example: Nested Queries (Solution)

```php
// ✅ GOOD: Enable buffered queries with scope
$connection = DB::connection();
$pdo = $connection->getPdo();
$originalBufferSetting = $pdo->getAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY);
$pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);

try {
    $flowMetric = FlowMetric::where('flow_id', $flowId)->first(); // Query 1 - fully buffered
    $nodeMetrics = NodeMetric::where('flow_metric_id', $flowMetric->id)->get(); // Query 2 - OK!
} finally {
    // Always restore original setting
    $pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, $originalBufferSetting);
}
```

## 🚫 When to Use Unbuffered Queries

### ⚠️ Use Unbuffered Queries Only When:

1. **Huge Datasets**: Processing millions of rows (> 100K)
2. **Streaming Data**: Export/backup operations
3. **Memory Constraints**: Server has limited RAM
4. **Single Query**: No other queries will run until this finishes

### Example: Large Dataset Processing

```php
// ✅ GOOD: Use unbuffered for huge datasets
$pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

try {
    // Process 1 million records without loading all into memory
    $cursor = DB::select('SELECT * FROM huge_table');
    foreach ($cursor as $row) {
        processRow($row); // Process one at a time
    }
} finally {
    $pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
}
```

## 🎨 Implementation Patterns

### Pattern 1: Scoped Buffered Queries (Recommended)

**Use this pattern when:** A specific method has nested queries

```php
public function getFlowMetric(): ?FlowMetric
{
    // Enable buffered queries ONLY for this operation
    $connection = DB::connection();
    $pdo = $connection->getPdo();
    $originalBufferSetting = $pdo->getAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY);
    $pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
    
    try {
        // Your queries here - safe from unbuffered conflicts
        $flowMetric = FlowMetric::where('flow_id', $this->flowId)->first();
        $nodeMetrics = NodeMetric::where('flow_metric_id', $flowMetric->id)->get();
        
        return $flowMetric;
    } finally {
        // Restore original setting - CRITICAL!
        $pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, $originalBufferSetting);
    }
}
```

**Why this is best:**
- ✅ Minimal scope - only affects this method
- ✅ No global impact on other tenant operations
- ✅ Automatically restores setting (even if exception thrown)
- ✅ Clear intent and documentation

### Pattern 2: Load Data Once (Alternative)

**Use this pattern when:** You can fetch all data upfront

```php
// ✅ GOOD: Load all data first, then process
public function processFlowCompletion(FlowCompleted $event): void
{
    // Query 1: Load everything upfront
    $flowData = $this->loadAllFlowData($event->flow_id);
    
    // Process with in-memory data - no more queries
    $this->updateMetrics($flowData);
    $this->sendNotifications($flowData);
    $this->cleanup($flowData);
}

private function loadAllFlowData(int $flowId): array
{
    // All queries together, no nested calls
    $flowMetric = FlowMetric::where('flow_id', $flowId)->first();
    $nodeMetrics = NodeMetric::where('flow_metric_id', $flowMetric->id)->get();
    $errors = RecordProcessingError::whereIn('node_metric_id', $nodeMetrics->pluck('id'))->get();
    
    return compact('flowMetric', 'nodeMetrics', 'errors');
}
```

**Why this works:**
- ✅ No nested queries - all loaded upfront
- ✅ Better testability - data passed as parameters
- ✅ Easier to understand flow
- ⚠️ Requires refactoring existing code

### Pattern 3: Global Configuration (NOT Recommended)

```php
// ❌ BAD: Don't do this in config/database.php
'options' => [
    PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true, // Affects ALL queries globally
]
```

**Why avoid this:**
- ❌ Increases memory usage for ALL queries
- ❌ Poor performance with large result sets
- ❌ Affects all tenants equally (one size doesn't fit all)
- ❌ Can cause memory exhaustion with big datasets

## 🔍 How to Identify Problem Areas

### Signs You Need Buffered Queries:

1. **Error Message**: `"Cannot execute queries while other unbuffered queries are active"`
2. **Code Pattern**: Query inside a loop or before another query
3. **Observer Pattern**: Observers that call services with queries
4. **Event Listeners**: Listeners that execute multiple queries

### Audit Your Code:

```bash
# Find potential nested query patterns
grep -r "->first()" src/ | grep -v vendor
grep -r "->get()" src/ | grep -v vendor

# Look for query patterns in observers
find src/App/Observers -name "*.php" -exec grep -l "DB::\|Model::" {} \;

# Check services that might have nested queries
find src/App/Services -name "*.php" -exec grep -l "where.*first\|where.*get" {} \;
```

## 📝 Real-World Examples in This Project

### ✅ Fixed: FlowMetricsService (Production Error)

**Location**: `src/Domain/Ipaas/Metrics/Services/FlowMetricsService.php`  
**Risk Level**: 🔴 **CRITICAL** (confirmed production errors)

**Problem:**
```php
// Called in FlowCompletedObserver line 112-116
$flowMetric = $flowMetricsService->getFlowMetric(); // Query 1 - cursor open
$nodeMetrics = NodeMetric::where('flow_metric_id', $flowMetric->id)->get(); // Query 2 - ERROR!
```

**Production Error Log:**
```
[2025-10-06 20:16:19] production.ERROR: Error finding active flow metric 
{"flow_id":158,"error":"Cannot execute queries while other unbuffered queries are active"}
```

**Solution:** Applied scoped buffered queries in `getFlowMetric()` and `getNodeMetric()` methods  
**Impact:** Fixed critical production error, minimal performance impact (~1-5ms)

---

### ✅ Fixed: RecordObserver (Preventive)

**Location**: `src/App/Observers/RecordObserver.php`  
**Risk Level**: 🟡 **MEDIUM** (high frequency, no errors yet)

**Problem:**
```php
// Methods: creating() and deleting()
// 3 consecutive queries without buffering
$recordType = RecordType::where('model_name', $modelName)->first();    // Query 1
$configuration = Configuration::first();                                 // Query 2
$integration = Integration::find($configuration->integration_id);       // Query 3
```

**Why Risky:**
- Executes on **every** record creation/deletion (high frequency)
- Multi-tenant environment with concurrent operations
- 3 dependent queries in sequence

**Solution:** Applied scoped buffered queries to `creating()` and `deleting()` methods  
**Impact:** Preventive fix, no behavior changes, same performance

---

### ✅ Fixed: SurveyController (Loop Pattern)

**Location**: `src/App/Http/Controllers/SurveyController.php`  
**Risk Level**: 🟡 **MEDIUM** (queries in loop)

**Problem:**
```php
// Method: show() line 114-115
foreach ($surveyRecords as $key => $s_record) {
    // ⚠️ QUERIES INSIDE LOOP - could be hundreds of nested queries!
    $recordType = RecordType::where('id', '=', $s_record->record_type)->first();
    $record_id = GetModel($recordType->model_name)->where('id', '=', $s_record->record_id)->first();
}
```

**Why Risky:**
- If survey has 100 records = 200 nested queries
- Second query depends on first query result
- Classic N+1 problem amplified by unbuffered cursors

**Solution:** Applied scoped buffered queries to entire `show()` method  
**Impact:** Safe for any number of survey records, no refactor needed

---

### ⏳ Monitoring: Low-Risk Areas

These areas have nested query patterns but are low risk due to independent queries or low frequency:

**FlowsController::create()**
- Multiple independent queries (different tables)
- HTTP context (lower concurrency)
- Status: ✅ **Monitor only**

**MentionService::findUserByMentionText()**
- Second query only runs if first returns null
- Rarely execute both queries
- Status: ✅ **Monitor only**

---

## 📊 Current Status Summary

| File | Method | Risk | Status | Date Fixed |
|------|--------|------|--------|-----------|
| FlowMetricsService.php | getFlowMetric() | 🔴 Critical | ✅ Fixed | 2025-10-16 |
| FlowMetricsService.php | getNodeMetric() | 🔴 Critical | ✅ Fixed | 2025-10-16 |
| RecordObserver.php | creating() | 🟡 Medium | ✅ Fixed | 2025-10-16 |
| RecordObserver.php | deleting() | 🟡 Medium | ✅ Fixed | 2025-10-16 |
| SurveyController.php | show() | 🟡 Medium | ✅ Fixed | 2025-10-16 |
| FlowsController.php | create() | 🟢 Low | ⏳ Monitor | - |
| MentionService.php | findUserByMentionText() | 🟢 Low | ⏳ Monitor | - |

**All critical and medium-risk areas have been fixed.**

## 🚀 Migration Guide

If you need to fix an unbuffered query error:

### Step 1: Identify the Problem

Look at the error stack trace to find:
1. Which query is failing
2. Which query ran before it
3. Which method contains both queries

### Step 2: Apply Scoped Buffered Queries

Add the buffered query pattern to the method:

```php
public function problematicMethod()
{
    // Add this block
    $connection = DB::connection();
    $pdo = $connection->getPdo();
    $originalBufferSetting = $pdo->getAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY);
    $pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
    
    try {
        // Existing code stays the same
        $result1 = Model1::where(...)->first();
        $result2 = Model2::where(...)->get();
        
        return $result1;
    } finally {
        $pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, $originalBufferSetting);
    }
}
```

### Step 3: Test

1. **Unit Test**: Verify the method still works
2. **Integration Test**: Test the full flow
3. **Production Monitoring**: Watch logs for the error

### Step 4: Document

Add a comment explaining why buffered queries are needed:

```php
// Enable buffered queries to prevent "Cannot execute queries while other unbuffered queries are active" error
// This is necessary because this method is called before other nested queries in FlowCompletedObserver
```

## 🎓 Key Takeaways

1. ✅ **Use scoped buffered queries** for methods with nested queries
2. ❌ **Don't enable buffered queries globally** - it wastes memory
3. 🔍 **Monitor production logs** for unbuffered query errors
4. 📝 **Document why** you're using buffered queries
5. 🧪 **Test thoroughly** after applying the fix
6. 🔄 **Always restore** the original setting in `finally` block

## 📚 References

- [PDO MySQL Driver Documentation](https://www.php.net/manual/en/ref.pdo-mysql.php)
- [MySQL Buffered vs Unbuffered Queries](https://dev.mysql.com/doc/refman/8.0/en/c-api-prepared-statement-problems.html)
- [Laravel Database Documentation](https://laravel.com/docs/database)

## 🔍 Monitoring & Troubleshooting

### Check for Unbuffered Query Errors

```bash
# General search for unbuffered query errors
grep -r "unbuffered queries" storage/logs/production-*.log

# Count occurrences
grep "unbuffered queries" storage/logs/production-*.log | wc -l

# Check specific files
grep "RecordObserver" storage/logs/production-*.log | grep "unbuffered"
grep "SurveyController" storage/logs/production-*.log | grep "unbuffered"
grep "FlowMetricsService" storage/logs/production-*.log | grep "unbuffered"

# See most recent errors with context
grep -A 5 -B 5 "unbuffered queries" storage/logs/production-$(date +%Y-%m-%d).log | tail -50
```

### Verify Fixes Are Working

```bash
# Check that flow metrics are updating successfully
grep "Flow metrics updated" storage/logs/production-*.log | tail -20

# Check that cleanup is completing
grep "Database entries cleaned up successfully" storage/logs/production-*.log | tail -20

# Verify no new unbuffered errors after deployment
grep "unbuffered queries" storage/logs/production-*.log --after-context=2 | \
  grep -E "202[5-9]-[0-9]{2}-[0-9]{2}" | tail -10
```

### Performance Impact Assessment

After applying buffered query fixes, monitor:

1. **Memory Usage**: Should increase by ~50-200KB per request (negligible)
2. **Query Time**: Should add ~1-5ms per affected method (negligible)
3. **Error Rate**: Should drop to zero for unbuffered query errors

## 🆘 Need Help?

If you encounter unbuffered query errors:

1. **Check this guide first** - Review examples and patterns
2. **Search production logs** - Find the full error with stack trace
3. **Identify the pattern** - Look for nested queries, loops, observers
4. **Apply the fix** - Use scoped buffered queries pattern (Pattern 1)
5. **Test thoroughly** - Unit tests, integration tests, staging environment
6. **Monitor post-deployment** - Watch logs for 24-48 hours
7. **If unsure** - Ask in #engineering-help channel with error logs

### Quick Decision Tree

```
Error: "Cannot execute queries while other unbuffered queries are active"
│
├─ Is it in an Observer? 
│  └─ YES → Apply scoped buffered queries to entire observer method
│
├─ Is it in a loop?
│  └─ YES → Apply scoped buffered queries to entire method OR refactor to load data upfront
│
├─ Is it calling a service with queries?
│  └─ YES → Apply scoped buffered queries to the service method being called
│
└─ Not sure?
   └─ Apply scoped buffered queries to the method shown in stack trace
```

---

**Last Updated**: 2025-10-16  
**Maintained By**: Engineering Team  
**Related Docs**: See `docs/AI/ai_context.md` for AI agent quick reference

