# Message Batching Configuration

ThirdEye batches messages before processing to improve efficiency and reduce API costs. Here's how to configure it for your needs.

---

## How Batching Works

Messages are processed when either condition is met:

1. **Buffer reaches BATCH_SIZE** messages (processes immediately)
2. **Timer reaches BATCH_TIMEOUT_SECONDS** (processes whatever is in buffer)

This prevents processing every single message individually while ensuring messages don't wait forever.

---

## Configuration Variables

Edit `.env`:

```bash
BATCH_SIZE=5                # Process after N messages
BATCH_TIMEOUT_SECONDS=60    # Or wait this many seconds
```

### Recommended Settings

**For Production (efficiency priority):**

```bash
BATCH_SIZE=5
BATCH_TIMEOUT_SECONDS=60
```

- Batches 5 messages together (reduces API calls)
- Waits up to 60 seconds for more messages
- Good for active groups with frequent messages

**For Testing/Demo (speed priority):**

```bash
BATCH_SIZE=3
BATCH_TIMEOUT_SECONDS=15
```

- Processes after just 3 messages
- Only waits 15 seconds max
- Better for testing and demos where you want quick feedback

**For Low-Volume Groups:**

```bash
BATCH_SIZE=3
BATCH_TIMEOUT_SECONDS=30
```

- Smaller batch size (triggers faster)
- Moderate timeout (30 seconds)

**For High-Volume Groups:**

```bash
BATCH_SIZE=10
BATCH_TIMEOUT_SECONDS=90
```

- Larger batches (more efficient)
- Longer timeout (less frequent processing)

---

## Manual Flush Command

If you don't want to wait for the timer, use:

```
/flush
```

This immediately processes all buffered messages for the current group.

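
To make the size-or-timeout rule and the manual flush concrete, here is a minimal Python sketch of a per-group buffer. It is an illustration under stated assumptions, not ThirdEye's actual implementation: the class name `MessageBuffer`, its methods, and the injectable `clock` parameter are all hypothetical. The only behavior taken from this guide is that a batch is released when the buffer reaches the batch size, when the timeout has elapsed since the first buffered message, or on a manual flush.

```python
import time
from typing import Callable, List, Optional


class MessageBuffer:
    """Hypothetical per-group buffer that releases batches on size or timeout."""

    def __init__(self, batch_size: int, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.batch_size = batch_size
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the timer can be tested without sleeping
        self._messages: List[str] = []
        self._first_arrival: Optional[float] = None

    def add(self, message: str) -> Optional[List[str]]:
        """Buffer a message; return a batch once batch_size is reached."""
        if not self._messages:
            # The timer starts when the first message enters an empty buffer.
            self._first_arrival = self._clock()
        self._messages.append(message)
        if len(self._messages) >= self.batch_size:
            return self.flush()
        return None

    def poll(self) -> Optional[List[str]]:
        """Return a batch if the timeout has elapsed, otherwise None."""
        if self._messages and self._first_arrival is not None:
            if self._clock() - self._first_arrival >= self.timeout_seconds:
                return self.flush()
        return None

    def flush(self) -> List[str]:
        """Return and clear whatever is buffered (the role /flush plays)."""
        batch, self._messages = self._messages, []
        self._first_arrival = None
        return batch
```

In this sketch a caller would invoke `add()` for each incoming message and `poll()` periodically; either call returns a list when a batch is ready. Passing a fake `clock` makes the timeout path easy to exercise in tests.
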
**When to use /flush:**

- During testing/demos (want instant results)
- After important messages (need to query them right away)
- When the buffer has accumulated messages but the timer hasn't triggered yet

---

## Example Scenarios

### Scenario 1: Active Group Chat (10 messages in 30 seconds)

With `BATCH_SIZE=5, BATCH_TIMEOUT_SECONDS=60`:

- First 5 messages → processed immediately
- Next 5 messages → processed immediately
- Total: 2 batch operations

### Scenario 2: Slow Group Chat (1 message, then silence)

With `BATCH_SIZE=5, BATCH_TIMEOUT_SECONDS=60`:

- 1 message arrives → starts 60s timer
- 60 seconds pass → timer triggers → processes the 1 message
- Total: 1 batch operation after 60s delay

**Use `/flush` to skip the wait!**

### Scenario 3: Demo with Testing Settings

With `BATCH_SIZE=3, BATCH_TIMEOUT_SECONDS=15`:

- Send a test message
- Wait 15 seconds OR send 2 more messages
- The batch is processed automatically
- You can query with `/ask` immediately after

---

## Cost vs Speed Trade-offs

### Processing every message individually (BATCH_SIZE=1, BATCH_TIMEOUT_SECONDS=0):

- ✅ Instant results
- ❌ More API calls (expensive)
- ❌ More LLM tokens used
- ❌ Slower overall (more network overhead)

### Batching messages (BATCH_SIZE=5+, BATCH_TIMEOUT_SECONDS=60):

- ✅ Fewer API calls (cheaper)
- ✅ More context for LLM (better signal extraction)
- ✅ More efficient
- ❌ Slight delay before messages are searchable

---

## Current Status Check

To see if messages are waiting in the buffer, run `/flush`. Note that this also processes anything that is waiting:

```
/flush
```

If the buffer is empty:

```
✅ No messages in buffer (already processed)
```

If messages are waiting:

```
⚡ Processing 3 buffered message(s)...
✅ Processed 3 messages → 5 signals extracted.
```

---

## Debugging

If messages aren't being found by `/ask`:

1. **Check if they're in the buffer:**
   ```
   /flush
   ```
2. **Check if signals were extracted:**
   ```
   /digest
   ```
   This shows the total signal count, which should increase after processing.
3. **Check backend logs:** Look for:
   ```
   INFO: Processed batch: N signals from [group_name]
   ```
   or
   ```
   INFO: Timer flush: N signals from [group_name]
   ```
4. **Verify processing happened:** After sending a message, wait BATCH_TIMEOUT_SECONDS, then check `/digest` again.

---

## Recommended: Use Faster Settings for Testing

For now, update your `.env`:

```bash
BATCH_SIZE=2
BATCH_TIMEOUT_SECONDS=10
```

Then restart the bot:

```bash
# Ctrl+C to stop
python -m backend.bot.bot
```

Now messages will be processed within 10 seconds (or after just 2 messages).

---

## Quick Reference

| Setting | Production | Testing | Demo |
|---------|-----------|---------|------|
| BATCH_SIZE | 5 | 2 | 3 |
| BATCH_TIMEOUT_SECONDS | 60 | 10 | 15 |
| Processing delay | ~60s max | ~10s max | ~15s max |
| API efficiency | High | Medium | Medium |

Use `/flush` anytime you want instant processing!
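
For reference, the two settings could be read and validated at startup along these lines. This is a sketch only: how ThirdEye actually loads its configuration is not shown in this guide, the function name `load_batch_settings` is hypothetical, and the fallback values of 5 and 60 are an assumption borrowed from the production column above.

```python
import os
from typing import Mapping, Tuple


def load_batch_settings(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Return (batch_size, timeout_seconds) read from an environment mapping.

    Defaults (5 and 60) are assumed here, mirroring the production settings.
    """
    batch_size = int(env.get("BATCH_SIZE", "5"))
    timeout_seconds = int(env.get("BATCH_TIMEOUT_SECONDS", "60"))

    # A batch needs at least one message; a timeout of 0 is allowed and
    # corresponds to processing without waiting.
    if batch_size < 1:
        raise ValueError("BATCH_SIZE must be at least 1")
    if timeout_seconds < 0:
        raise ValueError("BATCH_TIMEOUT_SECONDS must not be negative")
    return batch_size, timeout_seconds
```

Validating early like this surfaces a typo in `.env` when the bot starts, rather than later as a buffer that never triggers.
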