Message Batching Configuration
ThirdEye batches messages before processing to improve efficiency and reduce API costs. Here's how to configure it for your needs.
How Batching Works
Messages are processed when either condition is met:
- Buffer reaches BATCH_SIZE messages (processes immediately)
- Timer reaches BATCH_TIMEOUT_SECONDS (processes whatever is in buffer)
This prevents processing every single message individually while ensuring messages don't wait forever.
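The two triggers above can be sketched as a small buffer class. This is a minimal illustration, not ThirdEye's actual implementation; the class and method names are invented:

```python
import time

class MessageBatcher:
    """Illustrative size-or-timeout batcher (not ThirdEye's real code)."""

    def __init__(self, batch_size=5, timeout_seconds=60):
        self.batch_size = batch_size
        self.timeout_seconds = timeout_seconds
        self.buffer = []
        self.first_message_at = None  # timer starts at the first buffered message

    def add(self, message):
        """Buffer a message; returns a batch if the size trigger fires."""
        if not self.buffer:
            self.first_message_at = time.monotonic()
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            return self.flush()  # size trigger: process immediately
        return None

    def tick(self):
        """Call periodically; returns a batch if the timeout has elapsed."""
        if self.buffer and time.monotonic() - self.first_message_at >= self.timeout_seconds:
            return self.flush()  # timer trigger: process whatever is buffered
        return None

    def flush(self):
        """Drain and return the current buffer."""
        batch, self.buffer = self.buffer, []
        self.first_message_at = None
        return batch
```

Whichever trigger fires first wins; the other is reset when the buffer is drained.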
Configuration Variables
Edit .env:
BATCH_SIZE=5 # Process after N messages
BATCH_TIMEOUT_SECONDS=60 # Or wait this many seconds
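Both values are plain environment variables. A minimal sketch of reading them with the production defaults (the loader shown here is illustrative; ThirdEye's real config code may differ):

```python
import os

# Read batching settings from the environment, falling back to the
# recommended production values (5 messages / 60 seconds).
BATCH_SIZE = int(os.getenv("BATCH_SIZE", "5"))
BATCH_TIMEOUT_SECONDS = int(os.getenv("BATCH_TIMEOUT_SECONDS", "60"))

# A size below 1 would never trigger a batch, so fail fast.
if BATCH_SIZE < 1:
    raise ValueError("BATCH_SIZE must be at least 1")
```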
Recommended Settings
For Production (efficiency priority):
BATCH_SIZE=5
BATCH_TIMEOUT_SECONDS=60
- Batches 5 messages together (reduces API calls)
- Waits up to 60 seconds for more messages
- Good for active groups with frequent messages
For Testing/Demo (speed priority):
BATCH_SIZE=3
BATCH_TIMEOUT_SECONDS=15
- Processes after just 3 messages
- Only waits 15 seconds max
- Better for testing and demos where you want quick feedback
For Low-Volume Groups:
BATCH_SIZE=3
BATCH_TIMEOUT_SECONDS=30
- Smaller batch size (triggers faster)
- Moderate timeout (30 seconds)
For High-Volume Groups:
BATCH_SIZE=10
BATCH_TIMEOUT_SECONDS=90
- Larger batches (more efficient)
- Longer timeout (less frequent processing)
Manual Flush Command
If you don't want to wait for the timer, use:
/flush
This immediately processes all buffered messages for the current group.
When to use /flush:
- During testing/demos (want instant results)
- After important messages (need to query them right away)
- When buffer has accumulated but timer hasn't triggered yet
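Conceptually, /flush drains the group's buffer and processes whatever is waiting. A hypothetical sketch (the function and data names are invented; the reply strings mirror the status messages shown further down this page):

```python
# Per-group message buffers (illustrative in-memory stand-in).
buffers = {"demo-group": ["msg1", "msg2", "msg3"]}

def handle_flush(group_id):
    """Drain the group's buffer and report what happened."""
    batch = buffers.get(group_id, [])
    buffers[group_id] = []
    if not batch:
        return "✅ No messages in buffer (already processed)"
    # Real code would run signal extraction on `batch` here.
    return f"⚡ Processing {len(batch)} buffered message(s)..."
```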
Example Scenarios
Scenario 1: Active Group Chat (10 messages in 30 seconds)
With BATCH_SIZE=5, BATCH_TIMEOUT_SECONDS=60:
- First 5 messages → processed immediately
- Next 5 messages → processed immediately
- Total: 2 batch operations
Scenario 2: Slow Group Chat (1 message, then silence)
With BATCH_SIZE=5, BATCH_TIMEOUT_SECONDS=60:
- 1 message arrives → starts 60s timer
- 60 seconds pass → timer triggers → processes the 1 message
- Total: 1 batch operation after 60s delay
Use /flush to skip the wait!
Scenario 3: Demo with Testing Settings
With BATCH_SIZE=3, BATCH_TIMEOUT_SECONDS=15:
- Send test message
- Wait 15 seconds OR send 2 more messages
- Processed automatically
- Can query with /ask immediately after
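The batch counts in these scenarios can be checked with a small simulation. This is an illustrative sketch, not ThirdEye code:

```python
def count_batches(arrival_times, batch_size, timeout_seconds):
    """How many batch operations a sequence of message arrival times
    (in seconds) produces under size-or-timeout batching."""
    batches = 0
    buffered = 0
    timer_start = None
    for t in arrival_times:
        # If the timer expired before this message arrived, flush first.
        if buffered and t - timer_start >= timeout_seconds:
            batches += 1
            buffered = 0
        if buffered == 0:
            timer_start = t  # timer starts with the first buffered message
        buffered += 1
        if buffered >= batch_size:  # size trigger
            batches += 1
            buffered = 0
    if buffered:  # leftovers eventually flushed by the timer
        batches += 1
    return batches
```

Scenario 1 (10 messages, ~3 s apart) yields 2 batches; Scenario 2 (a single message) yields 1 batch after the timeout.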
Cost vs Speed Trade-offs
Processing every message individually (BATCH_SIZE=1, BATCH_TIMEOUT_SECONDS=0):
- ✅ Instant results
- ❌ More API calls (expensive)
- ❌ More LLM tokens used
- ❌ Slower overall (more network overhead)
Batching messages (BATCH_SIZE=5+, BATCH_TIMEOUT_SECONDS=60):
- ✅ Fewer API calls (cheaper)
- ✅ More context for LLM (better signal extraction)
- ✅ More efficient
- ❌ Slight delay before messages are searchable
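The API-call savings are simple arithmetic. For example, over 100 messages (assuming full batches; timer flushes of partial batches would add a few extra calls):

```python
import math

messages = 100
calls_individual = messages               # BATCH_SIZE=1: one API call per message
calls_batched = math.ceil(messages / 5)   # BATCH_SIZE=5: one call per full batch
# Batching cuts the call count to roughly one fifth.
```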
Current Status Check
To see whether messages are waiting in the buffer:
/flush
If buffer is empty:
✅ No messages in buffer (already processed)
If messages are waiting:
⚡ Processing 3 buffered message(s)...
✅ Processed 3 messages → 5 signals extracted.
Debugging
If messages aren't being found by /ask:
1. Check if they're in the buffer: /flush
2. Check if signals were extracted: /digest shows the total signal count, which should increase after processing
3. Check backend logs. Look for:
INFO: Processed batch: N signals from [group_name]
or
INFO: Timer flush: N signals from [group_name]
4. Verify processing happened: after sending a message, wait BATCH_TIMEOUT_SECONDS, then check /digest again
Recommended: Use Faster Settings for Testing
For faster feedback while testing, update your .env:
BATCH_SIZE=2
BATCH_TIMEOUT_SECONDS=10
Then restart the bot:
# Ctrl+C to stop
python -m backend.bot.bot
Now messages will be processed within 10 seconds (or after just 2 messages).
Quick Reference
| Setting | Production | Testing | Demo |
|---|---|---|---|
| BATCH_SIZE | 5 | 2 | 3 |
| BATCH_TIMEOUT_SECONDS | 60 | 10 | 15 |
| Processing delay | ~60s max | ~10s max | ~15s max |
| API efficiency | High | Medium | Medium |
Use /flush anytime you want instant processing!