
Message Batching Configuration

ThirdEye batches messages before processing to improve efficiency and reduce API costs. Here's how to configure it for your needs.


How Batching Works

Messages are processed when either condition is met:

  1. Buffer reaches BATCH_SIZE messages (processes immediately)
  2. BATCH_TIMEOUT_SECONDS elapse since the first buffered message (processes whatever is in the buffer)

This prevents processing every single message individually while ensuring messages don't wait forever.
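The dual-trigger logic can be sketched roughly as follows. This is a minimal illustration under assumed names (MessageBuffer, add, poll are hypothetical, not ThirdEye's actual classes); the real bot presumably drives the timeout from its event loop rather than a poll call:

```python
import time

class MessageBuffer:
    """Minimal sketch of the dual-trigger batching logic (hypothetical
    names; not ThirdEye's actual implementation)."""

    def __init__(self, batch_size=5, timeout_seconds=60):
        self.batch_size = batch_size
        self.timeout_seconds = timeout_seconds
        self.buffer = []
        self.first_message_at = None  # timer starts on first buffered message

    def add(self, message, now=None):
        """Buffer a message; return a batch if the size trigger fires."""
        now = time.monotonic() if now is None else now
        if not self.buffer:
            self.first_message_at = now
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            return self.flush()  # condition 1: buffer is full
        return None

    def poll(self, now=None):
        """Call periodically; flushes if the timeout elapsed (condition 2)."""
        now = time.monotonic() if now is None else now
        if self.buffer and now - self.first_message_at >= self.timeout_seconds:
            return self.flush()
        return None

    def flush(self):
        """Return and clear everything buffered (what /flush does)."""
        batch, self.buffer = self.buffer, []
        self.first_message_at = None
        return batch
```

With the testing settings (BATCH_SIZE=3), the third add call returns the batch immediately; a lone message only comes back from poll once the timeout has elapsed.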


Configuration Variables

Edit .env:

BATCH_SIZE=5                  # Process after N messages
BATCH_TIMEOUT_SECONDS=60      # Or wait this many seconds
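In code, these settings would typically be read from the environment with the production values as fallbacks. A small sketch (load_batching_config is a hypothetical helper, not necessarily how ThirdEye loads its config):

```python
import os

def load_batching_config():
    """Hypothetical helper: read the batching settings from the
    environment, falling back to the production defaults."""
    return (
        int(os.getenv("BATCH_SIZE", "5")),
        int(os.getenv("BATCH_TIMEOUT_SECONDS", "60")),
    )
```

Note the int() conversion: environment variables always arrive as strings.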

For Production (efficiency priority):

BATCH_SIZE=5
BATCH_TIMEOUT_SECONDS=60
  • Batches 5 messages together (reduces API calls)
  • Waits up to 60 seconds for more messages
  • Good for active groups with frequent messages

For Testing/Demo (speed priority):

BATCH_SIZE=3
BATCH_TIMEOUT_SECONDS=15
  • Processes after just 3 messages
  • Only waits 15 seconds max
  • Better for testing and demos where you want quick feedback

For Low-Volume Groups:

BATCH_SIZE=3
BATCH_TIMEOUT_SECONDS=30
  • Smaller batch size (triggers faster)
  • Moderate timeout (30 seconds)

For High-Volume Groups:

BATCH_SIZE=10
BATCH_TIMEOUT_SECONDS=90
  • Larger batches (more efficient)
  • Longer timeout (less frequent processing)

Manual Flush Command

If you don't want to wait for the timer, use:

/flush

This immediately processes all buffered messages for the current group.

When to use /flush:

  • During testing/demos (want instant results)
  • After important messages (need to query them right away)
  • When buffer has accumulated but timer hasn't triggered yet

Example Scenarios

Scenario 1: Active Group Chat (10 messages in 30 seconds)

With BATCH_SIZE=5, BATCH_TIMEOUT_SECONDS=60:

  • First 5 messages → processed immediately
  • Next 5 messages → processed immediately
  • Total: 2 batch operations

Scenario 2: Slow Group Chat (1 message, then silence)

With BATCH_SIZE=5, BATCH_TIMEOUT_SECONDS=60:

  • 1 message arrives → starts 60s timer
  • 60 seconds pass → timer triggers → processes the 1 message
  • Total: 1 batch operation after 60s delay

Use /flush to skip the wait!

Scenario 3: Demo with Testing Settings

With BATCH_SIZE=3, BATCH_TIMEOUT_SECONDS=15:

  • Send test message
  • Wait 15 seconds OR send 2 more messages
  • Processed automatically
  • Can query with /ask immediately after
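The batch counts in the scenarios above follow a simple rule, assuming all messages arrive within one timeout window: full batches fire on size, and any remainder fires when the timer expires. A one-liner to check the arithmetic (batch_count is a hypothetical helper for illustration only):

```python
import math

def batch_count(n_messages, batch_size):
    """Hypothetical helper: number of batch operations when all
    messages arrive within one timeout window."""
    return math.ceil(n_messages / batch_size)

# Scenario 1: batch_count(10, 5) -> 2 batch operations
# Scenario 2: batch_count(1, 5)  -> 1 batch operation (after the timeout)
```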

Cost vs Speed Trade-offs

Processing every message individually (BATCH_SIZE=1, BATCH_TIMEOUT_SECONDS=0):

  • Instant results
  • More API calls (expensive)
  • More LLM tokens used
  • Slower overall (more network overhead)

Batching messages (BATCH_SIZE=5+, BATCH_TIMEOUT_SECONDS=60):

  • Fewer API calls (cheaper)
  • More context for LLM (better signal extraction)
  • More efficient
  • Slight delay before messages are searchable

Current Status Check

To see if messages are waiting in the buffer:

/flush

If buffer is empty:

✅ No messages in buffer (already processed)

If messages are waiting:

⚡ Processing 3 buffered message(s)...
✅ Processed 3 messages → 5 signals extracted.
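The branch between those two replies can be sketched as follows (flush_reply is a hypothetical name; ThirdEye's actual handler may differ, and the real one also runs signal extraction before reporting):

```python
def flush_reply(buffer):
    """Hypothetical sketch of the /flush reply logic, mirroring the
    messages shown above."""
    if not buffer:
        return "✅ No messages in buffer (already processed)"
    count = len(buffer)
    buffer.clear()  # drain so the batch is processed exactly once
    return f"⚡ Processing {count} buffered message(s)..."
```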

Debugging

If messages aren't being found by /ask:

  1. Check if they're in the buffer:

    /flush
    
  2. Check if signals were extracted:

    /digest
    

    Shows the total signal count, which should increase after processing

  3. Check backend logs: Look for:

    INFO: Processed batch: N signals from [group_name]
    

    or

    INFO: Timer flush: N signals from [group_name]
    
  4. Verify processing happened: After sending a message, wait BATCH_TIMEOUT_SECONDS, then check /digest again


For quick feedback while debugging, update your .env:

BATCH_SIZE=2
BATCH_TIMEOUT_SECONDS=10

Then restart the bot:

# Ctrl+C to stop
python -m backend.bot.bot

Now messages will be processed within 10 seconds (or after just 2 messages).


Quick Reference

Setting                  Production   Testing    Demo
BATCH_SIZE               5            2          3
BATCH_TIMEOUT_SECONDS    60           10         15
Processing delay         ~60s max     ~10s max   ~15s max
API efficiency           High         Medium     Medium

Use /flush anytime you want instant processing!