Issue 4 wait 5 sec #40
base: main
Conversation
```
@@ -56,6 +55,11 @@ async def text_handler(update: Update, context: ContextTypes.DEFAULT_TYPE, con:
        (chat_id, user_id, text),
    )
    con.commit()


async def generate_response(update: Update, con: psycopg2.connect):
```
Violation of the interface segregation principle: each component in a program needs access only to the data it needs, and no more. Here, replace `update` with `chat_id`.
An architecturally cleaner solution is to split the function into several smaller ones; we can talk about that later on.
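For illustration, a minimal sketch of the narrowed signature. The table and column names are my guesses, and note that the annotation should be `psycopg2.extensions.connection` rather than `psycopg2.connect` (the latter is the factory function, not the connection type):

```python
import psycopg2

async def generate_response(chat_id: int, con: psycopg2.extensions.connection) -> str:
    # The function receives only the chat_id it actually needs,
    # not the whole Update object.
    with con.cursor() as cur:
        cur.execute(
            "SELECT user_id, text FROM messages WHERE chat_id = %s ORDER BY id",
            (chat_id,),
        )
        history = cur.fetchall()
    # ... build the prompt from `history` and call the LLM here ...
    return "reply"  # placeholder for the actual completion
```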
```
    await store_message(update, con)

    if not delaying.locked():
        _ = asyncio.create_task(delay_then_response(update, con))  # run the delay in background
```
Remember I mentioned that this feature will be tricky to get right?

You seem to have a dangerous race condition here that will cause some messages to be reordered and/or temporarily ignored. Consider the following sequence of events:

1. A message arrives. The delay-then-respond task is started.
2. The delay expires. The LLM is invoked and is being handled in the background.
3. While the LLM call is running, another message arrives (recall that you're running the completion in another task, so the main task can still keep running). The new message is added to the database, but no new task is started because the semaphore (your lock) is taken.
4. Once the LLM call is finished, you append the response to the database.
   ⚠️ ☠️ BUG: the messages are now reordered in the database! The LLM completion comes after a user message that was not used to generate it.
5. You send the response and finish the delay-then-respond task.
   ⚠️ ☠️ BUG: the user's latest message is left without a reply! It is stored in the database, but no task is running to generate a response.

Speaking from experience, the likelihood of race conditions increases quickly as you add new states to the program. Here, the culprit is the semaphore you added. Think of a solution that does not require additional state besides what you already have in the database. Any such solution will be much more robust than what you currently have.
One way to approach it is to always start a new task when a message arrives. The task then waits, and before generating a response it checks whether the last message in the chat comes from the LLM; if so, it exits without doing anything, as that means another task has already taken care of the answer. When inserting the LLM-generated response into the database, you also have to check that no reordering is happening. There are several ways to do that (a sketch of the simplest follows the list):

1. Introduce a new column that specifies the ID of the message the current message is a response to, null by default. Say you call it `response_to_id`; then every message coming from the LLM records the latest user message considered during response generation. You will need to modify the message-fetching query slightly to order by `response_to_id` as well.
2. Order by timestamp, ensuring that the LLM message has the same timestamp as the latest message considered during response generation. This is fragile.
3. The simplest: just discard the LLM response if new messages have appeared, and let the next task generate it.
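Here is a minimal sketch of that approach, using option 3 to guard against reordering. The schema details (a `messages` table with a monotonically increasing `id` and a `sender` column set to 'llm' for bot messages) are my assumptions, and `generate_response` is the `chat_id`-based version sketched above:

```python
import asyncio

RESPONSE_DELAY = 5  # seconds, per this PR

def last_message(chat_id, con):
    # (id, sender) of the newest message in the chat, or None.
    with con.cursor() as cur:
        cur.execute(
            "SELECT id, sender FROM messages"
            " WHERE chat_id = %s ORDER BY id DESC LIMIT 1",
            (chat_id,),
        )
        return cur.fetchone()

async def delay_then_response(chat_id, con):
    # Started unconditionally for every incoming message -- no lock.
    await asyncio.sleep(RESPONSE_DELAY)
    last = last_message(chat_id, con)
    if last is None or last[1] == "llm":
        return  # another task already answered this burst of messages
    anchor_id = last[0]  # the newest message the completion is based on
    response = await generate_response(chat_id, con)
    # Option 3: if new messages arrived while the LLM was running,
    # discard this completion; the task spawned by the newest message
    # will answer instead. There is no await between this check and the
    # INSERT, so no other task in the event loop can interleave here.
    current = last_message(chat_id, con)
    if current is None or current[0] != anchor_id:
        return
    with con.cursor() as cur:
        cur.execute(
            "INSERT INTO messages (chat_id, sender, text)"
            " VALUES (%s, 'llm', %s)",
            (chat_id, response),
        )
    con.commit()
    # ... send `response` to the chat here ...
```

Note that the only state this version relies on is what is already in the database, which is exactly the property we want.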
There are other solutions as well. You might notice that what I just outlined uses the database as a sort of queue. It may be beneficial to make the queuing aspect explicit in your design: send freshly received messages into a delayed queue specific to a given chat, which is read by a worker dedicated to that chat. The messages are then used to generate responses, and the results are pushed into the database only when the corresponding LLM completions are ready.
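A rough sketch of that design, with every name (`queues`, `workers`, `enqueue_message`, `chat_worker`) made up for illustration; the point is that each chat has a single writer, so ordering is guaranteed by construction:

```python
import asyncio
from collections import defaultdict

# One queue and one worker per chat.
queues: dict[int, asyncio.Queue] = defaultdict(asyncio.Queue)
workers: dict[int, asyncio.Task] = {}

async def enqueue_message(chat_id: int, text: str) -> None:
    # Called from the handler instead of spawning delay tasks directly.
    await queues[chat_id].put(text)
    if chat_id not in workers:
        workers[chat_id] = asyncio.create_task(chat_worker(chat_id))

async def chat_worker(chat_id: int) -> None:
    queue = queues[chat_id]
    while True:
        batch = [await queue.get()]           # block until a message arrives
        await asyncio.sleep(5)                # the delay from this PR
        while not queue.empty():              # absorb the rest of the burst
            batch.append(queue.get_nowait())  # that arrived during the delay
        # Generate the completion from the stored history plus `batch`,
        # then write the batch and the response to the database in order,
        # and send the reply. Only this worker writes for this chat, so
        # no reordering is possible.
        ...
```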
Perhaps it is worth visualizing this on the whiteboard.
closes #4