

Issue 4 wait 5 sec #40

Open
songmeo wants to merge 8 commits into main
Conversation

songmeo (Owner) commented Feb 11, 2025

closes #4

songmeo linked an issue on Feb 11, 2025 that may be closed by this pull request
@@ -56,6 +55,11 @@ async def text_handler(update: Update, context: ContextTypes.DEFAULT_TYPE, con:
(chat_id, user_id, text),
)
con.commit()


async def generate_response(update: Update, con: psycopg2.connect):
Collaborator:

Violation of the interface segregation principle: each component in a program should have access only to the data it needs, and no more. Here, replace update with chat_id.

An architecturally cleaner solution is to split the function into several smaller ones; we can talk about that later on.
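
A minimal sketch of the narrowed signature (the call-site line is illustrative; the sketch uses psycopg2.extensions.connection as the annotation, since psycopg2.connect is a factory function rather than a type):

    async def generate_response(chat_id: int, con: psycopg2.extensions.connection):
        ...  # fetch the chat history for chat_id, call the LLM, return the reply

    # Call site inside text_handler, which still owns the full Update object:
    #     response = await generate_response(update.effective_chat.id, con)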

await store_message(update, con)

if not delaying.locked():
_ = asyncio.create_task(delay_then_response(update, con)) # run the delay in background
Collaborator:

Remember I mentioned that this feature would be tricky to get right?

You seem to have a dangerous race condition here that will cause some messages to be reordered and/or temporarily ignored. Consider the following sequence of events:

  • A message arrives. The delay-then-respond task is started.

  • The delay has expired. The LLM is invoked and is being handled in the background.

  • While the LLM call is running, another message arrives (recall that you're running the completion in another task, so the main task can still keep running). The new message is added to the database, but no new task is invoked because the semaphore (your lock) is taken.

  • Once the LLM call is finished, you append the response to the database.

    • ⚠️☠️ BUG: the messages are now reordered in the database! The LLM completion comes after a user message that was not used to generate it.
  • You send the response and finish the delay-then-respond task.

    • ⚠️☠️ BUG: the latest user's message is left without a reply! It is stored in the database but no task is running to generate a response.

Speaking from experience, the likelihood of race conditions increases quickly as you add new state to the program. Here, the culprit is the semaphore you added. Think of a solution that does not require additional state beyond what you already have in the database. Any such solution will be much more robust than what you currently have.

One way to approach it is to always start a new task when a message arrives. The task will then wait, and before generating a response it will check whether the last message in the chat comes from the LLM; if so, it just exits doing nothing, as that means another task already took care of the answer. When inserting the LLM-generated response into the database, you also have to check that no reordering is happening. There are several ways to do it (a sketch of the simplest follows the list):

  1. Introduce a new column that specifies the ID of the message that the current message is a response to, NULL by default. Say you call it response_to_id; then every message coming from the LLM will specify the latest user message considered during response generation. You will need to modify the message-fetching query slightly to order by response_to_id as well.

  2. Order by timestamp, ensuring that the LLM message has the same timestamp as the latest message considered during the response generation. This is fragile.

  3. The simplest: just discard the LLM response if new messages appeared, and let the next task generate it.
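
A rough sketch of option 3, reusing the chat_id-based signature from the earlier comment. The messages table columns (id, chat_id, is_from_llm, text) and the call_llm helper are assumptions for illustration; adapt them to your actual schema:

    import asyncio
    import psycopg2

    async def delay_then_response(chat_id: int, con: psycopg2.extensions.connection) -> None:
        await asyncio.sleep(5)
        with con.cursor() as cur:
            cur.execute(
                "SELECT id, is_from_llm FROM messages"
                " WHERE chat_id = %s ORDER BY id DESC LIMIT 1",
                (chat_id,),
            )
            last_id, is_from_llm = cur.fetchone()
        if is_from_llm:
            return  # another task already took care of the answer
        response = await call_llm(chat_id, con)  # hypothetical LLM helper
        with con.cursor() as cur:
            cur.execute(
                "SELECT id FROM messages WHERE chat_id = %s ORDER BY id DESC LIMIT 1",
                (chat_id,),
            )
            if cur.fetchone()[0] != last_id:
                return  # a newer message arrived; its own task will answer
            # No await between the check above and this insert, so no other
            # task in the single-threaded event loop can interleave here.
            cur.execute(
                "INSERT INTO messages (chat_id, is_from_llm, text)"
                " VALUES (%s, TRUE, %s)",
                (chat_id, response),
            )
        con.commit()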

There are other solutions as well. You might notice that what I just outlined uses the database as a sort of queue. It may be beneficial to make the queuing aspect explicit in your design: send freshly received messages into a per-chat delayed queue that is read by a worker dedicated to that chat. The messages are then used to generate responses, and the results are pushed into the database only when the corresponding LLM completions are ready.
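
For illustration, a minimal sketch of such a per-chat queue and worker; handle_batch is a hypothetical coroutine that would call the LLM, store the reply, and send it:

    import asyncio
    from collections import defaultdict

    queues: dict[int, asyncio.Queue] = defaultdict(asyncio.Queue)
    workers: dict[int, asyncio.Task] = {}

    async def chat_worker(chat_id: int) -> None:
        queue = queues[chat_id]
        while True:
            batch = [await queue.get()]   # block until the next message arrives
            await asyncio.sleep(5)        # the delay window
            while not queue.empty():      # drain whatever arrived meanwhile
                batch.append(queue.get_nowait())
            await handle_batch(chat_id, batch)  # hypothetical: LLM call, store, reply

    def enqueue_message(chat_id: int, text: str) -> None:
        # Lazily start one worker per chat, then hand the message to it.
        if chat_id not in workers:
            workers[chat_id] = asyncio.create_task(chat_worker(chat_id))
        queues[chat_id].put_nowait(text)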

Perhaps it is worth visualizing this on the whiteboard.

Successfully merging this pull request may close these issues:

Implement delayed responses