Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/accurate retrieval benchmark #97

Merged

Conversation

gaganahluwalia
Copy link
Collaborator

@gaganahluwalia gaganahluwalia commented Oct 16, 2024

Description

This includes changes for the benchmarking tests.

Changelog

  • Changes to Web Agent, File Agent and adding history and context.

from src.websockets.user_confirmer import UserConfirmer
from src.websockets.confirmations_manager import confirmations_manager
# from src.websockets.user_confirmer import UserConfirmer
# from src.websockets.confirmations_manager import confirmations_manager

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were used for the Human COnfirmation loop which is broken at the moment, so I commented it out and when next time someone picks up InferGPT they can fix it.

is_confirmed = await confirmer.confirm("Would you like to generate a graph?")
# confirmer = UserConfirmer(confirmations_manager)
is_confirmed = True
# await confirmer.confirm("Would you like to generate a graph?")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saem like above, these were used for the Human COnfirmation loop which is broken at the moment, so I commented it out and when next time someone picks up InferGPT they can fix it.

Copy link
Collaborator

@hsauve-scottlogic hsauve-scottlogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great Gagan! I've just put a few comments in

confirmer = UserConfirmer(confirmations_manager)
is_confirmed = await confirmer.confirm("Would you like to generate a graph?")
# confirmer = UserConfirmer(confirmations_manager)
is_confirmed = True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we saying that it is always confirmed? At the moment I don't get the message asking for confirmation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I didn't want to totally remove the Human loop, so I am always setting it to True. THis true means the user is always returning true and there won't be any pop up for the user.

async def invoke(self, utterance: str) -> str:
user_prompt = engine.load_prompt("intent", question=utterance)
chat_history = await self.read_file_core("conversation-history.txt")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the file doesn't exist, what happens?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THen it won't return the file and history will be blank and it will be created at the end..

answer_to_user = await answer_user_ques(search_query, llm, model)
answer_result = json.loads(answer_to_user)
if answer_result["status"] == "error":
return ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it returns an empty string, should it return an error? Just wondering about what actually happens in that case

@@ -103,6 +147,76 @@ async def web_general_search(search_query, llm, model) -> str:
async def web_pdf_download(pdf_url, llm, model) -> str:
return await web_pdf_download_core(pdf_url, llm, model)

async def web_scrape_core(url: str) -> str:
try:
logger.info(f"Scraping the price of the book from URL: {url}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the message be a bit more generic?

}


Important: If the question is realted to real time data, the LLM should provide is_valid is false.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add an extra line please

}

Important:
Please always create the last intent to append the retrieved info in a 'conversation-history.txt' file and make sure this history file is always 'conversation-history.txt'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add an additional line

}

Important:
Please always create the last intent to append the retrieved info in a 'conversation-history.txt' file and make sure this history file is always 'conversation-history.txt'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Please always create the last intent to append the retrieved info in a 'conversation-history.txt' file and make sure this history file is always 'conversation-history.txt'
Please always create the last intent to append the retrieved info in a 'conversation-history.txt' file and make sure this history file is always named 'conversation-history.txt'

"result": "81,462 million",
"steps": "1. Convert 81.462 billion to million by multiplying by 1000. Round the result to the nearest million.",
"reasoning": "Rounding to the nearest million ensures that the result is represented in a more practical figure, without exceeding or falling short of the actual value."
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add an extra line

@@ -1,4 +1,5 @@
Reply only in json with the following format:
Reply only in json with the following format, in the tool_paramters please include the curreny and measuring scale used in the content provided.:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Reply only in json with the following format, in the tool_paramters please include the curreny and measuring scale used in the content provided.:
Reply only in json with the following format, in the tool_parameters please include the currency and measuring scale used in the content provided.:

Task: Please find tesla's revenue every year since its creation.
Answer: Tesla's annual revenue history from FY 2008 to FY 2023 is available, with figures for 2008 through 2020 taken from previous annual reports.
Response: False
Reasoning: The answer is not prvoding any actual figures but just talk about the figures.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Reasoning: The answer is not prvoding any actual figures but just talk about the figures.
Reasoning: The answer is not providing any actual figures but just talk about the figures.

is_confirmed = await confirmer.confirm("Would you like to generate a graph?")
# confirmer = UserConfirmer(confirmations_manager)
is_confirmed = True
# await confirmer.confirm("Would you like to generate a graph?")
if not is_confirmed:
raise Exception("The user did not confirm to creating a graph.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Gagan, this may be out of scope here. Is there a known timeout for the confirmation message? e.g. It pops up and if if the user doesn't interact with it in 20 seconds then it disappears and the exception is raised.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Max, I have commented out the code, so there shouldn't be any pop up before the Graph generation at all now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noice!

Copy link
Collaborator

@mnyamunda-scottlogic mnyamunda-scottlogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 Nice!

@gaganahluwalia gaganahluwalia merged commit 70524a8 into release/improve-reliability Oct 18, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants