
Performance Metrics #38

Open · jeisenman23 wants to merge 54 commits into main
Conversation

jeisenman23 (Collaborator):

This PR adds time tracking capability within ELM so that we can measure how long queries to the database take, how long chat completions take, and how long the chat function takes to invoke.
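A minimal sketch of the timing pattern this PR introduces (the helper name and usage here are illustrative, not ELM's actual implementation):

```python
import time

def timed(func, *args, **kwargs):
    """Run func and return (result, wall-clock seconds). Hypothetical helper."""
    start = time.time()
    result = func(*args, **kwargs)
    return result, time.time() - start

# e.g. result, vector_query_time = timed(wizard.engineer_query, "wind costs")
```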

@grantbuster self-requested a review on November 15, 2024.
elm/wizard.py Outdated
token_budget=None,
new_info_threshold=0.7,
convo=False,
timeit=False):
grantbuster (Member):

Please add this to the parameters (input args) docstrings. Also make sure there are line breaks before Parameters and Returns (looks like those line breaks got removed here; not sure why).
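For reference, a minimal numpydoc-style sketch with the blank lines before the Parameters and Returns headers that this comment is asking for (the descriptions are condensed from the diff; this is not the full ELM docstring):

```python
def chat(self, query, timeit=False):
    """Answer a query with the LLM.

    Parameters
    ----------
    query : str
        Question to ask the LLM.
    timeit : bool
        Flag to return the performance metrics on API calls.

    Returns
    -------
    response : str
        Response from the LLM.
    """
```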

jeisenman23 (Author):

Fixed by adding necessary spacing

jeisenman23 (Author):

ABC

@@ -87,6 +90,9 @@ def engineer_query(self, query, token_budget=None, new_info_threshold=0.7,
references : list
The list of references (strs) used in the engineered prompt is
returned here
vector_query_time : float
grantbuster (Member):

I know you didn't do this, but can you add a docstring for used_index here? Not sure why we don't have one.

jeisenman23 (Author):

Added used_index to engineer_query function

elm/wizard.py Outdated
@@ -184,7 +192,8 @@ def chat(self, query,
valid ref_col.
return_chat_obj : bool
Flag to only return the ChatCompletion from OpenAI API.

timeit : bool
Flag to return the performance metrics on API calls.
Returns
grantbuster (Member):

Same general comments about docstrings here.

jeisenman23 (Author):

Fixed by adding spaces

total_chat_time = start_chat_time - end_time
performance = {
"total_chat_time": total_chat_time,
"chat_completion_time": chat_completion_time,
grantbuster (Member):

Am I understanding correctly that the chat completion time is the time to get the first response from the LLM? That's great and important, but if the LLM is in "stream" mode, does this actually work? Or does the API return something immediately and only start responding in the `for chunk in response` generator loop? If that's the case, maybe we should calculate the end time at the first chunk generation in the response object.

jeisenman23 (Author):

Yes, you are correct in all of this. In the chat method, we default stream to True. I moved the chat completion metrics to after this loop, so the logic is: start timer, start completions, stream completions, stop timer. This change jibes with your concern!
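A sketch of that order of operations using the v1 openai client (the model name and message are placeholders, and the client assumes an OPENAI_API_KEY in the environment; only the timer placement mirrors the change described above):

```python
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

start_chat_time = time.time()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)

# With stream=True, create() returns almost immediately; the tokens arrive
# while iterating, so the timer must stop after the loop, not before it.
chunks = []
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        chunks.append(chunk.choices[0].delta.content)
chat_completion_time = time.time() - start_chat_time
```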

elm/wizard.py Outdated
@@ -152,10 +160,10 @@ def chat(self, query,
token_budget=None,
new_info_threshold=0.7,
print_references=False,
return_chat_obj=False):
return_chat_obj=False,
timeit=False):
grantbuster (Member):

Why don't we just always have timeit=True? I think we should just remove this as a kwarg and always return performance metrics. This will break anything that expects a certain number of outputs, but the compute is basically free and it's useful information.
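A toy sketch of the resulting shape (the stand-in class and its internals are hypothetical; the point is that the performance dict becomes an unconditional extra output):

```python
import time

class Wizard:
    """Toy stand-in for the wizard class; only the return shape matters here."""

    def _ask(self, query):
        return f"answer to {query!r}", ["ref1"]

    # timeit kwarg removed: metrics are always the last element of the output
    def chat(self, query, token_budget=None, new_info_threshold=0.7,
             print_references=False, return_chat_obj=False):
        start = time.time()
        answer, references = self._ask(query)
        performance = {"total_chat_time": time.time() - start}
        return answer, references, performance

answer, refs, perf = Wizard().chat("How do wind turbines work?")
```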

jeisenman23 (Author):

Fixed by removing argument

grantbuster (Member) commented Dec 17, 2024:

Hey @jeisenman23, there are still a few outstanding comments on this PR. Can you respond to the comments and make the appropriate edits?

I think this is good to go!


elm/web/osti.py Outdated
@@ -196,8 +196,8 @@ def _get_first(self):
.format(self._response.status_code,
self._response.reason))
raise RuntimeError(msg)
first_page = self._response.json()

raw_text = self._response.text.encode('utf-8').decode('unicode-escape')
jeisenman23 (Author):

When hitting the OSTI API, GitHub Actions was having trouble interpreting the JSON response because the delimiters were inconsistent: sometimes '/' and others '//'. This code solves it by encoding the response as UTF-8, so that no matter the delimiter, the encoding is the same; then we can decode properly.
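A self-contained illustration of why the encode/decode round-trip helps, assuming the underlying problem was doubly-escaped unicode sequences in the response body (the payload here is fabricated for the demo):

```python
import json

# Doubly-escaped payload: the body contains "\\u00e9" instead of "\u00e9".
body = '{"title": "Caf\\\\u00e9"}'

# Parsing directly leaves the escape sequence as literal text.
print(json.loads(body)["title"])  # Caf\u00e9

# encode/decode collapses the doubled backslash so json can decode it.
raw_text = body.encode('utf-8').decode('unicode-escape')
print(json.loads(raw_text)["title"])  # Café
```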

