feat: add solar pro llm and document parser #16099

JuHyung-Son · 2024-09-19T08:51:02Z

Description

Adds a new LLM model and the reader API from Upstage.

The new LLM model, solar-pro, is documented at https://developers.upstage.ai/docs/apis/chat .
The new document parsing model, Document Parse, is documented at https://developers.upstage.ai/docs/apis/document-parse .

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

Yes
No

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Added new unit/integration tests
Added new notebook (that tests end-to-end)
I stared at the code and made sure it makes sense

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

logan-markewich · 2024-09-19T15:58:20Z

llama-index-integrations/llms/llama-index-llms-upstage/llama_index/llms/upstage/base.py

@@ -207,3 +231,50 @@ def get_num_tokens_from_message(self, messages: Sequence[ChatMessage]) -> int:
                )
        num_tokens += tokens_suffix
        return num_tokens
+
+    @llm_retry_decorator
+    def _chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse:


Curious why we override _chat but not any other method like _stream_chat or _achat, etc.

I guess you only support parsing documents with .chat()?

There was a mistake in following the LlamaIndex interface. I've added overrides for some more methods.
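A minimal sketch of the concern raised above: if document handling is added only in `_chat`, the streaming and async paths bypass it without any error. The classes below are hypothetical stand-ins, not the actual Upstage or LlamaIndex classes, and the `docs` keyword argument and the `_prepare` helper are assumptions made for illustration. The idea is to route every chat entry point through one shared preprocessing step.

```python
import asyncio
from typing import Any, AsyncIterator, Iterator, List, Sequence


class BaseChatLLM:
    """Hypothetical parent class; NOT the real llama-index interface."""

    def _chat(self, messages: Sequence[str], **kwargs: Any) -> str:
        return " | ".join(messages)

    def _stream_chat(self, messages: Sequence[str], **kwargs: Any) -> Iterator[str]:
        for message in messages:
            yield message

    async def _achat(self, messages: Sequence[str], **kwargs: Any) -> str:
        return " | ".join(messages)

    async def _astream_chat(
        self, messages: Sequence[str], **kwargs: Any
    ) -> AsyncIterator[str]:
        for message in messages:
            yield message


class DocAwareLLM(BaseChatLLM):
    """Overrides all four entry points so they share the same document handling."""

    def _prepare(self, messages: Sequence[str], kwargs: dict) -> List[str]:
        # Remove the (assumed) `docs` argument so it is not forwarded to the parent,
        # and prepend each document to the conversation.
        docs = kwargs.pop("docs", None) or []
        return [f"[doc] {doc}" for doc in docs] + list(messages)

    def _chat(self, messages: Sequence[str], **kwargs: Any) -> str:
        return super()._chat(self._prepare(messages, kwargs), **kwargs)

    def _stream_chat(self, messages: Sequence[str], **kwargs: Any) -> Iterator[str]:
        return super()._stream_chat(self._prepare(messages, kwargs), **kwargs)

    async def _achat(self, messages: Sequence[str], **kwargs: Any) -> str:
        return await super()._achat(self._prepare(messages, kwargs), **kwargs)

    def _astream_chat(
        self, messages: Sequence[str], **kwargs: Any
    ) -> AsyncIterator[str]:
        # The parent is an async generator function, so calling it returns
        # an async iterator that can be handed back directly.
        return super()._astream_chat(self._prepare(messages, kwargs), **kwargs)
```

With this shape, a call such as `DocAwareLLM()._stream_chat(["hi"], docs=["a"])` sees the same prepared messages as `_chat`, so behavior does not depend on which entry point the caller happens to use.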

logan-markewich · 2024-09-19T15:59:07Z

llama-index-integrations/llms/llama-index-llms-upstage/llama_index/llms/upstage/base.py

+        for i, doc in enumerate(docs):
+            file_title = file_titles[min(i, len(file_titles) - 1)]
+            document_contents += f"{file_title}:\n{doc.text}\n\n"
+        print("DOCUMENT CONTENTS", document_contents)


Let's remove this print.
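One possible way to act on this comment while keeping the output available for debugging is to swap the `print` for a debug-level log call. The sketch below reuses the loop from the diff; the `Doc` dataclass, the `build_document_contents` function name, and the empty-titles check are assumptions added for illustration and are not part of the PR.

```python
import logging
from dataclasses import dataclass
from typing import Sequence

logger = logging.getLogger(__name__)


@dataclass
class Doc:
    """Hypothetical minimal document type exposing a `text` attribute."""

    text: str


def build_document_contents(docs: Sequence[Doc], file_titles: Sequence[str]) -> str:
    """Concatenate documents as 'title:\\ntext' blocks separated by blank lines."""
    if not file_titles:
        # Assumption: fail early instead of raising IndexError inside the loop.
        raise ValueError("file_titles must contain at least one title")

    document_contents = ""
    for i, doc in enumerate(docs):
        # When there are more documents than titles, reuse the last title.
        file_title = file_titles[min(i, len(file_titles) - 1)]
        document_contents += f"{file_title}:\n{doc.text}\n\n"

    # Emitted only when the application enables DEBUG logging for this module.
    logger.debug("DOCUMENT CONTENTS %s", document_contents)
    return document_contents
```

Library code that logs through a module-level logger stays silent by default and leaves the decision to show this output to the application.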

logan-markewich · 2024-09-19T15:59:50Z

llama-index-integrations/llms/llama-index-llms-upstage/pyproject.toml

 tokenizers = "^0.19.1"
 llama-index-core = "^0.11.0"
+llama-index-llms-openai = "^0.2.0"


we need to add the upstage reader as a dependency now right?

(should also bump the version of the integration package here)

Or, maybe we need to publish the reader first actually

Yeah, you are right. We should publish the reader first.

we need to add the upstage reader as a dependency now right?

yes, we need to add it

logan-markewich · 2024-09-19T16:00:29Z

llama-index-integrations/llms/llama-index-llms-upstage/poetry.lock

no need to check this in

logan-markewich · 2024-09-19T16:00:39Z

llama-index-integrations/readers/llama-index-readers-upstage/poetry.lock

no need to check this in

logan-markewich

Looks good, just a few things to clean up


JuHyung-Son added 2 commits September 19, 2024 17:34

feat: add solar pro llm and document parser

226f621

bump DP version

352cadd

dosubot bot added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Sep 19, 2024

Merge branch 'main' into solar-pro

8fbe5ef

logan-markewich reviewed Sep 19, 2024


logan-markewich self-assigned this Sep 19, 2024

JuHyung-Son added 6 commits October 8, 2024 22:02

remove poetry.lock

16b8139

add dependency

7aabd5c

add _stream_chat

4b76ace

Merge branch 'solar-pro' of https://github.com/JuHyung-Son/llama_index …

fb11104

…into solar-pro

Merge branch 'main' into solar-pro

f64cfd4

override some methods

a0fd60c

JuHyung-Son requested a review from logan-markewich October 8, 2024 13:15
