feat(anthropic): Add Anthropic PDF support (document type) in invoke #7496

Merged
merged 8 commits into from
Jan 18, 2025

Conversation

adhambadr
Contributor

Anthropic's message types have supported PDF documents since claude-3-5-sonnet-20240620.

You can send a PDF document to Anthropic, which extracts the text, converts each page to an image, and supplies the LLM with both (text + screenshot) for every page, so it can do in-depth analysis, text extraction and more. It's pretty neat and handy, especially for structured output. I added support for it in the LangChain ecosystem, because right now using the document type throws an unsupported type error before the message is passed to the LLM.

I added support for the document source type and simplified the source object, so you can pass either just the base64 string or the object exactly as in Anthropic's documentation.
I added a working example that you can run with yarn example examples/src/prompts/pdf_document.ts
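To illustrate the "base64 string or full object" idea, here is a minimal sketch. It is not the code from this PR: `normalizeDocumentSource` and `Base64PdfSource` are hypothetical names, and the object shape is assumed to follow the base64 PDF source described in Anthropic's documentation.

```typescript
// Assumed shape of a base64 PDF source, following Anthropic's documentation.
type Base64PdfSource = {
  type: "base64";
  media_type: "application/pdf";
  data: string;
};

// Hypothetical helper (not the PR's implementation): accept either the
// shorthand base64 string or the full source object, return the full object.
function normalizeDocumentSource(
  source: string | Base64PdfSource
): Base64PdfSource {
  if (typeof source === "string") {
    return { type: "base64", media_type: "application/pdf", data: source };
  }
  return source;
}

// "JVBERi0xLjQ=" is the base64 encoding of the ASCII bytes "%PDF-1.4".
const fromString = normalizeDocumentSource("JVBERi0xLjQ=");
const fromObject = normalizeDocumentSource({
  type: "base64",
  media_type: "application/pdf",
  data: "JVBERi0xLjQ=",
});
console.log(fromString, fromObject);
```

Both calls produce the same object, which is why callers can use whichever form is more convenient.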

Here is an example usage:

import fs from "node:fs";
import { ChatAnthropic } from "@langchain/anthropic";

const llm = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  // API key, e.g. apiKey: process.env.ANTHROPIC_API_KEY
});

// Option 1: load a local file and encode it as base64
const file = fs.readFileSync("test.pdf");
const base64 = file.toString("base64");

// Option 2: load remotely instead (e.g. in a web environment).
// Use this in place of option 1, since base64 can only be declared once.
// const res = await fetch("https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf");
// const buffer = await res.arrayBuffer();
// const base64 = Buffer.from(buffer).toString("base64");

const prompt = "Summarize the contents of this document for me";
const { content } = await llm.invoke([
  {
    role: "user",
    content: [
      {
        type: "text",
        text: prompt,
      },
      {
        // New content block type added by this PR
        type: "document",
        source: base64,
      },
    ],
  },
]);

console.log(content);
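If you send several PDFs, the message literal above can be wrapped in a small helper. This is only a sketch: `buildPdfMessage` and the block types are hypothetical names that mirror the example, using the shorthand where `source` is the base64 string.

```typescript
// Local types mirroring the message shape used in the example above.
type TextBlock = { type: "text"; text: string };
type DocumentBlock = { type: "document"; source: string };
type UserMessage = {
  role: "user";
  content: Array<TextBlock | DocumentBlock>;
};

// Hypothetical helper: pair a text prompt with one base64-encoded PDF.
function buildPdfMessage(prompt: string, pdfBase64: string): UserMessage {
  if (pdfBase64.length === 0) {
    throw new Error("pdfBase64 must not be empty");
  }
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "document", source: pdfBase64 },
    ],
  };
}

// "JVBERi0xLjQ=" is a placeholder payload (base64 of "%PDF-1.4"), not a real PDF.
const message = buildPdfMessage("Summarize this document", "JVBERi0xLjQ=");
console.log(JSON.stringify(message, null, 2));
```

With the `llm` from the example, the call would then be `await llm.invoke([buildPdfMessage(prompt, base64)])`.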

It's my first PR to this project, so apologies if I missed something crucial. Feedback and improvements are welcome, as well as a shoutout to my Twitter.

Supported models as of Jan 2025:

  1. claude-3-5-sonnet-20240620
  2. claude-3-5-sonnet-20241022
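A caller could guard against unsupported models before sending a PDF. The sketch below hard-codes the two models listed above, so it is only a snapshot as of Jan 2025; `PDF_CAPABLE_MODELS` and `supportsPdf` are hypothetical names, not part of `@langchain/anthropic`.

```typescript
// Snapshot of the models listed in this PR description (Jan 2025).
// This list is an assumption that will go stale; check Anthropic's docs.
const PDF_CAPABLE_MODELS: readonly string[] = [
  "claude-3-5-sonnet-20240620",
  "claude-3-5-sonnet-20241022",
];

// Hypothetical guard: true if the model name is in the snapshot list.
function supportsPdf(model: string): boolean {
  return PDF_CAPABLE_MODELS.includes(model);
}

console.log(supportsPdf("claude-3-5-sonnet-20241022"));
console.log(supportsPdf("some-other-model"));
```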

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. auto:improvement Medium size change to existing code to handle new use-cases labels Jan 10, 2025
@jacoblee93
Collaborator

Ah nice!

@jacoblee93 jacoblee93 left a comment

Thanks for flagging this - see comment!

Review comment on libs/langchain-anthropic/src/utils/message_inputs.ts (outdated, resolved)
@camwardy

Thanks for adding this @adhambadr, hopefully this can get merged soon as it'd be really useful for us!

@jacoblee93 jacoblee93 changed the title Add Anthropic PDF support (document type) in invoke feat(anthropic): Add Anthropic PDF support (document type) in invoke Jan 18, 2025
@jacoblee93 jacoblee93 merged commit 94467fa into langchain-ai:main Jan 18, 2025
33 of 34 checks passed
@jacoblee93
Collaborator

Thank you!

@adhambadr
Contributor Author

Thank you!

Thanks a lot for the cleanup (removing console, unnecessary import, etc.)!

FilipZmijewski added a commit to FilipZmijewski/langchainjs that referenced this pull request Jan 30, 2025
* Rename auth method in docs

* fix(core): Fix trim messages mutation bug (langchain-ai#7547)

* release(core): 0.3.31 (langchain-ai#7548)

* fix(community): Updated Embeddings URL (langchain-ai#7545)

* fix(community): make sure guardrailConfig can be added even with anthropic models (langchain-ai#7542)

* docs: Fix PGVectorStore import in install dependencies (TypeScript) example (langchain-ai#7533)

* fix(community): Airtable url (langchain-ai#7532)

* docs: Fix typo in OpenAIModerationChain example (langchain-ai#7528)

* docs: Resolves langchain-ai#7483, resolves langchain-ai#7274 (langchain-ai#7505)

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* docs: Rename auth method in IBM docs (langchain-ai#7524)

* docs: correct misspelling (langchain-ai#7522)

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* release(community): 0.3.25 (langchain-ai#7549)

* feat(azure-cosmosdb): add session context for a user mongodb (langchain-ai#7436)

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* release(azure-cosmosdb): 0.2.7 (langchain-ai#7550)

* fix(ci): Fix build (langchain-ai#7551)

* feat(anthropic): Add Anthropic PDF support (document type) in invoke (langchain-ai#7496)

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* release(anthropic): 0.3.12 (langchain-ai#7552)

* chore(core,langchain,community): Relax langsmith deps (langchain-ai#7556)

* release(community): 0.3.26 (langchain-ai#7557)

* release(core): 0.3.32 (langchain-ai#7558)

* Release 0.3.12 (langchain-ai#7559)

* Add deployment chat to chat class

* Upadate Watsonx sdk

* Rework interfaces in llms as well

* Bump watsonx-ai sdk version

* Remove unused code

* Add fake auth

---------

Co-authored-by: Jacob Lee <jacoblee93@gmail.com>
Co-authored-by: Jacky Chen <jackychen4@gmail.com>
Co-authored-by: Mohamed Belhadj <medbelh@gmail.com>
Co-authored-by: Brian Ploetz <bploetz@gmail.com>
Co-authored-by: Eduard-Constantin Ibinceanu <ibinceanu.eduard@yahoo.com>
Co-authored-by: Jonathan V <jonathanvelkeneers@hotmail.com>
Co-authored-by: ucev <zhangshuaiyf@icloud.com>
Co-authored-by: crisjy <cjy1994116@163.com>
Co-authored-by: Adham Badr <adhambadr2@gmail.com>