feat: [anthropic] Claude 3.7 Sonnet with extended thinking #1370

salman1993 · 2025-02-25T00:11:46Z

Claude 3.7 Sonnet works out of the box without extended thinking (the changes in this PR are not needed for that). To enable extended thinking, set these env vars:

cargo build

# configure Anthropic provider with model 'claude-3-7-sonnet-latest'

# start a session
GOOSE_CLI_SHOW_THINKING=true ANTHROPIC_THINKING_ENABLED=true ./target/debug/goose session

# test on a string
GOOSE_CLI_SHOW_THINKING=true ANTHROPIC_THINKING_ENABLED=true ./target/debug/goose run --text "can you explain the code in crates/goose-cli/src/session/output.rs?"

see longer discussion post on reasoning: #1300

salman1993 · 2025-02-25T22:21:03Z

tested in goose-server with this script:

Build & start goosed: cargo build; ANTHROPIC_THINKING_ENABLED=true ./target/debug/goosed agent
Run script in new terminal

#!/bin/bash
set -euxo pipefail
IFS=$'\n\t'

# Send request to create an agent
curl --request POST \
  --url http://localhost:3000/agent \
  --header 'Content-Type: application/json' \
  --header 'X-Secret-Key: test' \
  --data '{
    "version": "truncate",
    "provider": "anthropic"
  }'

sleep 5

# Add the developer extension
curl --request POST \
  --url http://localhost:3000/extensions/add \
  --header 'Content-Type: application/json' \
  --header 'X-Secret-Key: test' \
  --data '{
    "type": "builtin",
    "name": "developer"
  }'

sleep 5

# Send a user message 
curl --request POST \
  --url http://localhost:3000/reply \
  --header 'Accept: text/event-stream' \
  --header 'Content-Type: application/json' \
  --header 'X-Secret-Key: test' \
  --header 'x-protocol: data' \
  --data '{
  "messages": [
    {
      "role": "user",
      "created": 1740670518,
      "content": [
        {
          "type": "text",
          "text": "what tools do you have? be concise"
        }
      ]
    }
  ]
}'

Output

data: {"type":"Message","message":{"role":"assistant","created":1740672176,"content":[ 
{"type":"thinking","thinking":"I need to identify what tools are available to me based on the function specifications provided. Let me list them concisely:\n\nFrom the function specifications, I have:\n\n1. `developer__shell` - Execute shell commands\n2. `developer__text_editor` - Edit, view, or create files\n3. `developer__list_windows` - List available window titles for screenshots\n4. `developer__screen_capture` - Capture screenshots of displays or windows\n\nThese tools appear to be part of the \"developer\" extension that allows me to edit code files, run shell commands, and capture screenshots.","signature":"EugBCkYIARgCIkDUanI6laxqz0y/0tsfGT20+sdfdsfdf+sadasd+JABiEcbXHYESBV0qhAXGfpDCjBCzjdfEL/2/c3OxYD/aELuB5WF4CEm0bdSCL9I54GsFmdQ=="}, 
{"type":"text","text":"## Available Tools\n\nI currently have access to these tools from the developer extension:\n\n1. **Shell** - Run shell commands\n2. **Text Editor** - View and edit files\n3. **List Windows** - List available window titles \n4. **Screen Capture** - Take screenshots\n\nThese tools allow me to help with code development, file management, and visual debugging."}]}}

data: {"type":"Finish","reason":"stop"}
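The response above carries two content parts: a `thinking` part (with a `signature`) and a `text` part. A client that wants to hide the reasoning unless asked, which is presumably what GOOSE_CLI_SHOW_THINKING controls in the CLI, could filter on the part type. The sketch below is only an illustration: `ContentPart` and `render` are hypothetical names, not goose's actual `MessageContent` type or rendering code.

```rust
/// Simplified, hypothetical stand-in for the content parts in the output above.
/// goose's real `MessageContent` type has more variants and fields.
#[derive(Debug, Clone, PartialEq)]
enum ContentPart {
    Thinking { thinking: String, signature: String },
    Text { text: String },
}

/// Renders the parts in order, skipping thinking parts unless `show_thinking` is true.
fn render(parts: &[ContentPart], show_thinking: bool) -> String {
    let mut out = Vec::new();
    for part in parts {
        match part {
            ContentPart::Thinking { thinking, .. } if show_thinking => {
                out.push(format!("[thinking] {thinking}"));
            }
            // Thinking parts are dropped when the flag is off.
            ContentPart::Thinking { .. } => {}
            ContentPart::Text { text } => out.push(text.clone()),
        }
    }
    out.join("\n")
}
```

Calling `render(&parts, false)` on a thinking part followed by a text part returns only the text, while `render(&parts, true)` prefixes the reasoning with `[thinking]`.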

Penagwin · 2025-02-27T18:32:46Z

crates/goose/src/providers/formats/anthropic.rs

@@ -243,10 +274,13 @@ pub fn create_request(
        return Err(anyhow!("No valid messages to send to Anthropic API"));
    }

+    // https://docs.anthropic.com/en/docs/about-claude/models/all-models#model-comparison-table
+    // Claude 3.7 supports max output tokens up to 8192


The output tokens can now be up to 128k if you add a beta header (this is new with Sonnet 3.7, but it's a little hidden).

I'm surprised there isn't an error, since the default thinking budget in this PR is 16000, which is greater than 8192?

This feature can be enabled by passing an anthropic-beta header of output-128k-2025-02-19.

https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#extended-output-capabilities-beta

salman1993 · Feb 27, 2025

We set max_tokens to the sum of the configured max_tokens and budget_tokens.

Penagwin · Feb 27, 2025

Okay, that makes sense then. I see it on line 325.

salman1993 · Feb 27, 2025

@Penagwin I added the beta headers. I'm going to keep the default max_tokens for now because it's pretty high, especially if the model doesn't use up all the budget tokens. We will allow users to configure these params in the future.
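The token accounting discussed in this thread can be sketched as follows. This is a minimal illustration rather than the code in anthropic.rs: the function name `effective_max_tokens` and both constants are assumptions, using the 8192 output limit from the diff comment and the 16000 default thinking budget mentioned above.

```rust
/// Output limit quoted in the diff comment above (assumed default).
const DEFAULT_MAX_TOKENS: u32 = 8192;
/// Default thinking budget mentioned in the review thread.
const DEFAULT_BUDGET_TOKENS: u32 = 16000;

/// Hypothetical helper: returns the `max_tokens` value to send in the request.
/// With thinking enabled, the budget is added on top of the output limit, so
/// `budget_tokens` stays below `max_tokens` as long as the limit is nonzero.
fn effective_max_tokens(max_tokens: u32, budget_tokens: Option<u32>) -> u32 {
    match budget_tokens {
        Some(budget) => max_tokens.saturating_add(budget),
        None => max_tokens,
    }
}
```

With these defaults the request would carry max_tokens = 8192 + 16000 = 24192, which would explain why the API does not reject a thinking budget larger than 8192.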

baxen

LGTM!

baxen · 2025-02-26T15:01:59Z

crates/goose-server/src/routes/reply.rs

@@ -247,6 +270,18 @@ async fn stream_message(
                                .await?;
                        }
                    }
+                    MessageContent::Thinking(content) => {


might skip this until we implement? in part because i'm refactoring this

baxen · 2025-02-27T22:25:40Z

crates/goose/src/providers/anthropic.rs

            .json(&payload)
            .send()
            .await?;

        let status = response.status();
        let payload: Option<Value> = response.json().await.ok();

+        if std::env::var("GOOSE_DEBUG").is_ok() {


nit: do we use this convention elsewhere? It definitely seems useful, but I'm not sure it makes more sense than tracing.
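One detail about the check in this diff: `std::env::var(...).is_ok()` is true whenever the variable is set to valid Unicode, whatever its value, so `GOOSE_DEBUG=false` would also take the debug branch. A minimal sketch of the difference follows; `flag_is_set`, `flag_is_truthy`, and `goose_debug_enabled` are hypothetical helpers, not functions in goose.

```rust
use std::env;

/// Presence check, equivalent to `.is_ok()` on `env::var`: any value counts,
/// including "" and "false".
fn flag_is_set(raw: Option<&str>) -> bool {
    raw.is_some()
}

/// Hypothetical stricter check: only "1" or "true" (ignoring surrounding
/// whitespace) enables the flag.
fn flag_is_truthy(raw: Option<&str>) -> bool {
    matches!(raw.map(str::trim), Some("1") | Some("true"))
}

/// Reads GOOSE_DEBUG from the environment using the stricter rule.
fn goose_debug_enabled() -> bool {
    flag_is_truthy(env::var("GOOSE_DEBUG").ok().as_deref())
}
```

Whether the stricter rule is wanted here is a design choice; a tracing-based approach, as suggested above, would avoid the custom env var entirely.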

salman1993 added 4 commits February 24, 2025 18:52

stash changes - compiles ok

b18d3df

CLI is working with claude 3.7 extended thinking

40ae485

set env var only for the thinking test

07eeb8b

add vercel data stream parts

e882b97

salman1993 changed the title ~~feat: Claude 3.7 Sonnet~~ feat: Claude 3.7 Sonnet with extended thinking Feb 26, 2025

clippy, bump output tokens to max 8192

1dcd4a9

salman1993 mentioned this pull request Feb 26, 2025

[draft - failing] claude3.7 sonnet with databricks provider #1394

Open

salman1993 requested a review from alexhancock February 27, 2025 14:55

salman1993 marked this pull request as ready for review February 27, 2025 14:55

salman1993 changed the title ~~feat: Claude 3.7 Sonnet with extended thinking~~ feat: [anthropic] Claude 3.7 Sonnet with extended thinking Feb 27, 2025

merge main, resolve conflicts

82ad1fb

Penagwin reviewed Feb 27, 2025


salman1993 added 2 commits February 27, 2025 15:02

add anthropic-beta headers for tool use and 128k output tokens

ba484e8

fmt

0dbf2d2

salman1993 requested review from ahau-square and zakiali February 27, 2025 20:05

baxen approved these changes Feb 27, 2025
