-
Notifications
You must be signed in to change notification settings - Fork 622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: [anthropic] Claude 3.7 Sonnet with extended thinking #1370
base: main
Are you sure you want to change the base?
Conversation
tested in goose-server with this script:
#!/bin/bash
set -euxo pipefail
IFS=$'\n\t'
# Send request to create an agent
curl --request POST \
--url http://localhost:3000/agent \
--header 'Content-Type: application/json' \
--header 'X-Secret-Key: test' \
--data '{
"version": "truncate",
"provider": "anthropic"
}'
sleep 5
# Add a system
curl --request POST \
--url http://localhost:3000/extensions/add \
--header 'Content-Type: application/json' \
--header 'X-Secret-Key: test' \
--data '{
"type": "builtin",
"name": "developer"
}'
sleep 5
# Send a user message
curl --request POST \
--url http://localhost:3000/reply \
--header 'Accept: text/event-stream' \
--header 'Content-Type: application/json' \
--header 'X-Secret-Key: test' \
--header 'x-protocol: data' \
--data '{
"messages": [
{
"role": "user",
"created": 1740670518,
"content": [
{
"type": "text",
"text": "what tools do you have? be concise"
}
]
}
]
}' Output
|
@@ -243,10 +274,13 @@ pub fn create_request( | |||
return Err(anyhow!("No valid messages to send to Anthropic API")); | |||
} | |||
|
|||
// https://docs.anthropic.com/en/docs/about-claude/models/all-models#model-comparison-table | |||
// Claude 3.7 supports max output tokens up to 8192 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output tokens can now be up to 128k if you add a beta header (this is new with Sonnet 3.7 but its a little hidden).
I'm surprised there isn't an error since the default thinking budget in this PR is 16000
which is less than 8192
?
This feature can be enabled by passing an
anthropic-beta
header ofoutput-128k-2025-02-19
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we set max_tokens to be the sum of (max_tokens + budget_tokens)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay that makes sense then, I see on line 325
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Penagwin i added the beta headers. gonna keep the default max_tokens for now cause its pretty high especially if the model doesn't use up all the budget tokens. we will allow users to configure these params in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@@ -247,6 +270,18 @@ async fn stream_message( | |||
.await?; | |||
} | |||
} | |||
MessageContent::Thinking(content) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might skip this until we implement? in part because i'm refactoring this
.json(&payload) | ||
.send() | ||
.await?; | ||
|
||
let status = response.status(); | ||
let payload: Option<Value> = response.json().await.ok(); | ||
|
||
if std::env::var("GOOSE_DEBUG").is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do we use this standard elsewhere? definitely seems useful but i'm not sure it makes more sense necessarily than tracing
Claude 3.7 Sonnet works out of the box without extended thinking (changes in this PR not needed). To enable extended thinking, we have set some env vars:
see longer discussion post on reasoning: #1300