Claude 3 API: tool use
Claude 3 offers cost-efficient tool use at a SOTA level.
Introduction
Claude 3 tool use is priced in the following way: the API call is priced exactly the same as a normal API call, but additional “system prompt tokens” are added on top:
- Claude 3 Opus: 395 tokens
- Claude 3 Sonnet: 159 tokens
- Claude 3 Haiku: 264 tokens
All API calls consume additional tokens when tools are used. These extra tokens cover the tool definitions (names, descriptions and input schemas) as well as the tool_use and tool_result content blocks.
Compared to OpenAI, Claude 3 offers equal and arguably better performance in response quality, speed and price in non-tool-use cases for LLMs and VLMs.
Based on today’s general release, Claude 3 matches GPT-4 in tool use performance as well.
So, let’s get started.
Tool use
Let’s import the libraries:
#!pip install anthropic  # first time only
import anthropic
import os
Import the API key from the environment variable and initialize the client object.
key = "anthropic_key"  # name of the environment variable holding the API key
client = anthropic.Anthropic(api_key=os.getenv(key))
We are now ready to make API calls.
response = client.beta.tools.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature, either \"celsius\" or \"fahrenheit\""
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[{"role": "user", "content": "What is the weather like in San Francisco?"}]
)
I can now read the initial API response.
print(response)
The initial response includes “stop_reason”: “tool_use”, which indicates that the API is waiting to receive the tool result back from the user side.
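Before returning a tool result, we need the id of the tool_use block Claude produced. Here is a minimal sketch for extracting it, assuming the Anthropic Python SDK’s content blocks expose type, id, name and input attributes:
# Pull the tool_use block out of the response content.
tool_use = next(block for block in response.content if block.type == "tool_use")
print(tool_use.id)     # e.g. "toolu_...", needed for the tool_result message
print(tool_use.name)   # "get_weather"
print(tool_use.input)  # e.g. {"location": "San Francisco, CA"}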
In essence, the user has sent a request, and the Claude API has converted it into a response that declares the need to use a tool. So, let’s define a tool response of the kind we could receive back from a weather-tool API:
{
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": "toolu_xxxxxxxxxxxxxxxx",
            "content": "65 degrees"
        }
    ]
}
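In a real application, this content would come from actually running the tool on our side. A hypothetical stub, standing in for a real weather API (the function body and its hard-coded result are illustrative only):
def get_weather(location: str, unit: str = "fahrenheit") -> str:
    # Stub: a real implementation would query a weather service here.
    return "65 degrees"

# Run the tool with the arguments Claude requested in the tool_use block.
tool_result_content = get_weather(**tool_use.input)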
So, I can now send the weather tool-API result back to the Claude API, so it can generate a response for the end user. The API call is the same, except that we add two additional entries to “messages”:
- the previous Claude API response, with the “assistant” role
- the weather-tool API response, with the “user” role and the “tool_use_id” from the previous Claude API call.
So, the final API call looks like the following:
response_final = client.beta.tools.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature, either \"celsius\" or \"fahrenheit\""
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "What is the weather like in San Francisco?"},
        {"role": "assistant", "content": response.content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    # e.g. tool_use.id from the extraction earlier
                    "tool_use_id": "ADD_HERE_TOOL_USE_ID_FROM_PRIOR_API_CALL",
                    "content": [{"type": "text", "text": "65 degrees"}]
                }
            ]
        }
    ]
)
The result is the final response:
print(response_final.content[0].text)
We have now responded to the end user.
Conclusion
Using the Claude 3 Haiku model, the entire flow consumed the following, which I think is very efficient token usage:
- 891 input tokens
- 106 output tokens
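To put this in perspective, here is a back-of-the-envelope cost sketch, assuming Claude 3 Haiku’s launch pricing of $0.25 per million input tokens and $1.25 per million output tokens (check current pricing before relying on these figures):
# Assumed launch pricing for Claude 3 Haiku, USD per million tokens.
input_price = 0.25
output_price = 1.25
cost = 891 * input_price / 1_000_000 + 106 * output_price / 1_000_000
print(f"${cost:.6f}")  # roughly $0.00036 for the whole two-call flow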
I think this is great news, because so far only GPT-4 has offered a sufficient level of performance in tool use.
Claude 3 now offers high-quality tool use at an affordable price.