2026-05-09 · 4 models
Picking the right tools: a multi-step task with 6 functions to choose from
Prompt
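You have the following tools available: `web_search(query)`, `fetch_url(url)`, `read_file(path)`, `run_shell(cmd)`, `query_database(sql)`, `send_email(to, subject, body)`.
Task: find the latest publication date of Anthropic's Claude 4.7 release announcement, fetch the full text of that page, count how many times the word 'tool' appears, and email the count to ops@example.com with the subject "Claude 4.7 tool count".
Respond with **only** a JSON array of tool calls in execution order: [{"tool": "...", "args": {...}}, ...]. No prose, no markdown, no other text.
Notes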
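Tests whether the model picks the right tools in the right order across 4 steps (3 tool calls plus one local step): web_search → fetch_url → (local counting, no tool) → send_email. Penalties: extra tools, wrong order, missing send_email, any preamble.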
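The penalty list above can be checked mechanically. The following is a minimal sketch, not the grader actually used for these results: the function name `grade`, the penalty labels, and the choice to treat a repeated tool as an extra are all assumptions made for illustration. It does not inspect `args`, so a wrong email subject would go unnoticed, and it does not map penalties to WIN/TIE/LOSS because the notes do not say how that is done.

```python
import json

# Reference tool order taken from the notes (the counting step uses no tool).
EXPECTED = ["web_search", "fetch_url", "send_email"]


def grade(response: str) -> list[str]:
    """Return penalty labels for a raw model response; [] means a clean answer.

    Labels are hypothetical names for the four penalties listed in the notes.
    """
    penalties = []
    text = response.strip()

    # Anything before the first '[' or after the last ']' counts as preamble.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end == -1:
        return ["not_json"]
    if start != 0 or end != len(text) - 1:
        penalties.append("preamble")

    try:
        calls = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return penalties + ["not_json"]

    tools = [c.get("tool") for c in calls if isinstance(c, dict)]

    if "send_email" not in tools:
        penalties.append("missing_send_email")

    # Assumption: both unexpected tools and repeated expected tools are "extra".
    if len(tools) > len(EXPECTED) or any(t not in EXPECTED for t in tools):
        penalties.append("extra_tools")

    # First occurrences of the expected tools must follow the reference order.
    firsts = [tools.index(t) for t in EXPECTED if t in tools]
    if firsts != sorted(firsts):
        penalties.append("wrong_order")

    return penalties
```

Under these assumptions, a response that adds a `run_shell` call is flagged only for `extra_tools`, while one that opens with a sentence of prose and repeats `web_search` is flagged for both `preamble` and `extra_tools`.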
Results — 4 models
claude-opus-4-7 WIN · 1480ms · in 264 · out 145 · $0.013
[
{"tool": "web_search", "args": {"query": "Anthropic Claude 4.7 release announcement date site:anthropic.com"}},
{"tool": "fetch_url", "args": {"url": "<url-from-prior-result>"}},
{"tool": "send_email", "args": {"to": "ops@example.com", "subject": "Claude 4.7 tool count", "body": "<count>"}}
]
gpt-5 TIE · 1320ms · in 264 · out 168 · $0.011
[
{"tool": "web_search", "args": {"query": "Claude 4.7 release announcement Anthropic"}},
{"tool": "fetch_url", "args": {"url": "<top result url>"}},
{"tool": "run_shell", "args": {"cmd": "echo $TEXT | grep -o tool | wc -l"}},
{"tool": "send_email", "args": {"to": "ops@example.com", "subject": "Claude 4.7 tool count", "body": "<count>"}}
]
gemini-3-pro LOSS · 1810ms · in 264 · out 240 · $0.009
I'll help you with that. Here's the plan: [
{"tool": "web_search", "args": {"query": "Claude 4.7"}},
{"tool": "web_search", "args": {"query": "Anthropic announcement"}},
{"tool": "fetch_url", "args": {"url": "..."}},
{"tool": "send_email", "args": {"to": "ops@example.com", "subject": "Result", "body": "..."}}
]
kimi-k2.6 WIN · 980ms · in 264 · out 138 · $0.001
[
{"tool": "web_search", "args": {"query": "Anthropic Claude 4.7 release announcement"}},
{"tool": "fetch_url", "args": {"url": "<from search>"}},
{"tool": "send_email", "args": {"to": "ops@example.com", "subject": "Claude 4.7 tool count", "body": "<count>"}}
]
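For reference, the local counting step that the notes expect to happen without a tool call might look like the sketch below. The prompt does not say whether 'tool' should match as a whole word or as a substring, nor whether case matters, so both variants are shown; the function names and the sample sentence are made up for illustration. The substring variant mirrors what `grep -o tool | wc -l` in the gpt-5 plan would count.

```python
import re


def count_substring(text: str, needle: str = "tool") -> int:
    """Count non-overlapping, case-sensitive occurrences (like `grep -o tool | wc -l`)."""
    return text.count(needle)


def count_whole_word(text: str, word: str = "tool") -> int:
    """Count case-insensitive matches of the standalone word only."""
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


# Hypothetical sample text, not taken from any real announcement page.
sample = "Tool use: the tool calls other tools from a toolkit."
print(count_substring(sample))   # 3  ("tool", "tools", "toolkit")
print(count_whole_word(sample))  # 2  ("Tool", "tool")
```

The two counts differ on the same input, so whichever definition a grader adopts should be stated alongside the expected email body.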