Skip to content
AI-Daily-Builder

2026-05-18 views

xAI ships Grok Build CLI: 8 concurrent subagents, 70.8% SWE-Bench, $99 intro price

Read this because The 8-parallel-subagent design, not the benchmark score, is the structural choice worth watching. If it holds, the cost model flips from "tokens per task" to "tasks per wall-clock minute" — every Claude Code/Codex shop needs to re-benchmark on throughput, not accuracy.

May 14 public beta. SWE-Bench 70.8%, 256K context, $0.20/$1.50 per 1M tokens, $99 intro. 8 subagents on git branches turns the race four-way.

xAI pushed its first agentic coding CLI, Grok Build, into public beta on May 14. Elon Musk personally recruited testers on X within hours of launch. The shipping bar is real: 70.8% on SWE-Bench Verified, 256K context, 8 concurrent subagents on separate git branches, and an intro price that materially undercuts every incumbent.

Specs that matter

SpecGrok BuildClaude Code (Sonnet 4.6)OpenAI Codex
SWE-Bench Verified70.8%~70%~68%
Context256K1M (Sonnet 4.6 large)200K
API input$0.20 / 1M$3.00 / 1M$1.50 / 1M
API output$1.50 / 1M$15.00 / 1M$10.00 / 1M
Subscription$99/mo intro, $299/mo standard$20–$200/mo$20–$200/mo
Parallel subagents8 concurrentsub-task spawningsub-task spawning

The API pricing is the most aggressive part. Input at $0.20 per 1M is 15× cheaper than Claude Sonnet 4.6 and 7.5× cheaper than OpenAI Codex. Output at $1.50 per 1M is 10× and 6.7× cheaper respectively.

The 8-subagent design

The structural bet:

The shift in mental model: a coding session stops being “one agent doing one thing slowly” and becomes “8 agents doing 8 things in parallel, each in their own sandbox.” Whether you save wall-clock time depends entirely on how well your task decomposes.

What’s actually new vs prior art

The pricing tactic

$99/mo for 6 months vs. $299/mo standard is a deliberate land-grab. xAI is doing what every late entrant to a category does: trade margin for share. The math:

If Grok Build matches Claude Code on day-to-day tasks (open question — benchmark scores don’t tell the whole story), the per-seat economics force evaluation. The risk is the post-6-month renewal at $299 — xAI is betting that switching cost (codebase context, prompt tuning, workflow muscle memory) keeps teams locked in once the cheap window closes.

Distribution + setup

Distribution is through x.ai/cli — same pattern Anthropic and OpenAI use. No app-store fight, no MDM friction, but no enterprise procurement story either. The product targets individual developers and small teams first; the enterprise SKU is presumably gated behind an SSO + audit-log story xAI hasn’t shipped yet.

Practitioner note

For teams already on Claude Code or Codex:

The under-considered angle: the dev-tools coding-agent market is now a four-way commodity race. When SWE-Bench scores cluster in the 68–71% band across four vendors and API prices vary 15×, the bottleneck stops being model quality and becomes integration depth — how well the agent reads your codebase conventions, your test suite, your CI, your team norms. The next 18 months are about which vendor builds the deepest hooks into your existing stack, not which one tops a benchmark by 2 points.


Sources

Tags

Tip