AI-Daily-Builder

Tags · #coding-agents

OpenAI Acquires Ona (Formerly Gitpod) to Give Codex Agents Persistent, Secure Cloud Environments

Read this because: The race in agentic coding has shifted from model quality to runtime. OpenAI is buying the boring, hard infrastructure (persistence, sandboxing, audit) that turns a clever agent into a deployable one.

OpenAI is acquiring Ona — the cloud dev-environment firm formerly known as Gitpod — to let Codex agents run longer tasks in persistent, audited environments.

xAI finishes training Grok V9-Medium, a 1.5T-param model tuned on Cursor developer data

Read this because: The headline isn't the 1.5T parameter count; it's the corpus. Tuning a frontier model on Cursor's real developer workflows is a direct bid for the coding layer Claude and Codex dominate. Treat the benchmarks and timeline as vendor-sourced until weights or an API ship.

Musk says xAI's 1.5T-param Grok V9-Medium finished training on May 25. It is roughly 3x the size of xAI's production model and was trained on Cursor developer data; a mid-June release is expected.

Cognition raises $1B at $26B — the agent-as-headcount bet, with 90% of its own code AI-written

Read this because: A ~53x ARR multiple is a bet on agent-as-headcount, not agent-as-tool. The flywheel is the proof and the risk: Cognition writes ~90% of its own code with Devin, so its growth and its demo are the same thing, until growth slows.

Cognition, maker of the Devin coding agent, raised $1B+ at a $26B valuation (May 27), up 2.5x in 8 months, with $492M ARR and ~90% of its own code AI-written.
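The ~53x figure follows directly from the two reported numbers. A minimal sketch of the arithmetic, assuming the $26B valuation and $492M ARR are as reported and that "2.5x in 8 months" refers to valuation (the item does not say so explicitly); the variable names are illustrative:

```python
# Figures as reported in the item above (USD); treat them as press-sourced.
valuation_usd = 26e9
arr_usd = 492e6

# Valuation-to-ARR multiple quoted as "~53x".
arr_multiple = valuation_usd / arr_usd
print(f"Valuation / ARR = {arr_multiple:.1f}x")

# ASSUMPTION: the 2.5x step-up applies to valuation, which would imply
# a valuation of roughly valuation / 2.5 eight months earlier.
implied_prior_valuation = valuation_usd / 2.5
print(f"Implied prior valuation = ${implied_prior_valuation / 1e9:.1f}B")
```

The multiple works out to about 52.8x, which rounds to the ~53x in the blurb.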

Google Gemini 3.5 Flash beats last quarter's Pro flagship on agentic tasks

Read this because: The signal is the price-performance inversion: a budget tier now out-runs last quarter's flagship on agentic throughput-per-dollar. If you sized infra around Pro-tier pricing, your unit economics just improved without a code change.

At I/O 2026, Gemini 3.5 Flash beats Gemini 3.1 Pro on coding and agent benchmarks at $1.50/$9 per 1M tokens: 76.2% vs. 70.3% on Terminal-Bench, 4x faster, at half the cost.
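To translate per-token pricing into a per-task number, here is a rough sketch. It assumes the quoted $1.50/$9 are the input and output prices per 1M tokens, respectively; the function name and the token counts are made up for illustration and do not come from any benchmark:

```python
def cost_usd(input_tokens, output_tokens,
             in_price_per_m=1.50, out_price_per_m=9.00):
    """Estimate the cost of one task from token counts and per-1M-token prices.

    ASSUMPTION: defaults are the quoted Gemini 3.5 Flash prices, read as
    input/output price per 1M tokens.
    """
    total = input_tokens * in_price_per_m + output_tokens * out_price_per_m
    return total / 1_000_000

# Hypothetical agentic task: 200K tokens read, 20K tokens generated.
task_cost = cost_usd(200_000, 20_000)
print(f"Estimated cost per task: ${task_cost:.2f}")
```

With these illustrative token counts the estimate is about $0.48 per task; substitute your own measured token usage to see how the pricing affects your unit economics.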

xAI ships Grok Build CLI: 8 concurrent subagents, 70.8% SWE-Bench, $99 intro price

Read this because: The 8-parallel-subagent design, not the benchmark score, is the structural choice worth watching. If it holds, the cost model flips from "tokens per task" to "tasks per wall-clock minute", and every Claude Code/Codex shop needs to re-benchmark on throughput, not accuracy.

Public beta launched May 14. SWE-Bench 70.8%, 256K context, $0.20/$1.50 per 1M tokens, $99 intro price. Running 8 subagents on git branches turns the race four-way.
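The "tasks per wall-clock minute" framing can be sketched as a simple throughput calculation. This is an idealized model, assuming every subagent stays busy and that merging branches adds no overhead; the 10-minute task duration and the function name are hypothetical:

```python
def tasks_per_minute(concurrency, minutes_per_task):
    """Idealized throughput: tasks finished per wall-clock minute.

    ASSUMPTION: perfect parallelism, no coordination or merge overhead.
    """
    return concurrency / minutes_per_task

# Hypothetical: each task takes 10 wall-clock minutes on one agent.
single_agent = tasks_per_minute(concurrency=1, minutes_per_task=10)
eight_subagents = tasks_per_minute(concurrency=8, minutes_per_task=10)

print(f"1 agent:     {single_agent:.1f} tasks/min")
print(f"8 subagents: {eight_subagents:.1f} tasks/min")
```

Under these assumptions throughput scales linearly with concurrency while tokens per task stay the same, which is why a throughput benchmark can rank tools differently from an accuracy benchmark. Real runs will fall short of the ideal once conflicts between branches have to be resolved.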
