2026-05-18

Recursive Superintelligence emerges with $650M to build self-improving AI

Read this because NVIDIA and AMD on the same cap table is the buried signal: the bet is on a workload (recursive search over architectures) that burns cycles on whichever silicon is available — not a model family loyal to one vendor.

$650M raise at a $4.65B valuation, pre-product, <30 staff. GV and Greycroft led; NVIDIA and AMD both joined. Thesis: AI that automates its own architecture search.

A new frontier-model lab, Recursive Superintelligence, emerged from stealth on May 13 with $650M raised at a $4.65B valuation — pre-product, pre-revenue, fewer than 30 employees. The thesis is the most concrete bet so far on closing the loop from “AI assists research” to “AI does research.”
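A quick way to read the headline numbers is the stake the round implies. The reporting does not say whether $4.65B is pre-money or post-money, so the sketch below computes both; the function name and the two-case framing are illustrative assumptions, not disclosed terms.

```python
def implied_stake(raise_usd: float, valuation_usd: float, post_money: bool) -> float:
    """Fraction of the company sold in the round.

    If the quoted valuation is post-money, it already includes the new cash.
    If it is pre-money, the new cash is added to get the post-money value.
    """
    post = valuation_usd if post_money else valuation_usd + raise_usd
    return raise_usd / post


RAISE_USD = 650e6        # round size from the article
VALUATION_USD = 4.65e9   # valuation from the article (pre/post not stated)

stake_if_post = implied_stake(RAISE_USD, VALUATION_USD, post_money=True)
stake_if_pre = implied_stake(RAISE_USD, VALUATION_USD, post_money=False)

print(f"if post-money: {stake_if_post:.1%}")  # 14.0%
print(f"if pre-money:  {stake_if_pre:.1%}")   # 12.3%
```

Either way, new investors would hold roughly an eighth of the company under these assumptions.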

The cap table is the story

Lead investors	Strategic participants
GV (Alphabet)	NVIDIA
Greycroft	AMD

Two rival chip vendors in the same Series A is unusual. The usual pattern ties a lab to a cloud backer (Microsoft with OpenAI, Google and Amazon with Anthropic), because the cloud underwrites both the compute commitment and the equity in one deal. Here, NVIDIA and AMD are both buying optionality on a workload that will run on whichever silicon delivers FLOPs first.

The founding team

Founder	Prior role
Richard Socher	ex-Chief Scientist, Salesforce
Yuandong Tian	ex-Director, Meta FAIR
Tim Rocktäschel	ex-DeepMind
Jeff Clune	DeepMind, OpenAI
Josh Tobin	ex-OpenAI
Tim Shi	early Cresta, OpenAI

The team is weighted toward open-ended search and meta-learning (Clune, Rocktäschel) rather than the now-standard “post-train a base model” lineage.

What “recursive self-improvement” actually means in practice

The technical bet, distilled from public statements:

Architecture search. Today, picking a new transformer variant takes a small army of researchers running ablations for months. The lab’s claim: a Level-1 system that proposes architectural changes, runs the experiments, and reads its own loss curves to decide what to keep.
Training-recipe optimization. Learning-rate schedule, data mix, curriculum: the training recipe itself becomes the optimization target, not just the weights.
Evaluation generation. The model generates its own benchmark variants to detect overfitting and capability gaps that human-curated benchmarks miss.

If even part of this works at scale, the time-to-next-model collapses from quarters to weeks.
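The propose, run, read-the-loss loop described above can be sketched as a toy. This is plain greedy hill climbing over a made-up configuration space, swapped in for illustration; it is not the lab's method, which has not been published. `Config`, `proxy_loss`, `propose`, and `search` are hypothetical names, and `proxy_loss` is a stand-in formula where a real system would run an actual training job.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    depth: int    # number of layers (architecture)
    width: int    # hidden size (architecture)
    lr: float     # learning rate (training recipe)


def proxy_loss(cfg: Config) -> float:
    """Stand-in for 'train the model and read the loss curve'.

    Hypothetical toy bowl with its minimum at depth=12, width=512, lr=3e-4.
    """
    return (
        ((cfg.depth - 12) / 12) ** 2
        + ((cfg.width - 512) / 512) ** 2
        + ((cfg.lr - 3e-4) / 3e-4) ** 2
    )


def propose(cfg: Config, rng: random.Random) -> Config:
    """Propose a change to one randomly chosen field."""
    name = rng.choice(["depth", "width", "lr"])
    if name == "depth":
        return replace(cfg, depth=max(1, cfg.depth + rng.choice([-2, -1, 1, 2])))
    if name == "width":
        return replace(cfg, width=max(64, cfg.width + rng.choice([-64, 64])))
    return replace(cfg, lr=cfg.lr * rng.choice([0.5, 0.8, 1.25, 2.0]))


def search(start: Config, steps: int, seed: int = 0) -> tuple[Config, list[float]]:
    """Greedy loop: propose, evaluate, keep the change only if loss improves."""
    rng = random.Random(seed)
    best, best_loss = start, proxy_loss(start)
    history = [best_loss]
    for _ in range(steps):
        candidate = propose(best, rng)
        loss = proxy_loss(candidate)      # the "run the experiment" step
        if loss < best_loss:              # the "decide what to keep" step
            best, best_loss = candidate, loss
        history.append(best_loss)
    return best, history


start = Config(depth=4, width=128, lr=1e-3)
best, history = search(start, steps=200)
print(best, round(history[-1], 4))
```

The expensive part in practice is `proxy_loss`: each call is a training run, which is why the approach is compute-hungry and why chip vendors care about it.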

Roadmap

Now → mid-2026: Build the Level-1 autonomous training system. Internal only.
Mid-2026 public launch: Targeting a customer-visible deliverable. Unspecified whether that’s a model API, an architecture-search service, or a research-as-a-service product.
No revenue commitments disclosed. Estimated burn at ~30 people plus frontier-scale compute: likely $5–15M/quarter.
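For scale, here is the runway that burn estimate implies. This is a back-of-envelope sketch assuming the full $650M is available as cash and burn stays flat, neither of which is disclosed. The $100M/quarter stress case is a hypothetical added for contrast, not a reported figure.

```python
RAISE_USD = 650e6                          # round size from the article
BURN_LOW_USD, BURN_HIGH_USD = 5e6, 15e6    # article's quarterly burn estimate


def runway_quarters(cash_usd: float, burn_per_quarter_usd: float) -> float:
    """Quarters of runway at a flat burn rate, ignoring revenue and interest."""
    return cash_usd / burn_per_quarter_usd


longest = runway_quarters(RAISE_USD, BURN_LOW_USD)
shortest = runway_quarters(RAISE_USD, BURN_HIGH_USD)

# Hypothetical stress case (NOT from the article): compute spend pushes
# burn to $100M per quarter.
stressed = runway_quarters(RAISE_USD, 100e6)

print(f"{longest:.1f} quarters at $5M/quarter")     # 130.0
print(f"{shortest:.1f} quarters at $15M/quarter")   # 43.3
print(f"{stressed:.1f} quarters at $100M/quarter")  # 6.5
```

At the estimated burn the raise covers more than a decade, so runway is highly sensitive to what "frontier-scale compute" ends up costing.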

Why this matters now

Three reasons the timing is not random:

Foundation-lab gross margins crossed 70% (Anthropic’s recent disclosure). The headroom to fund speculative architecture work is real for the first time in this cycle.
The compute-overhang argument is breaking. If models keep generalizing on raw compute alone, you don’t need recursive self-improvement. If they’re hitting diminishing returns, recursive search becomes the lever — and the AI-2027-style timelines that depend on RSI become quantitatively plausible.
Investor positioning. A $4.65B mark on a sub-30-person, pre-revenue lab sits in the same bracket that SSI (Sutskever’s lab) hit. The bet is that there’s room for 3–5 such labs, not just one.
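The "diminishing returns" half of the compute argument is easy to see with a toy power law. The sketch assumes loss falls as L(C) = a * C^(-b), a functional form commonly reported in scaling-law work; the constants `a` and `b` and the compute budgets are illustrative assumptions, not fitted values.

```python
def loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Hypothetical power law L(C) = a * C**(-b); a and b are illustrative."""
    return a * compute ** (-b)


# Four successive doublings of a hypothetical compute budget.
budgets = [1e21 * 2 ** k for k in range(5)]
losses = [loss(c) for c in budgets]

# Absolute loss improvement bought by each doubling.
gains = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]

for c, g in zip(budgets[1:], gains):
    print(f"doubling to {c:.0e} FLOPs buys {g:.4f} loss")
```

Each doubling costs twice as much as the last and buys a smaller absolute gain, which is the regime where searching for a better architecture or recipe starts to look cheaper than buying more compute.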

Practitioner note

For builders shipping today, the practical takeaways:

Don’t pre-position your stack for “AI builds the model.” That timeline is multi-year at best, and the loser’s game is rebuilding your application layer for an architecture that won’t ship. Build for Claude / GPT / Gemini families as they exist this quarter.
Do track the architecture-search publications. Papers from the founding team’s prior labs (FAIR’s open-ended-search line, DeepMind’s meta-learning work) are the leading indicators. If a public architecture-search benchmark suddenly jumps 15+ points, that’s the signal that recursion-style methods crossed a threshold.
Watch the GPU-vs-TPU-vs-MI400 narrative. A workload that’s silicon-agnostic by design (which RSI essentially is) erodes the moat of any single accelerator vendor. If Recursive’s product launches and proves the workload portable, the “NVIDIA premium” compresses.

The under-considered angle: the team composition is a leading indicator of the field’s belief about RSI feasibility. When researchers with Clune-Tian-Rocktäschel pedigrees walk away from FAIR / DeepMind / OpenAI to bet a career on it, the field’s median estimate of “is this real?” has quietly moved.