2026-05-27 — views

An OpenAI reasoning model disproved an 80-year-old Erdős conjecture — and it wasn't a math-specific model

Why read this: the headline is "AI does math." The real signal is that the result came from a general-purpose reasoning model, not a math-specific system, and that it disproved an 80-year-old belief by constructing a counterexample no human had found. One result is not a revolution, but the generality is the story.

OpenAI says an internal reasoning model autonomously disproved the Erdős 1946 unit-distance conjecture (May 20) — a first for AI, verified by mathematicians.

On May 20, 2026, OpenAI announced that an internal reasoning model had autonomously disproved a central conjecture in discrete geometry: the unit-distance problem, first posed by Paul Erdős in 1946. For nearly eight decades, the prevailing belief among mathematicians was that the conjectured answer was essentially correct. The model showed it wasn’t.

The problem, in one paragraph

Place n points in a plane. How many pairs can be exactly distance 1 apart? Erdős asked this in 1946, and for decades the working belief was that square-grid-like arrangements were essentially optimal, capping the count at n^(1+o(1)), only slightly faster than linear in n. The OpenAI model disproved that belief by constructing an infinite family of point configurations that beat the grid, achieving on the order of n^(1+δ) unit-distance pairs for a fixed δ > 0. The construction leaned on deep machinery from algebraic number theory (Golod–Shafarevich theory and infinite class field towers). Will Sawin, a Princeton mathematician, subsequently refined the exponent to δ ≈ 0.014.
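To make the counting question concrete, here is a minimal Python sketch that counts pairs at a fixed distance in a small square grid. It illustrates only the classical grid idea, not the model's construction. The function names are illustrative, and the code works with squared integer distances so that every comparison is exact. Dividing all coordinates by the square root of the chosen squared distance would turn the counted pairs into unit-distance pairs.

```python
from itertools import combinations


def count_pairs_at_sq_distance(points, d2):
    """Count unordered pairs of integer points whose squared distance equals d2."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


def square_grid(k):
    """Return the k*k integer points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


if __name__ == "__main__":
    grid = square_grid(20)  # n = 400 points
    # 25 and 65 can each be written as a sum of two squares in several ways,
    # so more difference vectors in the grid have that squared length.
    for d2 in (1, 25, 65):
        print(d2, count_pairs_at_sq_distance(grid, d2))
```

On the 20-by-20 grid, nearest neighbors (squared distance 1) give 760 pairs, while squared distance 25 gives 1688 and squared distance 65 gives 1744. Picking a distance that the grid realizes in many directions is what pushes a rescaled grid slightly above linear growth, and it is that grid benchmark the new construction is reported to beat.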

Why this one is different

AI has “done math” before — but almost always through systems built for math: theorem provers, models fine-tuned on proof corpora, or pipelines scaffolded to search a specific problem. OpenAI’s claim here is sharper on two axes:

General-purpose, not specialized. The result came from a new general-purpose reasoning model, not a system trained specifically for mathematics or pointed at the unit-distance problem in particular.
A counterexample, not a checked proof. Disproving a long-standing conjecture means finding a construction no human had found — a creative, generative act, not the verification of a known path. That is the harder, more “research-like” half of mathematics.

External mathematicians examined and built on the result, with Sawin refining it and Gil Kalai offering commentary, so the disproof was independently scrutinized, not taken on OpenAI’s word.

Why it matters

The significance isn’t that one conjecture fell. It’s the transfer: a model trained to reason broadly produced a genuinely novel mathematical object in a domain it wasn’t built for. If general reasoning capability now spills into frontier research math, the same capability plausibly spills into algorithm design, materials, and protocol verification — anywhere the bottleneck is searching a vast space of constructions for one that works.

Practitioner note

Read the framing carefully before extrapolating. “Autonomously disproved” is a strong claim, and the honest scope is: one problem, one construction, verified by humans after the fact. The durable takeaway for builders isn’t “models can do math now” — it’s that the gap between checking a solution and generating a non-obvious one is closing in at least one hard domain. If your workflow has a step where the answer is expensive to find but cheap to verify, that is exactly the shape where this class of model is starting to earn its keep. Build the verifier; let the model search.
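The "build the verifier; let the model search" shape can be sketched in a few lines of Python. This is a toy under stated assumptions, not OpenAI's pipeline: verify, search, and random_proposer are hypothetical names, and the random proposer merely stands in for whatever model or heuristic generates candidates. The point is the division of labor: the verifier is small, exact, and trusted, while the proposer can be anything.

```python
import random
from itertools import combinations
from typing import Callable, Optional

Point = tuple[int, int]


def verify(points: list[Point], d2: int, target: int) -> bool:
    """Cheap, exact check: points are distinct and at least `target`
    unordered pairs sit at squared distance d2."""
    if len(set(points)) != len(points):
        return False
    hits = sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )
    return hits >= target


def search(
    propose: Callable[[], list[Point]], d2: int, target: int, budget: int
) -> Optional[list[Point]]:
    """Ask the proposer for up to `budget` candidates and return the first
    one the verifier accepts, or None if none passes."""
    for _ in range(budget):
        candidate = propose()
        if verify(candidate, d2, target):
            return candidate
    return None


def random_proposer(n: int, side: int, rng: random.Random) -> Callable[[], list[Point]]:
    """Hypothetical stand-in for a model: samples n distinct cells of a side*side grid."""
    cells = [(x, y) for x in range(side) for y in range(side)]
    return lambda: rng.sample(cells, n)


if __name__ == "__main__":
    rng = random.Random(0)
    found = search(random_proposer(6, 4, rng), d2=1, target=5, budget=2000)
    print(found)  # a passing configuration, or None if the budget ran out
```

Because only verified candidates are ever returned, the quality of the proposer affects how long the search takes, not whether the output can be trusted.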

The under-considered angle

The quiet question this raises is about taste. Generating a valid counterexample required choosing which exotic machinery (class field towers) to reach for among countless dead ends — the kind of judgment mathematicians call research taste, long assumed to be irreducibly human. A single result doesn’t prove a model has it. But it does move the debate from “can AI follow a proof?” to “can AI decide what to try?” — and that is the more consequential question for whether these systems become collaborators or just very fast calculators.