2026-06-18

Physical AI Data Flywheel 2026 — Tesla 6B-Plus Supervised FSD Miles vs Waymo 30M Driverless Miles: The Training Data Benchmark

Tesla has 6B-plus supervised FSD miles from 6M vehicles. Waymo has 30M driverless miles. Scale vs quality is the core Physical AI training data race.

Article 173 in the Physical AI Benchmark Series — Training Data and the Data Flywheel

The most fundamental asymmetry in the Physical AI training race is not compute power, valuation, or regulatory approval. It is data. Tesla has accumulated an estimated 6 billion-plus cumulative supervised Full Self-Driving (FSD) miles from a fleet of approximately 6 million consumer vehicles as of mid-2026. Waymo has accumulated an estimated 25–35 million cumulative fully driverless commercial miles across its robotaxi fleet in Phoenix, San Francisco, Los Angeles, and Austin. Taking roughly 30 million as Waymo’s midpoint, the raw numbers give Tesla a scale advantage of about 200:1. But the data is not equivalent: every one of Waymo’s miles was driven with no human in the vehicle in a position to intervene, while every one of Tesla’s miles had a licensed human driver present, ready to override. This article benchmarks which data advantage is more strategically durable, what each type of data actually trains, and what the flywheel looks like as both fleets grow.

Section 1 — The Two Data Flywheels: Scale vs. Purity

Dimension	Tesla FSD data flywheel	Waymo driverless data flywheel
Total miles (est.)	6 billion-plus cumulative supervised FSD miles (as of mid-2026)	25–35 million cumulative driverless commercial miles (as of mid-2026)
Miles per week (est.)	Tens of millions of supervised miles per week (6M vehicles times average FSD usage)	150,000-plus weekly paid rides times an average 3–5 miles per ride = 450,000–750,000 driverless miles per week
Data generation ratio	Tesla generates est. 50–120x more miles per week than Waymo, based on annual rates of est. 2–3 billion vs. 25–40 million miles	Waymo generates far fewer total miles, but all are fully autonomous
Driver present?	Yes — licensed human driver must supervise FSD at all times; driver can disengage at any moment	No — zero human intervention; vehicle makes every decision autonomously
Data label quality	Supervised miles contain both successful AI decisions AND human overrides (disengagements); the override moments are the most valuable training signal	Fully autonomous miles: every decision the AI made was the actual outcome — no human correction signal, but the AI must be robust enough to handle every scenario without help
What the data trains	FSD: trains the neural network to match human driving behavior and recover when it would have needed intervention; disengagement events auto-label edge cases for retraining	Waymo: trains the system to handle the full distribution of real-world scenarios completely autonomously — including recovery from its own mistakes without human help
Data engine	Tesla’s Data Engine: FSD trips are video-recorded; onboard model auto-labels frames; low-confidence frames are flagged and sent to cloud for human review; auto-labeling scales; human review focuses on hard cases	Waymo’s equivalent: all driverless trips are logged; any scenario requiring remote ops attention is flagged; simulation generates synthetic edge cases
Flywheel compounding	More Tesla vehicles → more FSD miles → more training data → better FSD → higher attach rate → more revenue → more R&D → better FSD	More Waymo driverless rides → more driverless miles → more edge cases discovered → better system → more permits → more cities → more rides
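
The Data Engine row above describes a triage step: the onboard model labels every frame, and only low-confidence frames go to human review. A minimal Python sketch of that idea follows. The `Frame` type, the `triage` function, and the 0.80 threshold are hypothetical names and values chosen for illustration; the article does not describe how Tesla actually defines or scores confidence.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the article does not say how "low-confidence" is defined.
CONFIDENCE_THRESHOLD = 0.80


@dataclass(frozen=True)
class Frame:
    frame_id: int
    auto_label: str    # label proposed by the onboard model
    confidence: float  # model confidence in that label, 0.0 to 1.0


def triage(frames, threshold=CONFIDENCE_THRESHOLD):
    """Split frames into (auto_accepted, needs_human_review)."""
    accepted, review = [], []
    for frame in frames:
        if frame.confidence >= threshold:
            accepted.append(frame)  # auto-label is trusted as-is
        else:
            review.append(frame)    # flagged for human review
    return accepted, review


# Made-up example frames, not real data.
frames = [
    Frame(1, "clear_lane", 0.97),
    Frame(2, "pedestrian_crossing", 0.62),
    Frame(3, "construction_zone", 0.41),
    Frame(4, "clear_lane", 0.91),
]
accepted, review = triage(frames)
print("auto-accepted:", [f.frame_id for f in accepted])
print("human review:", [f.frame_id for f in review])
```

The design point is that human effort scales with the number of hard frames rather than with total miles driven, which is why auto-labeling is described as the part that scales.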

The scale ratio is striking. Tesla is adding an estimated 2–3 billion supervised miles per year; Waymo is adding an estimated 25–40 million driverless miles per year, so Tesla’s mile generation rate exceeds Waymo’s by roughly two orders of magnitude. But the roughly 200:1 ratio in cumulative miles does not translate to a 200:1 advantage in training signal quality, because the two mile types train different things.
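
The mileage arithmetic in this section can be checked with a short Python sketch. It uses only the estimates quoted above; the 52-week year and the 30 million midpoint for Waymo’s cumulative miles are simplifying assumptions, and the variable names are illustrative.

```python
WEEKS_PER_YEAR = 52  # simplifying assumption


def weekly_miles(weekly_rides, miles_per_ride):
    """Weekly driverless miles implied by ride count and average ride length."""
    return weekly_rides * miles_per_ride


# Waymo: 150,000 weekly rides at 3 to 5 miles per ride (article estimates).
waymo_week_low = weekly_miles(150_000, 3)
waymo_week_high = weekly_miles(150_000, 5)

# Annualized, for comparison with the 25-40 million per year estimate.
waymo_year_low = waymo_week_low * WEEKS_PER_YEAR
waymo_year_high = waymo_week_high * WEEKS_PER_YEAR

# Cumulative ratio: 6 billion Tesla miles vs. an assumed 30 million Waymo midpoint.
cumulative_ratio = 6_000_000_000 / 30_000_000

# Rate ratio from the annual estimates: 2-3 billion vs. 25-40 million.
rate_ratio_low = 2_000_000_000 / 40_000_000
rate_ratio_high = 3_000_000_000 / 25_000_000

print(f"Waymo weekly miles: {waymo_week_low:,} to {waymo_week_high:,}")
print(f"Waymo annualized:   {waymo_year_low:,} to {waymo_year_high:,}")
print(f"Cumulative ratio:   {cumulative_ratio:.0f}:1")
print(f"Annual rate ratio:  {rate_ratio_low:.0f}x to {rate_ratio_high:.0f}x")
```

On these inputs the cumulative ratio is 200:1, while the annual rate ratio falls between 50x and 120x. Both are consistent with "roughly two orders of magnitude," but they are different quantities and should not be used interchangeably.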

Section 2 — What Supervised vs. Driverless Miles Actually Train Differently

Training dimension	What Tesla supervised miles train	What Waymo driverless miles train	Strategic implication
Normal driving	Human-matched driving behavior: lane-keeping, speed, following distance, intersection handling — all calibrated to how humans drive	Waymo’s own policy for normal driving: developed independently of human behavior norms; optimized for passenger comfort, safety, and efficiency	Both are high-quality for normal driving; Tesla’s is more human-like, Waymo’s is potentially more optimal
Edge case discovery	Human disengagement = auto-labeled edge case; 6B miles generates millions of edge-case examples at scale; edge cases are rare per mile but abundant at fleet scale	Driverless operation: Waymo encounters edge cases in the wild without a human backstop; the system must handle them; each handled edge case is a proof of robustness; remote ops attention = edge case flag	Tesla discovers more total edge cases (scale); Waymo’s edge cases are more valuable individually (proved handleable without human)
Recovery from mistakes	Human catches mistakes before they become incidents; the AI never learns to recover from deep failure states without human help	Waymo must recover from its own mistakes autonomously; this trains recovery behavior that supervised systems do not learn	Waymo’s recovery training is a structural advantage for fully driverless operation; supervised systems have a gap here
Rare weather and conditions	Tesla fleet in all 50 states plus international: snow, ice, fog, heavy rain, construction zones — broad environmental coverage at scale	Waymo fleet in 4 Sun Belt and mild-climate cities (Phoenix, SF, LA, Austin) — limited weather diversity vs. Tesla’s global fleet	Tesla has a significant weather diversity advantage; Waymo’s Phoenix and SF data is high-quality but narrow geographically
Night and low-light	Fleet of 6M vehicles drives at all hours, including nights; massive night driving dataset	150,000-plus weekly rides, including 24/7 SF operation; but the smaller fleet means less absolute night data	Tesla has more absolute night miles; Waymo’s night data from SF’s 24/7 operation is high-quality urban night data
Novel scenarios	At 6B miles, Tesla has encountered millions of unusual scenarios; Data Engine surfaces the ones the model got wrong	At 30M driverless miles, Waymo encounters novel scenarios that must be handled without human help, a higher bar for what counts as “handled”	Different but complementary; there is no clear winner on novel scenario training
Long tail coverage	Tesla: 6B miles generates enormous long-tail coverage at the cost of data quality per mile (supervised)	Waymo: 30M driverless miles generates deep long-tail coverage at much higher per-mile quality	Waymo’s long-tail coverage is proven (vehicle handled it); Tesla’s is observed but with human safety net

The critical distinction is the meaning of “handled.” In Tesla’s supervised data, an edge case “handled” means a human driver was present and did not need to intervene — or did intervene, generating a disengagement. In Waymo’s driverless data, an edge case “handled” means the autonomous system navigated it to completion without any human involvement. These are different proofs of capability.
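
One way to make the two meanings of “handled” concrete is to treat each logged edge case as one of four outcomes. The Python sketch below is an illustrative simplification, not either company’s logging schema: the `Outcome` names and the `classify` function are hypothetical, and it assumes that remote ops attention is the only form of human involvement on a driverless trip.

```python
from enum import Enum


class Outcome(Enum):
    SUPERVISED_NO_INTERVENTION = "handled with a human ready to override"
    SUPERVISED_DISENGAGEMENT = "human took over (correction signal)"
    DRIVERLESS_COMPLETED = "handled with no human involvement"
    DRIVERLESS_REMOTE_OPS = "flagged for remote ops attention"


def classify(driver_present: bool, human_involved: bool) -> Outcome:
    """Map an edge-case event to the kind of evidence it provides."""
    if driver_present:
        if human_involved:
            return Outcome.SUPERVISED_DISENGAGEMENT
        return Outcome.SUPERVISED_NO_INTERVENTION
    if human_involved:
        return Outcome.DRIVERLESS_REMOTE_OPS
    return Outcome.DRIVERLESS_COMPLETED


# The same "no human acted" event is different evidence in each fleet.
print(classify(driver_present=True, human_involved=False).value)
print(classify(driver_present=False, human_involved=False).value)
```

Only `DRIVERLESS_COMPLETED` demonstrates that the system finished the scenario without a safety net; `SUPERVISED_NO_INTERVENTION` shows only that a supervising human chose not to act.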

Section 3 — The Disengagement Data: Tesla’s Most Valuable Signal

Disengagement dimension	Detail	Notes
What a disengagement is	A human FSD user takes control of the vehicle — either voluntarily (does not like what FSD is doing) or because FSD initiated a handoff	Each disengagement is an auto-labeled training example: “at this moment, in this context, the human decided FSD was not handling the situation correctly”
Disengagement rate trend (est.)	FSD v12/v13 critical disengagement rate: est. 0.03–0.05 per 1,000 miles, or roughly 1 critical disengagement every 20,000–33,000 miles of FSD operation	Every time the rate improves, there are fewer training examples per mile; but at 6B total miles, the absolute count is still massive
How Tesla uses disengagement data	Disengagement frames are auto-labeled and fed back to training pipeline; a single disengagement event generates hundreds of frames of “FSD was about to do X, human corrected” training signal	This creates a closed feedback loop that is unique to supervised driving: human behavior is the training signal
Waymo equivalent	Waymo has no disengagement data (there is no human to disengage); instead, remote ops contact rates serve as a proxy for “situation the system found challenging”	Remote ops contacts are rarer than Tesla disengagements and represent higher-severity situations
Why disengagement data matters for FSD	Tesla is training an AI to drive like a human; human corrections tell the AI exactly when and how it deviated from human judgment; this feedback loop has driven FSD from 1 disengagement per 500 miles (v10 est.) to est. 1 per 20,000-plus miles (v13 est.)	The improvement trajectory suggests the feedback loop is working extremely well; FSD’s rapid improvement curve is partly explained by the quality of the disengagement training signal
The supervised-to-driverless gap	Training on supervised miles optimizes for “human would not intervene”; the transition to driverless requires “system never needs human intervention” — these are subtly but importantly different optimization targets	Tesla’s FSD trained on supervised data must transfer to driverless operation; this is non-trivial and is the central challenge of Tesla’s autonomous transition

The disengagement feedback loop is Tesla’s most distinctive training asset. On the estimates used here, no other company has a fleet of roughly 6 million consumer vehicles generating human corrections to AI driving decisions at this scale. Each correction is a labeled training example that Waymo cannot collect, because Waymo has no human supervisors in its vehicles to generate the correction signal.
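
The disengagement figures in the table above are related by simple reciprocals, which the following Python sketch works through. The inputs are the article’s estimates; the function names are illustrative.

```python
def miles_per_disengagement(rate_per_1000_miles: float) -> float:
    """Convert a rate per 1,000 miles into average miles between events."""
    return 1000.0 / rate_per_1000_miles


def improvement_factor(old_miles_per_event: float, new_miles_per_event: float) -> float:
    """How many times farther the system drives between events."""
    return new_miles_per_event / old_miles_per_event


# Estimated critical disengagement rate: 0.03 to 0.05 per 1,000 miles.
worst = miles_per_disengagement(0.05)
best = miles_per_disengagement(0.03)

# Estimated v10 (1 per 500 miles) vs. v13 (1 per 20,000 miles).
factor = improvement_factor(500, 20_000)

print(f"{worst:,.0f} to {best:,.0f} miles per critical disengagement")
print(f"{factor:.0f}x improvement from v10 to v13 (est.)")
```

A lower rate per 1,000 miles means more miles between events, so the 0.05 figure corresponds to the 20,000-mile end of the range and the 0.03 figure to the 33,000-mile end.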

Section 4 — Fleet Scale vs. Fleet Quality: Which Flywheel Wins?

Scenario	Tesla advantage	Waymo advantage	Who wins in this scenario
Improving normal driving performance	Enormous dataset; fast iteration cycles; billions of miles to learn from	High-quality autonomous data; each mile proves robustness	Both strong; Tesla’s scale advantage is decisive for rare-but-real scenarios
Achieving driverless operation	Must bridge the supervised-to-driverless gap; large fleet but data optimized for supervised	Already operating driverless; data directly reflects driverless performance requirements	Waymo: driverless training data is more directly relevant to driverless deployment
Expanding to new geographies	6M vehicles already in new geographies: snow, ice, international roads	Must physically deploy fleet to new city; cannot pre-train on data it does not have	Tesla: massive geographic coverage advantage for new environment pre-training
Weather robustness	Snow, ice, fog, rain data at scale (US plus international fleet)	Sun Belt focus; limited severe weather data	Tesla: decisive weather diversity advantage
Edge case recovery without human	Supervised data: edge cases observed but human prevents worst outcomes	Driverless data: edge cases handled completely autonomously	Waymo: recovery training without human intervention is uniquely driverless
Training efficiency	FSD v12: end-to-end model; billions of miles train one neural network	Waymo: modular systems plus end-to-end components; smaller dataset but higher per-mile information density	Roughly equal; different architectures extract different value from their respective datasets
Overall flywheel verdict	Tesla wins on scale, speed, and geographic coverage	Waymo wins on data quality for driverless-specific capabilities	The winner of the long-term flywheel depends on whether the supervised-to-driverless gap can be bridged by scale (Tesla’s bet) or whether driverless-specific training data is irreplaceable (Waymo’s structural position)

The central strategic question is whether Tesla’s approach — training on supervised miles at massive scale and using end-to-end neural networks to generalize — can produce driverless-quality driving behavior without ever collecting driverless-specific training data. FSD v12 and v13’s improvement trajectory suggests the approach is working faster than many expected. But the experiment is not finished.

Section 5 — Data Flywheel Ramp Index: Key Metrics to Track

KPI	Tesla Q2 2026	Waymo Q2 2026	H2 2026 trajectory
Cumulative miles (est.)	6B-plus supervised	25–35M driverless	Tesla: adding est. 2–3B miles per year; Waymo: adding est. 25–40M miles per year
Weekly mile generation (est.)	Tens of millions per week	450,000–750,000 driverless miles per week	Tesla grows with FSD attach rate; Waymo grows with fleet size
Disengagement rate (FSD est.)	0.03–0.05 per 1,000 miles	N/A (no supervised driving)	Key metric: rate continues declining → FSD approaching driverless threshold
Driverless miles per week (est.)	0 driverless miles (Austin supervised only)	450,000–750,000 driverless miles per week	Tesla: first driverless miles in Austin pending FMVSS; Waymo: grows with Gen 6 fleet ramp
Model improvement rate	FSD v10 → v13: disengagement rate improved est. 40x in roughly 3 years	Waymo: disengagement rate not applicable (no in-vehicle supervisor); operational capability improvements tracked via new city launches and edge case handling	Both showing rapid improvement; different metrics
Data advantage durability	Tesla’s scale advantage grows with every new FSD vehicle sold; durable as long as FSD attach rates hold	Waymo’s driverless quality advantage durable as long as Tesla cannot bridge the supervised-to-driverless gap without driverless-specific training data	The key question for both: can Tesla’s scale bridge the quality gap, or does driverless require driverless data? FSD v13’s trajectory suggests scale is winning — but the experiment is not over
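
Two of the table’s figures can be turned into per-period numbers with a short Python sketch. It assumes the estimated 40x improvement compounded at a constant rate over 3 years and that annual mileage is spread evenly across the year; both are simplifications for illustration, and the names are hypothetical.

```python
IMPROVEMENT = 40.0  # est. v10 -> v13 improvement in disengagement rate
YEARS = 3.0         # est. time span

# Constant annual factor that compounds to 40x over 3 years.
annual_factor = IMPROVEMENT ** (1.0 / YEARS)


def half_year_miles(annual_low: float, annual_high: float):
    """Miles added in half a year, assuming an even pace through the year."""
    return annual_low / 2, annual_high / 2


# Annual estimates from the table: Tesla 2-3B, Waymo 25-40M.
tesla_h2 = half_year_miles(2e9, 3e9)
waymo_h2 = half_year_miles(25e6, 40e6)

print(f"Implied annual improvement: about {annual_factor:.2f}x per year")
print(f"Tesla H2 2026 miles added: {tesla_h2[0]:,.0f} to {tesla_h2[1]:,.0f}")
print(f"Waymo H2 2026 miles added: {waymo_h2[0]:,.0f} to {waymo_h2[1]:,.0f}")
```

Under these assumptions the disengagement rate improved by a factor of about 3.4 per year, and the second half of 2026 would add roughly 1.0–1.5 billion supervised miles for Tesla and 12.5–20 million driverless miles for Waymo.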

Section 6 — About This Series

This is article 173 in the Physical AI Benchmark Series. Previous articles have covered the ramp index, the humanoid race, unit economics, global competition, HD mapping, fleet operations, software and OTA, insurance and liability, consumer demand, competitive moats, Cybercab versus Model Y, safety data, Waymo Gen 6, Optimus manufacturing, scorecard snapshots, the 2030 forecast scenarios, the investor framework, Waymo’s city expansion pipeline, Tesla’s state approval map, AV weather and climate constraints, the talent war, the regulatory calendar, robotaxi fare pricing, the AV data flywheel comparison, the humanoid deployment tracker, the supply chain analysis, the consumer adoption demand index, Waymo’s standalone valuation and IPO analysis, the Tesla Dojo versus cloud compute build-vs-buy analysis, the Waymo-Uber partnership analysis, the Waymo Gen 6 vehicle deep dive, the Waymo driver software architecture, Tesla Optimus factory ramp, Tesla FSD timeline history, and Waymo’s city expansion playbook.

This article adds the data flywheel dimension: the 200:1 scale ratio between Tesla’s supervised miles and Waymo’s driverless miles, what each mile type actually trains, and why the supervised-to-driverless gap is the most important open question in the Physical AI data race.

Reminder: All mileage figures, disengagement rates, fleet sizes, and projections in this article are estimates based on publicly available information, company disclosures, and industry analysis. They are labeled “(est.)” throughout. They are not investment recommendations. Conduct your own due diligence and consult a licensed financial adviser before making any investment decisions.