2026-06-18
Physical AI Data Flywheel 2026 — Tesla 6B-Plus Supervised FSD Miles vs Waymo 30M Driverless Miles: The Training Data Benchmark
Tesla has 6B-plus supervised FSD miles from 6M vehicles. Waymo has 30M driverless miles. Scale vs quality is the core Physical AI training data race.
Article 173 in the Physical AI Benchmark Series — Training Data and the Data Flywheel
The most fundamental asymmetry in the Physical AI training race is not compute power, valuation, or regulatory approval. It is data. Tesla has accumulated an estimated 6 billion-plus cumulative supervised Full Self-Driving (FSD) miles from a fleet of approximately 6 million consumer vehicles as of mid-2026 (est.). Waymo has accumulated approximately 25–35 million cumulative fully driverless commercial miles (est.) across its robotaxi fleet in Phoenix, San Francisco, Los Angeles, and Austin. At the 30-million midpoint of Waymo’s range, the raw numbers give Tesla a scale advantage of roughly 200:1. But the data is not equivalent: every one of Waymo’s miles was driven with no human driver in the vehicle to intervene, while every one of Tesla’s miles had a licensed human driver present, ready to override. This article benchmarks which data advantage is more strategically durable, what each type of data actually trains, and what the flywheel looks like as both fleets grow.
Section 1 — The Two Data Flywheels: Scale vs. Purity
| Dimension | Tesla FSD data flywheel | Waymo driverless data flywheel |
|---|---|---|
| Total miles (est.) | 6 billion-plus cumulative supervised FSD miles (est. as of mid-2026) | 25–35 million cumulative driverless commercial miles (est. as of mid-2026) |
| Miles per week (est.) | est. tens of millions of supervised miles per week (6M vehicles times average FSD usage) | est. 150,000-plus weekly paid rides times average 3–5 miles per ride = est. 450,000–750,000 driverless miles per week |
| Data generation ratio | Tesla generates roughly 50–130x more miles per week than Waymo (est., implied by the weekly and annual figures in this article) | Waymo generates far fewer total miles but all are fully autonomous |
| Driver present? | Yes — licensed human driver must supervise FSD at all times; driver can disengage at any moment | No — zero human intervention; vehicle makes every decision autonomously |
| Data label quality | Supervised miles contain both successful AI decisions AND human overrides (disengagements); the override moments are the most valuable training signal | Fully autonomous miles: every decision the AI made was the actual outcome — no human correction signal, but the AI must be robust enough to handle every scenario without help |
| What the data trains | FSD: trains the neural network to match human driving behavior and recover when it would have needed intervention; disengagement events auto-label edge cases for retraining | Waymo: trains the system to handle the full distribution of real-world scenarios completely autonomously — including recovery from its own mistakes without human help |
| Data engine | Tesla’s Data Engine: FSD trips are video-recorded; an onboard model auto-labels frames; low-confidence frames are flagged and sent to the cloud for human review; auto-labeling handles the bulk of the data while human review focuses on hard cases | Waymo’s equivalent: all driverless trips are logged; any scenario requiring remote ops attention is flagged; simulation generates synthetic edge cases |
| Flywheel compounding | More Tesla vehicles → more FSD miles → more training data → better FSD → higher attach rate → more revenue → more R&D → better FSD | More Waymo driverless rides → more driverless miles → more edge cases discovered → better system → more permits → more cities → more rides |
The scale ratio is striking. Tesla is adding an estimated 2–3 billion supervised miles per year; Waymo is adding an estimated 25–40 million driverless miles per year. On those estimates, Tesla’s raw mile generation rate exceeds Waymo’s by a factor of roughly 50 to 120, close to two orders of magnitude. But the 200:1 ratio in cumulative miles does not translate to a 200:1 advantage in training signal quality, because the two mile types train for different things.
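The ratios in this section follow from simple arithmetic on the article’s own estimates. The sketch below reproduces that arithmetic in Python. Every input is an estimate taken from the table and paragraph above, and the function and constant names are illustrative choices, not part of any Tesla or Waymo tooling.

```python
# Back-of-envelope check of the scale ratios. All inputs are the
# article's own estimates (est.), not measured or official values.

TESLA_CUMULATIVE = (6e9, 6e9)          # supervised FSD miles, mid-2026 (est.)
WAYMO_CUMULATIVE = (25e6, 35e6)        # driverless miles, low/high (est.)
TESLA_PER_YEAR = (2e9, 3e9)            # supervised miles added per year (est.)
WAYMO_PER_YEAR = (25e6, 40e6)          # driverless miles added per year (est.)
WAYMO_WEEKLY_RIDES = 150_000           # paid rides per week (est.)
WAYMO_MILES_PER_RIDE = (3, 5)          # average ride length in miles (est.)


def ratio_range(numerator, denominator):
    """Return (low, high) bounds of numerator / denominator.

    Both arguments are (low, high) tuples. The lowest ratio pairs the
    low numerator with the high denominator, and vice versa.
    """
    return numerator[0] / denominator[1], numerator[1] / denominator[0]


def waymo_weekly_miles():
    """Weekly driverless miles = weekly rides x average miles per ride."""
    low, high = WAYMO_MILES_PER_RIDE
    return WAYMO_WEEKLY_RIDES * low, WAYMO_WEEKLY_RIDES * high


if __name__ == "__main__":
    print("Waymo weekly miles:", waymo_weekly_miles())
    print("Cumulative ratio:", ratio_range(TESLA_CUMULATIVE, WAYMO_CUMULATIVE))
    print("Annual-rate ratio:", ratio_range(TESLA_PER_YEAR, WAYMO_PER_YEAR))
```

With these inputs the cumulative ratio spans roughly 171:1 to 240:1 (200:1 at a 30-million-mile midpoint), and the annual-rate ratio spans 50:1 to 120:1. Carrying the low and high ends through separately keeps the uncertainty in the estimates visible.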
Section 2 — What Supervised vs. Driverless Miles Actually Train Differently
| Training dimension | What Tesla supervised miles train | What Waymo driverless miles train | Strategic implication |
|---|---|---|---|
| Normal driving | Human-matched driving behavior: lane-keeping, speed, following distance, intersection handling — all calibrated to how humans drive | Waymo’s own policy for normal driving: developed independently of human behavior norms; optimized for passenger comfort, safety, and efficiency | Both are high-quality for normal driving; Tesla’s is more human-like, Waymo’s is potentially more optimal |
| Edge case discovery | Human disengagement = auto-labeled edge case; 6B miles generate millions of edge-case examples at scale; edge cases are rare per mile but abundant at fleet scale | Driverless operation: Waymo encounters edge cases in the wild without a human backstop; the system must handle them; each handled edge case is a proof of robustness; remote ops attention = edge case flag | Tesla discovers more total edge cases (scale); Waymo’s edge cases are more valuable individually (proven handleable without a human) |
| Recovery from mistakes | Human catches mistakes before they become incidents; the AI never learns to recover from deep failure states without human help | Waymo must recover from its own mistakes autonomously; this trains recovery behavior that supervised systems do not learn | Waymo’s recovery training is a structural advantage for fully driverless operation; supervised systems have a gap here |
| Rare weather and conditions | Tesla fleet in all 50 states plus international: snow, ice, fog, heavy rain, construction zones — broad environmental coverage at scale | Waymo fleet in 4 Sun Belt and mild-climate cities (Phoenix, SF, LA, Austin) — limited weather diversity vs. Tesla’s global fleet | Tesla has a significant weather diversity advantage; Waymo’s Phoenix and SF data is high-quality but narrow geographically |
| Night and low-light | 6M vehicles operate 24/7 including nights; massive night driving dataset | 150,000-plus weekly rides including 24/7 SF operation; but smaller fleet means less absolute night data | Tesla has more absolute night miles; Waymo’s night data from SF’s 24/7 operation is high-quality urban night data |
| Novel scenarios | At 6B miles, Tesla has encountered millions of unusual scenarios; Data Engine surfaces the ones the model got wrong | At 30M driverless miles, Waymo encounters novel scenarios that must be handled without human help — higher bar for what counts as “handled” | Different but complementary; neither has a clear winner on novel scenario training |
| Long tail coverage | Tesla: 6B miles generate enormous long-tail coverage at the cost of data quality per mile (supervised) | Waymo: 30M driverless miles generate deep long-tail coverage at much higher per-mile quality | Waymo’s long-tail coverage is proven (vehicle handled it); Tesla’s is observed but with a human safety net |
The critical distinction is the meaning of “handled.” In Tesla’s supervised data, an edge case “handled” means a human driver was present and did not need to intervene — or did intervene, generating a disengagement. In Waymo’s driverless data, an edge case “handled” means the autonomous system navigated it to completion without any human involvement. These are different proofs of capability.
Section 3 — The Disengagement Data: Tesla’s Most Valuable Signal
| Disengagement dimension | Detail | Notes |
|---|---|---|
| What a disengagement is | A human FSD user takes control of the vehicle — either voluntarily (does not like what FSD is doing) or because FSD initiated a handoff | Each disengagement is an auto-labeled training example: “at this moment, in this context, the human decided FSD was not handling the situation correctly” |
| Disengagement rate trend (est.) | FSD v12/v13 critical disengagement rate: est. 0.03–0.05 per 1,000 miles, i.e. about one critical disengagement every 20,000–33,000 miles of FSD operation (est.) | Every time the rate improves, there are fewer training examples per mile, but at 6B total miles the absolute count is still massive |
| How Tesla uses disengagement data | Disengagement frames are auto-labeled and fed back to training pipeline; a single disengagement event generates hundreds of frames of “FSD was about to do X, human corrected” training signal | This creates a closed feedback loop that is unique to supervised driving: human behavior is the training signal |
| Waymo equivalent | Waymo has no disengagement data (there is no human to disengage); instead, remote ops contact rates serve as a proxy for “situation the system found challenging” | Remote ops contacts are rarer than Tesla disengagements and represent higher-severity situations |
| Why disengagement data matters for FSD | Tesla is training an AI to drive like a human; human corrections tell the AI exactly when and how it deviated from human judgment; this feedback loop has driven FSD from 1 disengagement per 500 miles (v10 est.) to est. 1 per 20,000-plus miles (v13 est.) | The improvement trajectory suggests the feedback loop is working extremely well; FSD’s rapid improvement curve is partly explained by the quality of the disengagement training signal |
| The supervised-to-driverless gap | Training on supervised miles optimizes for “human would not intervene”; the transition to driverless requires “system never needs human intervention” — these are subtly but importantly different optimization targets | Tesla’s FSD trained on supervised data must transfer to driverless operation; this is non-trivial and is the central challenge of Tesla’s autonomous transition |
The disengagement feedback loop is Tesla’s most structurally unique training asset. No other company in the world has 6 million consumers generating real-time human corrections to AI driving decisions at scale. Each correction is a labeled training example that Waymo cannot collect — because Waymo has no human supervisors in its vehicles to generate the correction signal.
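The disengagement figures in the table above are linked by a few unit conversions, sketched here in Python. The rates are the article’s estimates. The yearly event count is a loud assumption: it applies the current estimated rate uniformly to a year of estimated mileage, which the article does not claim. All function names are illustrative.

```python
# Unit conversions behind the disengagement figures. Inputs are the
# article's estimates (est.); function names are illustrative only.

def miles_per_disengagement(rate_per_1000_miles):
    """Convert a rate per 1,000 miles into average miles between events."""
    return 1000.0 / rate_per_1000_miles


def improvement_factor(old_miles_per_event, new_miles_per_event):
    """How many times farther the system drives between events."""
    return new_miles_per_event / old_miles_per_event


def expected_disengagements(miles, rate_per_1000_miles):
    """Expected event count over `miles`.

    ASSUMPTION: the rate is constant across all of those miles. Real
    rates vary by software version, road type, and driver behavior.
    """
    return miles / 1000.0 * rate_per_1000_miles


if __name__ == "__main__":
    # 0.03-0.05 per 1,000 miles -> about 33,000 down to 20,000 miles per event
    print(miles_per_disengagement(0.03), miles_per_disengagement(0.05))
    # v10 (est. 1 per 500 miles) -> v13 (est. 1 per 20,000 miles)
    print(improvement_factor(500, 20_000))
    # Hypothetical yearly event count at 2-3B miles per year
    print(expected_disengagements(2e9, 0.03), expected_disengagements(3e9, 0.05))
```

Under that constant-rate assumption, a year of 2–3 billion supervised miles would still yield on the order of 60,000 to 150,000 critical disengagements, which illustrates why a falling per-mile rate does not exhaust the training signal at fleet scale.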
Section 4 — Fleet Scale vs. Fleet Quality: Which Flywheel Wins?
| Scenario | Tesla advantage | Waymo advantage | Who wins in this scenario |
|---|---|---|---|
| Improving normal driving performance | Enormous dataset; fast iteration cycles; billions of miles to learn from | High-quality autonomous data; each mile proves robustness | Both strong; Tesla’s scale is the deciding edge for rare-but-real scenarios |
| Achieving driverless operation | Must bridge the supervised-to-driverless gap; large fleet but data optimized for supervised | Already operating driverless; data directly reflects driverless performance requirements | Waymo: driverless training data is more directly relevant to driverless deployment |
| Expanding to new geographies | 6M vehicles already in new geographies: snow, ice, international roads | Must physically deploy fleet to new city; cannot pre-train on data it does not have | Tesla: massive geographic coverage advantage for new environment pre-training |
| Weather robustness | Snow, ice, fog, rain data at scale (US plus international fleet) | Sun Belt focus; limited severe weather data | Tesla: decisive weather diversity advantage |
| Edge case recovery without human | Supervised data: edge cases observed but human prevents worst outcomes | Driverless data: edge cases handled completely autonomously | Waymo: recovery training without human intervention is uniquely driverless |
| Training efficiency | FSD v12: end-to-end model; billions of miles train one neural network | Waymo: modular systems plus end-to-end components; smaller dataset but higher per-mile information density | Roughly equal; different architectures extract different value from their respective datasets |
| Overall flywheel verdict | Tesla wins on scale, speed, and geographic coverage | Waymo wins on data quality for driverless-specific capabilities | The winner of the long-term flywheel depends on whether the supervised-to-driverless gap can be bridged by scale (Tesla’s bet) or whether driverless-specific training data is irreplaceable (Waymo’s structural position) |
The central strategic question is whether Tesla’s approach — training on supervised miles at massive scale and using end-to-end neural networks to generalize — can produce driverless-quality driving behavior without ever collecting driverless-specific training data. FSD v12 and v13’s improvement trajectory suggests the approach is working faster than many expected. But the experiment is not finished.
Section 5 — Data Flywheel Ramp Index: Key Metrics to Track
| KPI | Tesla Q2 2026 | Waymo Q2 2026 | H2 2026 trajectory |
|---|---|---|---|
| Cumulative miles (est.) | 6B-plus supervised (est.) | 25–35M driverless (est.) | Tesla: adding est. 2–3B miles per year; Waymo: adding est. 25–40M miles per year |
| Weekly mile generation (est.) | Tens of millions per week (est.) | 450,000–750,000 driverless miles per week (est.) | Tesla grows with FSD attach rate; Waymo grows with fleet size |
| Disengagement rate (FSD est.) | 0.03–0.05 per 1,000 miles (est.) | N/A (no supervised driving) | Key metric: rate continues declining → FSD approaching driverless threshold |
| Driverless miles per week (est.) | est. 0 driverless miles (Austin supervised only) | est. 450,000–750,000 driverless miles per week (est.) | Tesla: first driverless miles in Austin pending FMVSS; Waymo: grows with Gen 6 fleet ramp |
| Model improvement rate | FSD v10 → v13: disengagement rate improved roughly 40x (from est. 1 per 500 miles to est. 1 per 20,000-plus miles) in about 3 years (est.) | Waymo: disengagement rate not applicable (no in-vehicle supervisor); operational capability improvements tracked via new city launches and edge case handling | Both showing rapid improvement; different metrics |
| Data advantage durability | Tesla’s scale advantage grows with every new FSD vehicle sold; durable as long as FSD attach rates hold | Waymo’s driverless quality advantage durable as long as Tesla cannot bridge the supervised-to-driverless gap without driverless-specific training data | The key question for both: can Tesla’s scale bridge the quality gap, or does driverless require driverless data? FSD v13’s trajectory suggests scale is winning — but the experiment is not over |
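
Two of the KPIs above can be restated with short calculations, sketched here in Python. The 40x-in-three-years improvement is converted to an equivalent annual factor, and the cumulative totals are extended to year-end 2026. The projection assumes the estimated annual add rates hold constant over the next half year, which is a simplifying assumption of this sketch rather than a claim from the table. Function names are illustrative.

```python
# Restating two KPIs from the table. Inputs are the article's
# estimates (est.); the linear projection is an added assumption.

def annualized_factor(total_factor, years):
    """Equivalent constant per-year factor for a multi-year improvement."""
    return total_factor ** (1.0 / years)


def project_cumulative(current_miles, miles_per_year, years):
    """Linear extrapolation: ASSUMES the annual add rate stays constant."""
    return current_miles + miles_per_year * years


if __name__ == "__main__":
    # est. 40x improvement over est. 3 years
    print(round(annualized_factor(40, 3), 2))

    half_year = 0.5  # mid-2026 to year-end 2026
    # Tesla: 6B today, adding est. 2-3B per year
    print(project_cumulative(6e9, 2e9, half_year),
          project_cumulative(6e9, 3e9, half_year))
    # Waymo: 25-35M today, adding est. 25-40M per year
    print(project_cumulative(25e6, 25e6, half_year),
          project_cumulative(35e6, 40e6, half_year))
```

On these inputs the annualized factor is about 3.4x per year, and year-end 2026 totals come out near 7.0–7.5 billion supervised miles for Tesla and 37.5–55 million driverless miles for Waymo.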
Section 6 — About This Series
This is article 173 in the Physical AI Benchmark Series. Previous articles have covered the ramp index, the humanoid race, unit economics, global competition, HD mapping, fleet operations, software and OTA, insurance and liability, consumer demand, competitive moats, Cybercab versus Model Y, safety data, Waymo Gen 6, Optimus manufacturing, scorecard snapshots, the 2030 forecast scenarios, the investor framework, Waymo’s city expansion pipeline, Tesla’s state approval map, AV weather and climate constraints, the talent war, the regulatory calendar, robotaxi fare pricing, the AV data flywheel comparison, the humanoid deployment tracker, the supply chain analysis, the consumer adoption demand index, Waymo’s standalone valuation and IPO analysis, the Tesla Dojo versus cloud compute build-vs-buy analysis, the Waymo-Uber partnership analysis, the Waymo Gen 6 vehicle deep dive, the Waymo driver software architecture, Tesla Optimus factory ramp, Tesla FSD timeline history, and Waymo’s city expansion playbook.
This article adds the data flywheel dimension: the 200:1 scale ratio between Tesla’s supervised miles and Waymo’s driverless miles, what each mile type actually trains, and why the supervised-to-driverless gap is the most important open question in the Physical AI data race.
Reminder: All mileage figures, disengagement rates, fleet sizes, and projections in this article are estimates based on publicly available information, company disclosures, and industry analysis. They are labeled “(est.)” throughout. They are not investment recommendations. Conduct your own due diligence and consult a licensed financial adviser before making any investment decisions.
Sources
- Tesla FSD cumulative miles: Tesla AI Day and earnings calls
- Waymo driverless miles and rides: Waymo blog
- Tesla FSD Data Engine: Tesla AI infrastructure
- Waymo safety report: Waymo
- FSD disengagement rates: Tesla AI Day and public disclosures