AI-Daily-Builder

2026-06-18

Physical AI Data Flywheel 2026 — Tesla 6B-Plus Supervised FSD Miles vs Waymo 30M Driverless Miles: The Training Data Benchmark

Tesla has 6B-plus supervised FSD miles from 6M vehicles. Waymo has 30M driverless miles. Scale vs quality is the core Physical AI training data race.

Article 173 in the Physical AI Benchmark Series — Training Data and the Data Flywheel

The most fundamental asymmetry in the Physical AI training race is not compute power, valuation, or regulatory approval. It is data. As of mid-2026, Tesla has accumulated an estimated 6 billion-plus cumulative supervised Full Self-Driving (FSD) miles from a fleet of approximately 6 million consumer vehicles (est.). Waymo has accumulated approximately 25–35 million cumulative fully driverless commercial miles (est.) across its robotaxi fleet in Phoenix, San Francisco, Los Angeles, and Austin. Taking roughly 30 million as the Waymo midpoint, the raw numbers give Tesla a scale advantage of about 200:1. But the data is not equivalent: every one of Waymo’s miles was driven with no human in the vehicle in a position to take over, while every one of Tesla’s miles had a licensed human driver present, ready to override. This article benchmarks which data advantage is more strategically durable, what each type of data actually trains, and what the flywheel looks like as both fleets grow.


Section 1 — The Two Data Flywheels: Scale vs. Purity

| Dimension | Tesla FSD data flywheel | Waymo driverless data flywheel |
| --- | --- | --- |
| Total miles (est.) | 6 billion-plus cumulative supervised FSD miles (est. as of mid-2026) | 25–35 million cumulative driverless commercial miles (est. as of mid-2026) |
| Miles per week (est.) | est. tens of millions of supervised miles per week (6M vehicles times average FSD usage) | est. 150,000-plus weekly paid rides times average 3–5 miles per ride = est. 450,000–750,000 driverless miles per week |
| Data generation ratio | Tesla generates est. 50–120x more miles per week than Waymo, based on the annual rates of 2–3 billion vs. 25–40 million miles | Waymo generates far fewer total miles but all are fully autonomous |
| Driver present? | Yes: a licensed human driver must supervise FSD at all times; the driver can disengage at any moment | No: no human in the vehicle; the vehicle makes every driving decision autonomously |
| Data label quality | Supervised miles contain both successful AI decisions AND human overrides (disengagements); the override moments are the most valuable training signal | Fully autonomous miles: every decision the AI made was the actual outcome; there is no human correction signal, but the AI must be robust enough to handle every scenario without help |
| What the data trains | FSD: trains the neural network to match human driving behavior and recover when it would have needed intervention; disengagement events auto-label edge cases for retraining | Waymo: trains the system to handle the full distribution of real-world scenarios completely autonomously, including recovery from its own mistakes without human help |
| Data engine | Tesla’s Data Engine: FSD trips are video-recorded; the onboard model auto-labels frames; low-confidence frames are flagged and sent to the cloud for human review; auto-labeling scales; human review focuses on hard cases | Waymo’s equivalent: all driverless trips are logged; any scenario requiring remote ops attention is flagged; simulation generates synthetic edge cases |
| Flywheel compounding | More Tesla vehicles → more FSD miles → more training data → better FSD → higher attach rate → more revenue → more R&D → better FSD | More Waymo driverless rides → more driverless miles → more edge cases discovered → better system → more permits → more cities → more rides |
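
The Data Engine row describes a triage step: the onboard model labels every frame, and only low-confidence frames go to human reviewers. A minimal sketch of that routing follows, assuming each frame carries a single confidence score. The `Frame` structure, the `triage_frames` function, and the 0.6 threshold are hypothetical, chosen for illustration; they are not Tesla's actual pipeline or published values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One auto-labeled camera frame (hypothetical structure)."""
    frame_id: int
    label: str          # label proposed by the onboard model
    confidence: float   # model confidence in [0.0, 1.0]


def triage_frames(frames, review_threshold=0.6):
    """Split frames into auto-accepted labels and a human-review queue.

    review_threshold is an illustrative value, not a published one.
    """
    auto_labeled, needs_review = [], []
    for frame in frames:
        if frame.confidence < review_threshold:
            needs_review.append(frame)   # hard case: route to human review
        else:
            auto_labeled.append(frame)   # easy case: keep the machine label
    return auto_labeled, needs_review


frames = [
    Frame(1, "clear_lane", 0.97),
    Frame(2, "cyclist_merging", 0.41),
    Frame(3, "construction_cones", 0.58),
    Frame(4, "clear_lane", 0.93),
]
auto_labeled, needs_review = triage_frames(frames)
print("auto-labeled:", [f.frame_id for f in auto_labeled])
print("human review:", [f.frame_id for f in needs_review])
```

The design point is that human effort scales with the number of hard frames rather than with total miles driven, which is what lets auto-labeling keep pace with a fleet of this size.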

The scale ratio is striking. Tesla is adding an estimated 2–3 billion supervised miles per year; Waymo is adding an estimated 25–40 million driverless miles per year. Tesla’s raw mile generation rate exceeds Waymo’s by roughly two orders of magnitude. But the 200:1 ratio in miles does not translate to a 200:1 advantage in training signal quality — because the two mile types are training for different things.
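
The ratios quoted above can be reproduced from the article's own estimates. The short sketch below does only that arithmetic; the 30 million figure is the midpoint of the 25–35 million range, and every input is an estimate rather than measured data.

```python
# All inputs are the article's estimates (est.), not measured values.
TESLA_CUMULATIVE_MILES = 6e9
WAYMO_CUMULATIVE_MILES = 30e6          # midpoint of the 25-35M range
TESLA_MILES_PER_YEAR = (2e9, 3e9)      # (low, high)
WAYMO_MILES_PER_YEAR = (25e6, 40e6)    # (low, high)
WAYMO_RIDES_PER_WEEK = 150_000
WAYMO_MILES_PER_RIDE = (3, 5)          # (low, high)

cumulative_ratio = TESLA_CUMULATIVE_MILES / WAYMO_CUMULATIVE_MILES

# Bound the rate ratio: the smallest value pairs Tesla's low rate with
# Waymo's high rate, and the largest value does the reverse.
rate_ratio_low = TESLA_MILES_PER_YEAR[0] / WAYMO_MILES_PER_YEAR[1]
rate_ratio_high = TESLA_MILES_PER_YEAR[1] / WAYMO_MILES_PER_YEAR[0]

waymo_weekly = tuple(WAYMO_RIDES_PER_WEEK * m for m in WAYMO_MILES_PER_RIDE)
tesla_weekly = tuple(m / 52 for m in TESLA_MILES_PER_YEAR)

print(f"cumulative ratio: {cumulative_ratio:.0f}:1")
print(f"annual rate ratio: {rate_ratio_low:.0f}:1 to {rate_ratio_high:.0f}:1")
print(f"Waymo weekly miles: {waymo_weekly[0]:,} to {waymo_weekly[1]:,}")
print(f"Tesla weekly miles: {tesla_weekly[0]:,.0f} to {tesla_weekly[1]:,.0f}")
```

With these inputs the cumulative ratio is 200:1 and the annual rate ratio falls between 50:1 and 120:1, which is what "roughly two orders of magnitude" refers to.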


Section 2 — What Supervised vs. Driverless Miles Actually Train Differently

| Training dimension | What Tesla supervised miles train | What Waymo driverless miles train | Strategic implication |
| --- | --- | --- | --- |
| Normal driving | Human-matched driving behavior: lane-keeping, speed, following distance, intersection handling, all calibrated to how humans drive | Waymo’s own policy for normal driving: developed independently of human behavior norms; optimized for passenger comfort, safety, and efficiency | Both are high-quality for normal driving; Tesla’s is more human-like, Waymo’s is potentially more optimal |
| Edge case discovery | Human disengagement = auto-labeled edge case; 6B miles generates millions of edge-case examples at scale; edge cases are rare per mile but abundant at fleet scale | Driverless operation: Waymo encounters edge cases in the wild without a human backstop; the system must handle them; each handled edge case is a proof of robustness; remote ops attention = edge case flag | Tesla discovers more total edge cases (scale); Waymo’s edge cases are more valuable individually (proven handleable without a human) |
| Recovery from mistakes | Human catches mistakes before they become incidents; the AI never learns to recover from deep failure states without human help | Waymo must recover from its own mistakes autonomously; this trains recovery behavior that supervised systems do not learn | Waymo’s recovery training is a structural advantage for fully driverless operation; supervised systems have a gap here |
| Rare weather and conditions | Tesla fleet in all 50 states plus international: snow, ice, fog, heavy rain, construction zones; broad environmental coverage at scale | Waymo fleet in 4 Sun Belt and mild-climate cities (Phoenix, SF, LA, Austin); limited weather diversity vs. Tesla’s global fleet | Tesla has a significant weather diversity advantage; Waymo’s Phoenix and SF data is high-quality but narrow geographically |
| Night and low-light | 6M vehicles operate 24/7 including nights; massive night driving dataset | 150,000-plus weekly rides including 24/7 SF operation; but smaller fleet means less absolute night data | Tesla has more absolute night miles; Waymo’s night data from SF’s 24/7 operation is high-quality urban night data |
| Novel scenarios | At 6B miles, Tesla has encountered millions of unusual scenarios; Data Engine surfaces the ones the model got wrong | At 30M driverless miles, Waymo encounters novel scenarios that must be handled without human help, a higher bar for what counts as “handled” | Different but complementary; no clear winner on novel scenario training |
| Long tail coverage | Tesla: 6B miles generates enormous long-tail coverage at the cost of data quality per mile (supervised) | Waymo: 30M driverless miles generates deep long-tail coverage at much higher per-mile quality | Waymo’s long-tail coverage is proven (vehicle handled it); Tesla’s is observed but with a human safety net |

The critical distinction is the meaning of “handled.” In Tesla’s supervised data, an edge case “handled” means a human driver was present and did not need to intervene — or did intervene, generating a disengagement. In Waymo’s driverless data, an edge case “handled” means the autonomous system navigated it to completion without any human involvement. These are different proofs of capability.
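
One way to keep the two meanings of “handled” apart in a dataset is to record what each edge-case encounter actually proves. The sketch below is a hypothetical labeling scheme, not either company's schema: the `Evidence` categories and the `classify_edge_case` function are illustrative names built from the three cases described above.

```python
from enum import Enum


class Evidence(Enum):
    """What a logged edge case demonstrates (labels are illustrative)."""
    CORRECTION_SIGNAL = "human took over; labeled example of a model shortfall"
    SUPERVISED_PASS = "no takeover, but a human backstop was present"
    AUTONOMOUS_PASS = "completed with no human in the vehicle"


def classify_edge_case(supervised: bool, disengaged: bool) -> Evidence:
    """Map one edge-case encounter to the kind of proof it provides."""
    if not supervised:
        if disengaged:
            # A driverless mile has no in-vehicle human who could disengage.
            raise ValueError("driverless miles cannot contain a disengagement")
        return Evidence.AUTONOMOUS_PASS
    return Evidence.CORRECTION_SIGNAL if disengaged else Evidence.SUPERVISED_PASS


print(classify_edge_case(supervised=True, disengaged=False).name)
print(classify_edge_case(supervised=True, disengaged=True).name)
print(classify_edge_case(supervised=False, disengaged=False).name)
```

Keeping `SUPERVISED_PASS` and `AUTONOMOUS_PASS` as separate labels makes the distinction explicit: both count as “handled,” but only the second shows the system finishing the scenario without a safety net.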


Section 3 — The Disengagement Data: Tesla’s Most Valuable Signal

| Disengagement dimension | Detail | Notes |
| --- | --- | --- |
| What a disengagement is | A human FSD user takes control of the vehicle, either voluntarily (does not like what FSD is doing) or because FSD initiated a handoff | Each disengagement is an auto-labeled training example: “at this moment, in this context, the human decided FSD was not handling the situation correctly” |
| Disengagement rate trend (est.) | FSD v12/v13 critical disengagement rate: est. 0.03–0.05 per 1,000 miles, meaning roughly 1 disengagement every 20,000–33,000 miles of FSD operation | Every time the rate improves, there are fewer training examples per mile, but at 6B total miles the absolute count is still massive |
| How Tesla uses disengagement data | Disengagement frames are auto-labeled and fed back to the training pipeline; a single disengagement event generates hundreds of frames of “FSD was about to do X, human corrected” training signal | This creates a closed feedback loop that is unique to supervised driving: human behavior is the training signal |
| Waymo equivalent | Waymo has no disengagement data (there is no human to disengage); instead, remote ops contact rates serve as a proxy for “situation the system found challenging” | Remote ops contacts are rarer than Tesla disengagements and represent higher-severity situations |
| Why disengagement data matters for FSD | Tesla is training an AI to drive like a human; human corrections tell the AI exactly when and how it deviated from human judgment; this feedback loop has driven FSD from 1 disengagement per 500 miles (v10 est.) to 1 per 20,000-plus miles (v13 est.) | The improvement trajectory suggests the feedback loop is working extremely well; FSD’s rapid improvement curve is partly explained by the quality of the disengagement training signal |
| The supervised-to-driverless gap | Training on supervised miles optimizes for “human would not intervene”; the transition to driverless requires “system never needs human intervention”; these are subtly but importantly different optimization targets | Tesla’s FSD trained on supervised data must transfer to driverless operation; this is non-trivial and is the central challenge of Tesla’s autonomous transition |
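
The rate figures in the table convert into one another with simple arithmetic. The sketch below reproduces the conversions from the article's estimates and adds one illustrative projection: it applies the current critical-rate range to a year of fleet mileage. That projection is an assumption made here for scale, not a figure from any disclosure, and it covers critical disengagements only, so the count of all disengagements would be higher.

```python
# Inputs are the article's estimates (est.); nothing here is measured data.
CRITICAL_RATE_PER_1000_MILES = (0.03, 0.05)   # FSD v12/v13 range (low, high)
V10_MILES_PER_DISENGAGEMENT = 500
V13_MILES_PER_DISENGAGEMENT = 20_000
TESLA_MILES_PER_YEAR = (2e9, 3e9)             # (low, high)


def miles_per_disengagement(rate_per_1000_miles: float) -> float:
    """Convert a per-1,000-mile rate into miles between disengagements."""
    return 1000.0 / rate_per_1000_miles


def disengagements_for(miles: float, rate_per_1000_miles: float) -> float:
    """Expected disengagement count over a given mileage at a fixed rate."""
    return miles * rate_per_1000_miles / 1000.0


# The lowest rate gives the longest interval between disengagements.
interval_high = miles_per_disengagement(CRITICAL_RATE_PER_1000_MILES[0])
interval_low = miles_per_disengagement(CRITICAL_RATE_PER_1000_MILES[1])
improvement = V13_MILES_PER_DISENGAGEMENT / V10_MILES_PER_DISENGAGEMENT

# Assumption: the current rate range applied to one year of fleet mileage.
yearly_low = disengagements_for(TESLA_MILES_PER_YEAR[0],
                                CRITICAL_RATE_PER_1000_MILES[0])
yearly_high = disengagements_for(TESLA_MILES_PER_YEAR[1],
                                 CRITICAL_RATE_PER_1000_MILES[1])

print(f"{interval_low:,.0f} to {interval_high:,.0f} miles per critical disengagement")
print(f"{improvement:.0f}x improvement from v10 to v13")
print(f"{yearly_low:,.0f} to {yearly_high:,.0f} critical disengagements per year")
```

Under these inputs the 0.03–0.05 rate corresponds to one critical disengagement every 20,000 to 33,333 miles, the v10-to-v13 change is a 40x improvement, and a year of fleet mileage would still yield on the order of 60,000 to 150,000 critical events.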

The disengagement feedback loop is Tesla’s most distinctive training asset. No other company has roughly 6 million consumer vehicles generating real-time human corrections to AI driving decisions. Each correction is a labeled training example that Waymo cannot collect, because Waymo has no human supervisors in its vehicles to generate the correction signal.
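
To make “hundreds of frames per disengagement” concrete, the sketch below cuts a window of frames around a takeover and tags each frame by who was in control. The function name, the 30 frames-per-second rate, and the 5-second and 2-second windows are all hypothetical values chosen for illustration; the article does not state Tesla's actual camera rate or clip length.

```python
def disengagement_clip(disengage_frame: int, total_frames: int,
                       fps: int = 30, seconds_before: float = 5.0,
                       seconds_after: float = 2.0):
    """Return (frame_index, controller) pairs around one disengagement.

    fps and the window lengths are illustrative defaults, not known values.
    """
    # Clamp the window so it never runs past either end of the recording.
    start = max(0, disengage_frame - int(seconds_before * fps))
    stop = min(total_frames, disengage_frame + int(seconds_after * fps))
    return [
        (i, "model_in_control" if i < disengage_frame else "human_in_control")
        for i in range(start, stop)
    ]


clip = disengagement_clip(disengage_frame=9_000, total_frames=18_000)
model_frames = sum(1 for _, who in clip if who == "model_in_control")
print(len(clip), "frames in clip,", model_frames, "before the takeover")
```

The frames before the takeover show what the model was about to do, and the frames after show the human correction, which together form the paired signal described in the table.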


Section 4 — Fleet Scale vs. Fleet Quality: Which Flywheel Wins?

| Scenario | Tesla position | Waymo position | Who wins in this scenario |
| --- | --- | --- | --- |
| Improving normal driving performance | Enormous dataset; fast iteration cycles; billions of miles to learn from | High-quality autonomous data; each mile proves robustness | Both strong; Tesla’s scale advantage is decisive for rare-but-real scenarios |
| Achieving driverless operation | Must bridge the supervised-to-driverless gap; large fleet but data optimized for supervised driving | Already operating driverless; data directly reflects driverless performance requirements | Waymo: driverless training data is more directly relevant to driverless deployment |
| Expanding to new geographies | 6M vehicles already in new geographies: snow, ice, international roads | Must physically deploy fleet to a new city; cannot pre-train on data it does not have | Tesla: massive geographic coverage advantage for new environment pre-training |
| Weather robustness | Snow, ice, fog, rain data at scale (US plus international fleet) | Sun Belt focus; limited severe weather data | Tesla: decisive weather diversity advantage |
| Edge case recovery without human | Supervised data: edge cases observed but human prevents worst outcomes | Driverless data: edge cases handled completely autonomously | Waymo: recovery training without human intervention is uniquely driverless |
| Training efficiency | FSD v12: end-to-end model; billions of miles train one neural network | Waymo: modular systems plus end-to-end components; smaller dataset but higher per-mile information density | Roughly equal; different architectures extract different value from their respective datasets |
| Overall flywheel verdict | Tesla wins on scale, speed, and geographic coverage | Waymo wins on data quality for driverless-specific capabilities | The winner of the long-term flywheel depends on whether the supervised-to-driverless gap can be bridged by scale (Tesla’s bet) or whether driverless-specific training data is irreplaceable (Waymo’s structural position) |

The central strategic question is whether Tesla’s approach — training on supervised miles at massive scale and using end-to-end neural networks to generalize — can produce driverless-quality driving behavior without ever collecting driverless-specific training data. FSD v12 and v13’s improvement trajectory suggests the approach is working faster than many expected. But the experiment is not finished.


Section 5 — Data Flywheel Ramp Index: Key Metrics to Track

| KPI | Tesla Q2 2026 | Waymo Q2 2026 | H2 2026 trajectory |
| --- | --- | --- | --- |
| Cumulative miles (est.) | 6B-plus supervised (est.) | 25–35M driverless (est.) | Tesla: est. +2–3B miles per year; Waymo: est. +25–40M miles per year |
| Weekly mile generation (est.) | tens of millions per week (est.) | 450,000–750,000 driverless miles per week (est.) | Tesla grows with FSD attach rate; Waymo grows with fleet size |
| Disengagement rate (FSD est.) | 0.03–0.05 per 1,000 miles (est.) | N/A (no supervised driving) | Key metric: rate continues declining → FSD approaching driverless threshold |
| Driverless miles per week (est.) | 0 driverless miles (est.; Austin supervised only) | 450,000–750,000 driverless miles per week (est.) | Tesla: first driverless miles in Austin pending FMVSS; Waymo: grows with Gen 6 fleet ramp |
| Model improvement rate | FSD v10 → v13: disengagement rate improved est. 40x in est. 3 years | Waymo: disengagement rate not applicable (no in-vehicle supervisor); operational capability improvements tracked via new city launches and edge case handling | Both showing rapid improvement; different metrics |
| Data advantage durability | Tesla’s scale advantage grows with every new FSD vehicle sold; durable as long as FSD attach rates hold | Waymo’s driverless quality advantage is durable as long as Tesla cannot bridge the supervised-to-driverless gap without driverless-specific training data | The key question for both: can Tesla’s scale bridge the quality gap, or does driverless require driverless data? FSD v13’s trajectory suggests scale is winning, but the experiment is not over |
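
The cumulative-miles row can be rolled forward to the end of 2026 with a simple linear projection. The sketch below assumes the annual rates stay constant over the half year, which is a simplification (the table itself expects both rates to grow), and uses 30 million as the midpoint of Waymo's current range. The function name `project_cumulative` is hypothetical.

```python
def project_cumulative(current: float, annual_low: float, annual_high: float,
                       years: float) -> tuple[float, float]:
    """Linear projection: current total plus a constant annual rate range."""
    return current + annual_low * years, current + annual_high * years


HALF_YEAR = 0.5  # Q2 2026 to end of 2026

# Inputs are the article's estimates (est.); 30e6 is the Waymo midpoint.
tesla_low, tesla_high = project_cumulative(6e9, 2e9, 3e9, HALF_YEAR)
waymo_low, waymo_high = project_cumulative(30e6, 25e6, 40e6, HALF_YEAR)

# Bound the cumulative ratio by pairing opposite ends of the two ranges.
ratio_low = tesla_low / waymo_high
ratio_high = tesla_high / waymo_low

print(f"Tesla end-2026: {tesla_low / 1e9:.1f}B to {tesla_high / 1e9:.1f}B miles")
print(f"Waymo end-2026: {waymo_low / 1e6:.1f}M to {waymo_high / 1e6:.1f}M miles")
print(f"cumulative ratio: {ratio_low:.0f}:1 to {ratio_high:.0f}:1")
```

Under these assumptions the cumulative ratio drifts down from 200:1 to roughly 140:1 to 176:1 by year end, because Waymo's annual additions are large relative to its cumulative base while Tesla's are not.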

Section 6 — About This Series

This is article 173 in the Physical AI Benchmark Series. Previous articles have covered the ramp index, the humanoid race, unit economics, global competition, HD mapping, fleet operations, software and OTA, insurance and liability, consumer demand, competitive moats, Cybercab versus Model Y, safety data, Waymo Gen 6, Optimus manufacturing, scorecard snapshots, the 2030 forecast scenarios, the investor framework, Waymo’s city expansion pipeline, Tesla’s state approval map, AV weather and climate constraints, the talent war, the regulatory calendar, robotaxi fare pricing, the AV data flywheel comparison, the humanoid deployment tracker, the supply chain analysis, the consumer adoption demand index, Waymo’s standalone valuation and IPO analysis, the Tesla Dojo versus cloud compute build-vs-buy analysis, the Waymo-Uber partnership analysis, the Waymo Gen 6 vehicle deep dive, the Waymo driver software architecture, Tesla Optimus factory ramp, Tesla FSD timeline history, and Waymo’s city expansion playbook.

This article adds the data flywheel dimension: the 200:1 scale ratio between Tesla’s supervised miles and Waymo’s driverless miles, what each mile type actually trains, and why the supervised-to-driverless gap is the most important open question in the Physical AI data race.

Reminder: All mileage figures, disengagement rates, fleet sizes, and projections in this article are estimates based on publicly available information, company disclosures, and industry analysis. They are labeled “(est.)” throughout. They are not investment recommendations. Conduct your own due diligence and consult a licensed financial adviser before making any investment decisions.

