2026-06-18
Physical AI Safety Record — Waymo 6.8x Safer Claim, Tesla FSD Incidents, NHTSA Investigations, and What the Data Proves
Waymo claims 6.8x fewer injury crashes across 30M-plus driverless miles. Tesla FSD disengagement rate falls annually. Both need more unsupervised miles.
Safety is the central commercial and regulatory case for autonomous vehicles. Waymo has published peer-reviewed studies claiming 6.8x fewer injury-causing crashes than human drivers. Tesla publishes quarterly safety reports on FSD miles per intervention. NHTSA has opened and closed multiple investigations into both Autopilot and FSD. Yet the public narrative around AV safety is frequently shaped by headlines that do not distinguish between at-fault and not-at-fault crashes, supervised and unsupervised driving, or statistical significance and sample size.
This article is Article 151 in the Physical AI Benchmark Series. It benchmarks the published safety data, the methodology behind each claim, what the numbers actually prove versus suggest, and the key regulatory events that have shaped the public safety narrative. All figures labeled “(est.)” are derived from public disclosures, industry research, analyst estimates, and reported data rather than independently verified primary data. This article does not constitute investment advice.
Section 1 — Waymo’s Published Safety Record
| Study / metric | Result | Methodology | Limitation |
|---|---|---|---|
| Waymo 2023 peer-reviewed study (Nature journal partner) | 6.8x fewer injury-causing crashes than human drivers in comparable conditions (Waymo disclosed, published in peer-reviewed journal) | Compared Waymo’s San Francisco driverless operations to human-baseline crash rates from NHTSA and CA DMV data; adjusted for exposure (miles driven, road type, time of day) | Comparison baseline is aggregate human driving, not matched on identical routes; Waymo’s operational design domain (good weather, mapped urban areas) is safer than average human driving conditions |
| Waymo fatal crash record | Zero AV-at-fault fatalities in commercial driverless operations as of mid-2026 (Waymo disclosed) | Measured across 30M-plus driverless commercial miles in SF, Phoenix, LA, Austin | Small sample: 30M miles is large for AV but small for statistical significance on rare fatality events (US human average approximately 1 fatality per 100M miles) |
| Property damage crashes | Waymo reported reduced property-damage crashes vs human baseline; 2023 study: 2.1x fewer reportable crashes overall | Same peer-reviewed methodology | Same operational design domain caveat |
| San Francisco incidents (2022-2023) | Multiple Waymo vehicles involved in crashes caused by human drivers running red lights or rear-ending stationary Waymo vehicles; Waymo’s AV was not at fault | Most SF incidents involved human driver errors, not Waymo AV errors | Public perception of “Waymo crashes” does not distinguish at-fault vs not-at-fault incidents |
| Waymo vs Cruise comparison | Cruise (GM-backed AV) had a serious pedestrian dragging incident in Oct 2023 (pedestrian knocked down by human-driven vehicle, then Cruise vehicle pulled forward, dragging the pedestrian); Cruise suspended operations subsequently | Different incident; Cruise vehicle had a software decision error | Cruise incident set back the AV industry broadly; Waymo has not had a comparable incident |
| Waymo safety report cadence | Waymo publishes annual safety reports plus incident data to CA DMV (mandatory for CA driverless permits) | CA DMV incident database is public and searchable | CA DMV data covers CA operations only; no comparable mandatory disclosure in other states |
Reading Waymo’s Safety Numbers
The 6.8x figure is the most-cited data point in AV safety discourse, and it is important to understand both what it demonstrates and what it does not. The study was peer-reviewed and published in a Nature partner journal — this is a meaningful bar. The methodology adjusted for exposure by road type and time of day, which is more rigorous than a simple miles-driven comparison. The result is the strongest single piece of published evidence that a commercial AV operator has achieved meaningful safety advantages over aggregate human driving.
The limitation is structural rather than methodological: Waymo operates in a self-selected operational design domain. Good-weather, well-mapped urban driving in San Francisco is inherently safer than the full range of human driving conditions (ice, fog, rural roads, impaired drivers, school zones). A more conservative reading of the 6.8x figure is that Waymo, inside its chosen domain, has far fewer injury crashes than average human drivers who face a much wider range of conditions. That is meaningful, but it is not the same as being 6.8x safer than a human driving under identical conditions.
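To see why the choice of baseline matters, here is a minimal sketch. It is not Waymo’s methodology: every stratum, rate, and mileage below is invented for illustration, and the names `Stratum`, `aggregate_baseline`, and `matched_baseline` are hypothetical. It compares a single AV crash rate against a human baseline weighted by where humans actually drive, and against one reweighted to the AV’s own mix of miles.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One driving condition. Every number used below is invented."""
    name: str
    human_rate: float         # human crashes per million miles (hypothetical)
    human_mile_share: float   # share of all human miles (hypothetical)
    av_miles_millions: float  # AV miles driven in this condition (hypothetical)


def aggregate_baseline(strata):
    """Human crash rate weighted by where humans actually drive."""
    return sum(s.human_rate * s.human_mile_share for s in strata)


def matched_baseline(strata):
    """Human crash rate reweighted to the AV's own mix of miles."""
    total_av_miles = sum(s.av_miles_millions for s in strata)
    return sum(s.human_rate * s.av_miles_millions for s in strata) / total_av_miles


STRATA = [
    Stratum("mapped urban, clear weather", 2.0, 0.5, 9.0),
    Stratum("everything else", 4.0, 0.5, 1.0),
]
AV_RATE = 0.5  # AV crashes per million miles (hypothetical)

if __name__ == "__main__":
    for label, baseline in [("aggregate", aggregate_baseline(STRATA)),
                            ("matched", matched_baseline(STRATA))]:
        print(f"{label} baseline {baseline:.2f} -> {baseline / AV_RATE:.1f}x fewer crashes")
```

With these made-up inputs, the same AV crash rate looks 6.0x better against the aggregate baseline and 4.4x better against the matched one. The gap comes entirely from which human miles are counted.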
Section 2 — Tesla FSD Safety Data
| Metric | Result | Methodology | Limitation |
|---|---|---|---|
| Tesla quarterly safety report | Tesla publishes quarterly: miles per critical disengagement (driver-initiated); miles per ADAS intervention | Supervised FSD: every driver takeover is logged; Tesla reports aggregate fleet statistics | Supervised only: safety driver present in all reported miles; does not reflect unsupervised driverless safety |
| Tesla Q1 2026 safety report (est.) | Est. approximately 1 critical disengagement per 30,000-50,000 miles (est. based on disclosed trend) | Disengagement = driver takeover due to safety concern; does not include routine driver preference takeovers | Disengagement rate is an indirect safety proxy; a low rate means drivers rarely felt unsafe, not that crashes would not occur without a driver |
| FSD-related crashes (NHTSA SCI) | NHTSA Standing General Order: Tesla has reported FSD-related crashes; NHTSA has opened multiple investigations | NHTSA SCI (Special Crash Investigation) program tracks advanced driver assistance system crashes | All reported crashes involved supervised FSD; safety driver present; most resulted in minor damage |
| Autopilot vs FSD fatalities | Tesla has reported Autopilot-active fatalities; these involve Autopilot (lane-keeping assist), not FSD (the more advanced product) | Tesla reports Autopilot crash data as required by NHTSA | Autopilot and FSD are distinct systems; Autopilot is less capable; conflating them overstates FSD risk |
| NHTSA FSD investigations | NHTSA opened investigations into FSD behavior in specific scenarios (emergency vehicle detection, sun glare, toll booths); several closed without recall; some led to OTA software updates | An investigation does not imply a defect; OTA updates let fixes reach the whole fleet quickly | Multiple open investigations at any time create ongoing regulatory uncertainty |
| Tesla Robotaxi (unsupervised) safety record | Austin Robotaxi launch 2026: limited data; no fatalities disclosed; small sample size (tens of vehicles, weeks of operation) | Unsupervised commercial rides; no safety driver; small sample | Too early to draw statistical conclusions; most meaningful comparison point will emerge in 12-24 months |
Understanding the Supervised vs Driverless Distinction
Tesla’s quarterly safety reports are among the most detailed in the industry, and the declining disengagement rate is a genuine positive signal about supervised FSD maturity. However, the critical distinction for safety benchmarking is that supervised FSD data — where a licensed driver is present and legally required to monitor the system — is not a proxy for unsupervised driverless safety.
The reason is statistical as much as technical: when a supervising driver intervenes before a crash, no data is generated on what the system would have done without intervention. The disengagement data tells us how often drivers felt the need to take over, which correlates with safety but does not equal it. Tesla’s Austin Robotaxi launch in 2026 represents the first real-world accumulation of unsupervised safety data, but weeks of data across tens of vehicles is too small a sample for meaningful comparison to Waymo’s 30M-plus miles.
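A rough sketch of why disengagement counts bound, but do not determine, a driverless crash rate. The function name and the `crash_fraction` parameter are hypothetical: the share of disengagements that would have ended in a crash with no driver present is not disclosed in Tesla’s reports. The 40,000-mile input is simply the midpoint of the estimated 30,000-50,000 range in the table above.

```python
def implied_miles_per_crash(miles_per_disengagement, crash_fraction):
    """Miles between crashes if `crash_fraction` of disengagements
    would have become crashes with no driver present.

    Ignores crashes that happen without any disengagement, so this is
    an optimistic simplification.
    """
    if not 0.0 < crash_fraction <= 1.0:
        raise ValueError("crash_fraction must be in (0, 1]")
    return miles_per_disengagement / crash_fraction


if __name__ == "__main__":
    miles_per_disengagement = 40_000  # midpoint of the article's (est.) range
    for fraction in (1.0, 0.1, 0.01):  # assumed values; the true one is unknown
        miles = implied_miles_per_crash(miles_per_disengagement, fraction)
        print(f"crash_fraction={fraction:<5} -> about {miles:,.0f} miles per crash")
```

The implied figure moves by two orders of magnitude depending on an input that supervised data cannot supply.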
Section 3 — NHTSA Regulatory Events Timeline
| Date | Event | Outcome | Impact |
|---|---|---|---|
| 2021 | NHTSA opens Autopilot investigation into crashes with emergency vehicles | Identified pattern; Tesla deployed OTA update to address emergency vehicle detection | OTA resolution; investigation closed; precedent set for NHTSA OTA resolution |
| 2022 | NHTSA expands Autopilot investigation to 830,000 vehicles | Tesla recalls Autopilot software via OTA update; NHTSA closes investigation | A very large “recall” by vehicle count, delivered entirely OTA; no physical action required by owners |
| 2023 | NHTSA investigates FSD beta phantom braking (unexpected deceleration) | Tesla OTA update; investigation closed | Phantom braking = significant safety concern in real traffic; addressed by software update |
| 2023 | Cruise pedestrian dragging incident (NOT Tesla/Waymo) | Cruise suspended CA driverless operations; CA DMV revoked Cruise permit | Damaged entire AV industry’s public trust; accelerated NHTSA scrutiny of all AV operators |
| 2024 | NHTSA investigates FSD behavior in sun glare conditions | Investigation ongoing as of mid-2026 (est.) | Represents a known hard sensor case for camera-only systems |
| 2024-2025 | NHTSA monitors Tesla Austin Robotaxi launch under Standing General Order | Tesla required to report incidents; standard for all AV commercial operators | No recall actions in Austin Robotaxi as of mid-2026 (est.) |
| 2026 | Waymo CA DMV annual incident report | Continued pattern of low at-fault incidents; human-caused crashes dominate incident reports | Supports Waymo’s safety case; annual public disclosure maintains accountability |
The NHTSA Investigation Lifecycle
NHTSA investigations are frequently reported as safety failures, but the investigation lifecycle tells a more nuanced story. An NHTSA investigation is opened based on a pattern of reported incidents — it does not imply that a defect has been found or that the system is unsafe. The majority of Tesla AV-related investigations have closed with OTA software updates rather than physical recalls, which is a structurally positive outcome: the safety issue was identified, addressed, and deployed to the entire fleet without owners needing to visit a dealer.
The 2022 Autopilot recall, covering 830,000 vehicles and delivered entirely via OTA, was very large by vehicle count. It required no physical action from owners and was completed in days rather than years, which represents a new model for automotive safety remediation. This OTA resolution model is both a Tesla competitive advantage and a template that regulators are still developing frameworks around.
The Cruise October 2023 incident deserves separate treatment because it is frequently cited alongside Waymo and Tesla incidents without noting that Cruise is a different company and a different system. The Cruise incident — in which a pedestrian who had already been struck by a human driver was subsequently dragged by a Cruise AV that failed to recognize the pedestrian under the vehicle — was a genuine software failure with serious consequences. Cruise suspended operations and the CA DMV revoked its permit. This incident had no equivalent at Waymo or Tesla, but it materially damaged public and regulatory confidence in the AV sector broadly.
Section 4 — How to Read AV Safety Statistics: Methodology Matters
| Issue | What it means | Why it matters |
|---|---|---|
| Operational design domain bias | Waymo operates in mapped urban areas in good weather; human baseline includes all conditions (ice, fog, rural, impaired drivers) | Waymo’s operational domain is inherently safer than the average human driving environment; the 6.8x figure may overstate the advantage over human drivers in similar conditions |
| Sample size for rare events | At the human average of roughly 1 fatality per 100M miles, about 0.3 fatalities would be expected in 30M miles; zero observed is consistent with the human rate and does not prove a large safety advantage | Absence of a rare event in a small sample is weak evidence of being safer by a large margin; need 1B-plus miles for statistical confidence on fatalities |
| Supervised vs driverless comparison | Tesla’s safety data is supervised FSD (driver present); Waymo’s driverless data has no driver; comparing the two directly is a category error | The safety question for robotaxi is unsupervised driverless, not supervised FSD; Tesla’s supervised data is not a proxy for its driverless safety |
| At-fault vs any crash | Many “AV crashes” reported in media are crashes where a human driver hit the AV; at-fault rate is more meaningful than incident rate | Public reporting often does not distinguish; Waymo is frequently hit by human drivers in dense urban areas |
| Disengagement as safety proxy | Low disengagement rate = drivers rarely felt unsafe; does not directly measure crash probability without a driver | The critical question is: what would have happened if no driver were present? Disengagement data cannot answer this directly |
| Publication bias | Companies publish favorable data; unfavorable incidents are disclosed under regulatory requirement (NHTSA, CA DMV), not voluntarily | Safety reports should be read alongside mandatory regulatory disclosures, not as standalone marketing |
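The at-fault versus any-crash distinction in the table can be made concrete with a small sketch. The `Incident` record and its `av_at_fault` field are hypothetical, and the sample data is invented; it is an assumption here that each incident has already been coded for fault.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    incident_id: str
    av_at_fault: bool  # hypothetical field; assumes fault was coded beforehand


def rates_per_million_miles(incidents, miles_millions):
    """Return (any-crash rate, AV-at-fault rate) per million miles."""
    if miles_millions <= 0:
        raise ValueError("miles_millions must be positive")
    at_fault = sum(1 for incident in incidents if incident.av_at_fault)
    return len(incidents) / miles_millions, at_fault / miles_millions


if __name__ == "__main__":
    # Invented sample: 10 incidents over 5 million miles, 2 with the AV at fault.
    sample = [Incident(f"inc-{n}", av_at_fault=(n < 2)) for n in range(10)]
    any_rate, fault_rate = rates_per_million_miles(sample, 5.0)
    print(f"any-crash rate: {any_rate:.1f} per million miles")
    print(f"at-fault rate:  {fault_rate:.1f} per million miles")
```

In this invented sample the headline incident rate is five times the at-fault rate, which is the gap that undifferentiated reporting hides.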
The Statistical Honesty Required for Fair AV Safety Assessment
The most important methodological point in AV safety benchmarking is the difference between absence of evidence and evidence of absence. Waymo’s zero at-fault fatalities in 30M-plus commercial driverless miles is a meaningful positive result, but it is not statistical proof that Waymo is dramatically safer than human drivers on fatalities. At the US human baseline of approximately 1 fatality per 100M miles, Waymo’s 30M-mile fleet would statistically expect 0.3 fatalities — and zero observed is consistent with that expectation. The result would be more statistically meaningful at 1B miles, and decisive at 10B miles.
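The arithmetic behind this point can be sketched with a simple Poisson model. This is an illustration under stated assumptions, not an analysis from Waymo or NHTSA: it treats fatalities as independent events at a constant rate, uses the approximate baseline of 1 fatality per 100M miles quoted above, and the function names are hypothetical.

```python
import math

HUMAN_FATALITY_RATE = 1.0  # fatalities per 100M miles (the article's approximate baseline)


def expected_fatalities(miles, rate_per_100m=HUMAN_FATALITY_RATE):
    """Expected number of fatalities over `miles` at the given rate."""
    return miles / 100e6 * rate_per_100m


def prob_zero_fatalities(miles, rate_per_100m=HUMAN_FATALITY_RATE):
    """Poisson probability of observing zero fatalities at the given rate."""
    return math.exp(-expected_fatalities(miles, rate_per_100m))


def fatality_free_miles_needed(rate_per_100m, confidence=0.95):
    """Fatality-free miles needed before `rate_per_100m` becomes implausible
    at the chosen confidence level: solve exp(-expected) = 1 - confidence."""
    return -math.log(1.0 - confidence) / rate_per_100m * 100e6


if __name__ == "__main__":
    print(f"expected fatalities in 30M miles: {expected_fatalities(30e6):.2f}")
    print(f"P(zero) in 30M miles at the human rate: {prob_zero_fatalities(30e6):.2f}")
    print(f"to exclude the human rate: {fatality_free_miles_needed(1.0) / 1e6:.0f}M miles")
    print(f"to exclude 1/6.8 of it: {fatality_free_miles_needed(1.0 / 6.8) / 1e9:.1f}B miles")
```

Under these assumptions, zero fatalities in 30M miles would occur about 74% of the time even at the human rate. Roughly 300M fatality-free miles would be needed to exclude the human rate at 95% confidence, and about 2B to support a 6.8x advantage on fatalities specifically. Any observed fatality pushes those requirements higher.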
This is not a criticism of Waymo’s safety record — zero at-fault fatalities in commercial driverless operations is an extraordinary achievement. The point is epistemic: extraordinary claims (6.8x safer, zero fatalities) require the statistical sample to support the confidence level of the claim. The AV industry is still in the data-accumulation phase, and honest safety benchmarking requires acknowledging this limitation alongside celebrating the genuine progress.
Section 5 — Safety Benchmark Scorecard
| Dimension | Waymo | Tesla FSD | Evidence quality | Notes |
|---|---|---|---|---|
| At-fault fatal crash rate | Zero in 30M-plus commercial driverless miles (Waymo disclosed) | Not applicable (supervised; safety driver present) | Moderate: sample too small for statistical confidence | Waymo’s zero fatalities is encouraging but is also the expected outcome at this mileage; not statistical proof of human-level safety |
| Peer-reviewed safety study | 6.8x fewer injury crashes (2023, peer-reviewed, Waymo-funded) | No equivalent peer-reviewed driverless study | Moderate: methodology has operational design domain caveat | Best available peer-reviewed analysis of Waymo safety; Waymo-funded, so the caveats are important |
| Regulatory record (NHTSA) | No recall actions; CA DMV permit maintained continuously | Multiple investigations; OTA resolutions; no physical recalls | Even: different systems and products | NHTSA investigations are routine for any technology at scale; OTA resolution is a positive outcome |
| Unsupervised commercial safety data | 4-plus years, 30M-plus miles across 4 cities | Weeks, tens of vehicles, Austin only (2026) | Waymo decisive (volume) | Tesla’s sample will grow; meaningful comparison in 2027-2028 |
| Trend direction | Declining incident rate as markets mature (Phoenix best) | Declining disengagement rate annually (est.) | Both positive | Both improving; Waymo’s improvement is in commercial driverless context |
Overall Verdict
Waymo has the most rigorous published AV safety record in the world: zero at-fault fatalities in 30M-plus driverless commercial miles, a peer-reviewed 6.8x safety advantage study (with methodology caveats), and 4-plus years of incident disclosure to CA DMV. Tesla FSD has a strong supervised safety record with declining disengagement rates, but its unsupervised commercial safety record is still at an early stage in 2026, with too little data for comparison.
The honest conclusion: Waymo has demonstrated meaningful safety in its operational design domain; Tesla has demonstrated improving supervised safety; both need more unsupervised driverless miles to make statistically robust claims about human-level safety parity. The next 24 months — as Tesla’s Austin Robotaxi operation accumulates commercial driverless miles — will be the most important period yet for closing the comparison gap.
Note: All figures labeled “(est.)” are derived from public disclosures, industry research, analyst estimates, and reported data as of mid-2026. This article does not constitute investment advice.
Sources
- Waymo safety study (peer-reviewed, Nature partner journal)
- Tesla quarterly vehicle safety report (Tesla)
- NHTSA AV incident database (NHTSA)
- California DMV AV incident reports (CA DMV)
- Cruise pedestrian dragging incident (NHTSA investigation)