AI-Daily-Builder

2026-06-18

Physical AI Safety Record — Waymo 6.8x Safer Claim, Tesla FSD Incidents, NHTSA Investigations, and What the Data Proves

Waymo claims 6.8x fewer injury crashes across 30M-plus driverless miles. Tesla FSD disengagement rate falls annually. Both need more unsupervised miles.


Safety is the central commercial and regulatory case for autonomous vehicles. Waymo has published peer-reviewed studies claiming 6.8x fewer injury-causing crashes than human drivers. Tesla publishes quarterly safety reports on FSD miles per intervention. NHTSA has opened and closed multiple investigations into both Autopilot and FSD. Yet the public narrative around AV safety is frequently shaped by headlines that do not distinguish between at-fault and not-at-fault crashes, supervised and unsupervised driving, or statistical significance and sample size.

This article is Article 151 in the Physical AI Benchmark Series. It benchmarks the published safety data, the methodology behind each claim, what the numbers actually prove versus suggest, and the key regulatory events that have shaped the public safety narrative. All figures labeled “(est.)” are derived from public disclosures, industry research, analyst estimates, and reported data rather than independently verified primary data. This article does not constitute investment advice.


Section 1 — Waymo’s Published Safety Record

| Study / metric | Result | Methodology | Limitation |
|---|---|---|---|
| Waymo 2023 peer-reviewed study (Nature partner journal) | 6.8x fewer injury-causing crashes than human drivers in comparable conditions (Waymo disclosed, published in peer-reviewed journal) | Compared Waymo’s San Francisco driverless operations to human-baseline crash rates from NHTSA and CA DMV data; adjusted for exposure (miles driven, road type, time of day) | Comparison baseline is aggregate human driving, not matched on identical routes; Waymo’s operational design domain (good weather, mapped urban areas) is safer than average human driving conditions |
| Waymo fatal crash record | Zero AV-at-fault fatalities in commercial driverless operations as of mid-2026 (Waymo disclosed) | Measured across 30M-plus driverless commercial miles in SF, Phoenix, LA, Austin | Small sample: 30M miles is large for an AV fleet but small for statistical significance on rare fatality events (US human average approximately 1 fatality per 100M miles) |
| Property damage crashes | Waymo reported reduced property-damage crashes vs human baseline; 2023 study: 2.1x fewer reportable crashes overall | Same peer-reviewed methodology | Same operational design domain caveat |
| San Francisco incidents (2022-2023) | Multiple Waymo vehicles involved in crashes caused by human drivers running red lights or rear-ending stationary Waymo vehicles; Waymo’s AV was not at fault | Most SF incidents involved human driver errors, not Waymo AV errors | Public perception of “Waymo crashes” does not distinguish at-fault vs not-at-fault incidents |
| Waymo vs Cruise comparison | Cruise (GM-backed AV) had a serious pedestrian dragging incident in Oct 2023 (pedestrian knocked down by human-driven vehicle, then Cruise vehicle pulled forward, dragging the pedestrian); Cruise suspended operations subsequently | Different incident; Cruise vehicle had a software decision error | Cruise incident set back the AV industry broadly; Waymo has not had a comparable incident |
| Waymo safety report cadence | Waymo publishes annual safety reports plus incident data to CA DMV (mandatory for CA driverless permits) | CA DMV incident database is public and searchable | CA DMV data covers CA operations only; no comparable mandatory disclosure in other states |

Reading Waymo’s Safety Numbers

The 6.8x figure is the most-cited data point in AV safety discourse, and it is important to understand both what it demonstrates and what it does not. The study was peer-reviewed and published in a Nature partner journal — this is a meaningful bar. The methodology adjusted for exposure by road type and time of day, which is more rigorous than a simple miles-driven comparison. The result is the strongest single piece of published evidence that a commercial AV operator has achieved meaningful safety advantages over aggregate human driving.

The limitation is structural, not methodological: Waymo operates in a self-selected operational design domain. The good-weather, well-mapped, urban San Francisco environment is inherently safer than the full range of human driving conditions (ice, fog, rural roads, impaired drivers, school zones). A more conservative reading is that the 6.8x figure compares Waymo inside its restricted domain with a human baseline that spans a much wider range of conditions. That is a meaningful result, but it is not the same as being 6.8x safer than a human driving under identical conditions.
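A small calculation shows why the choice of baseline matters. The sketch below computes the same rate ratio twice, once against an aggregate human baseline and once against a baseline restricted to comparable conditions. All crash counts and mileages, including the matched baseline, are hypothetical illustration values and are not taken from the Waymo study.

```python
def crashes_per_million_miles(crashes: float, miles: float) -> float:
    """Exposure-adjusted crash rate: crashes per one million miles driven."""
    return crashes / miles * 1_000_000


def rate_ratio(baseline_rate: float, av_rate: float) -> float:
    """How many times lower the AV rate is than the baseline rate."""
    return baseline_rate / av_rate


# HYPOTHETICAL numbers, chosen only so the aggregate ratio lands near 6.8.
av_rate = crashes_per_million_miles(crashes=10, miles=10_000_000)
aggregate_human_rate = crashes_per_million_miles(crashes=68, miles=10_000_000)
# HYPOTHETICAL: humans driving only in good weather on mapped urban streets.
matched_human_rate = crashes_per_million_miles(crashes=40, miles=10_000_000)

aggregate_ratio = rate_ratio(aggregate_human_rate, av_rate)
matched_ratio = rate_ratio(matched_human_rate, av_rate)

print(f"vs aggregate baseline: {aggregate_ratio:.1f}x")
print(f"vs matched baseline:   {matched_ratio:.1f}x")
```

The AV rate is identical in both comparisons; only the denominator of the comparison changes. If the matched human baseline is lower than the aggregate one, the headline ratio shrinks accordingly.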


Section 2 — Tesla FSD Safety Data

| Metric | Result | Methodology | Limitation |
|---|---|---|---|
| Tesla quarterly safety report | Tesla publishes quarterly: miles per critical disengagement (driver-initiated); miles per ADAS intervention | Supervised FSD: every driver takeover is logged; Tesla reports aggregate fleet statistics | Supervised only: a supervising driver is present in all reported miles; does not reflect unsupervised driverless safety |
| Tesla Q1 2026 safety report (est.) | Est. approximately 1 critical disengagement per 30,000-50,000 miles (est. based on disclosed trend) | Disengagement = driver takeover due to safety concern; does not include routine driver preference takeovers | Disengagement rate is an indirect safety proxy; a low rate means drivers rarely felt unsafe, not that crashes would not occur without a driver |
| FSD-related crashes (NHTSA SCI) | NHTSA Standing General Order: Tesla has reported FSD-related crashes; NHTSA has opened multiple investigations | NHTSA SCI (Special Crash Investigation) program tracks advanced driver assistance system crashes | All reported crashes involved supervised FSD with a supervising driver present; most resulted in minor damage |
| Autopilot vs FSD fatalities | Tesla has reported Autopilot-active fatalities; these involve Autopilot (lane-keeping assist), not FSD (the more advanced product) | Tesla reports Autopilot crash data as required by NHTSA | Autopilot and FSD are distinct systems; Autopilot is less capable; conflating them overstates FSD risk |
| NHTSA FSD investigations | NHTSA opened investigations into FSD behavior in specific scenarios (emergency vehicle detection, sun glare, toll booths); several closed without recall; some led to OTA software updates | Investigation does not imply defect; an OTA fix reaches the whole fleet without dealer visits | Multiple open investigations at any time create ongoing regulatory uncertainty |
| Tesla Robotaxi (unsupervised) safety record | Austin Robotaxi launch 2026: limited data; no fatalities disclosed; small sample size (tens of vehicles, weeks of operation) | Unsupervised commercial rides; no safety driver; small sample | Too early to draw statistical conclusions; most meaningful comparison point will emerge in 12-24 months |

Understanding the Supervised vs Driverless Distinction

Tesla’s quarterly safety reports are among the most detailed in the industry, and the declining disengagement rate is a genuine positive signal about supervised FSD maturity. However, the critical distinction for safety benchmarking is that supervised FSD data — where a licensed driver is present and legally required to monitor the system — is not a proxy for unsupervised driverless safety.

The reason is statistical as well as technical: when a supervising driver intervenes before a crash, the outcome the system would have produced on its own is never observed. The disengagement data tells us how often drivers felt the need to take over, which correlates with safety but does not equal it. Tesla’s Austin Robotaxi launch in 2026 represents the first real-world accumulation of unsupervised safety data, but weeks of data across tens of vehicles is too small a sample for meaningful comparison to Waymo’s 30M-plus miles.
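The gap between disengagements and crashes can be made concrete with a sensitivity sketch. It assumes, purely for illustration, that some unknown fraction of critical disengagements would have ended in a crash had no driver been present, and that no crashes occur without a preceding disengagement. The 40,000-mile figure is the midpoint of the estimated 30,000-50,000 range above; the candidate fractions are hypothetical.

```python
def implied_miles_per_crash(miles_per_disengagement: float,
                            crash_fraction: float) -> float:
    """Miles per crash implied if `crash_fraction` of disengagements
    would have become crashes without a driver (an unobserved quantity)."""
    if not 0 < crash_fraction <= 1:
        raise ValueError("crash_fraction must be in (0, 1]")
    return miles_per_disengagement / crash_fraction


MILES_PER_DISENGAGEMENT = 40_000  # midpoint of the article's (est.) range

# HYPOTHETICAL fractions: the true value is not observable in supervised data.
for fraction in (0.01, 0.10, 0.50):
    miles = implied_miles_per_crash(MILES_PER_DISENGAGEMENT, fraction)
    print(f"crash fraction {fraction:.0%}: about {miles:,.0f} miles per crash")
```

The same disengagement rate maps to implied crash intervals that differ by a factor of 50 across these assumptions, which is why supervised data alone cannot settle the driverless safety question.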


Section 3 — NHTSA Regulatory Events Timeline

| Date | Event | Outcome | Impact |
|---|---|---|---|
| 2021 | NHTSA opens Autopilot investigation into crashes with emergency vehicles | Identified pattern; Tesla deployed OTA update to address emergency vehicle detection | OTA resolution; investigation closed; precedent set for NHTSA OTA resolution |
| 2022 | NHTSA expands Autopilot investigation to 830,000 vehicles | Tesla recalls Autopilot software via OTA update; NHTSA closes investigation | Large “recall” by vehicle count, delivered entirely OTA; no physical action required by owners |
| 2023 | NHTSA investigates FSD beta phantom braking (unexpected deceleration) | Tesla OTA update; investigation closed | Phantom braking = significant safety concern in real traffic; addressed by software update |
| 2023 | Cruise pedestrian dragging incident (NOT Tesla/Waymo) | Cruise suspended CA driverless operations; CA DMV suspended Cruise’s driverless permits | Damaged entire AV industry’s public trust; accelerated NHTSA scrutiny of all AV operators |
| 2024 | NHTSA investigates FSD behavior in sun glare conditions | Investigation ongoing as of mid-2026 (est.) | Represents a known hard sensor case for camera-only systems |
| 2024-2025 | NHTSA monitors Tesla Austin Robotaxi launch under Standing General Order | Tesla required to report incidents; standard for all AV commercial operators | No recall actions in Austin Robotaxi as of mid-2026 (est.) |
| 2026 | Waymo CA DMV annual incident report | Continued pattern of low at-fault incidents; human-caused crashes dominate incident reports | Supports Waymo’s safety case; annual public disclosure maintains accountability |

The NHTSA Investigation Lifecycle

NHTSA investigations are frequently reported as safety failures, but the investigation lifecycle tells a more nuanced story. An NHTSA investigation is opened based on a pattern of reported incidents — it does not imply that a defect has been found or that the system is unsafe. The majority of Tesla AV-related investigations have closed with OTA software updates rather than physical recalls, which is a structurally positive outcome: the safety issue was identified, addressed, and deployed to the entire fleet without owners needing to visit a dealer.

The 2022 Autopilot recall covered 830,000 vehicles via OTA. That is large by vehicle count, although conventional recalls covering many millions of vehicles mean it was not the largest in automotive history. The fact that it required no physical action from owners and was completed in days rather than years represents a new model for automotive safety remediation. This OTA resolution model is both a Tesla competitive advantage and a template that regulators are still developing frameworks around.

The Cruise October 2023 incident deserves separate treatment because it is frequently cited alongside Waymo and Tesla incidents without noting that Cruise is a different company and a different system. In that incident, a pedestrian who had already been struck by a human driver was subsequently dragged by a Cruise AV that failed to recognize the pedestrian under the vehicle. It was a genuine software failure with serious consequences. The CA DMV suspended Cruise’s driverless permits and Cruise suspended operations. This incident had no equivalent at Waymo or Tesla, but it materially damaged public and regulatory confidence in the AV sector broadly.


Section 4 — How to Read AV Safety Statistics: Methodology Matters

| Issue | What it means | Why it matters |
|---|---|---|
| Operational design domain bias | Waymo operates in mapped urban areas in good weather; human baseline includes all conditions (ice, fog, rural, impaired drivers) | Waymo’s operational domain is inherently safer than the average human driving environment; 6.8x may be overstated if compared to similar-condition human drivers |
| Sample size for rare events | A fleet driving 30M miles at the human average of about 1 fatality per 100M miles would expect about 0.3 fatalities; zero observed is consistent with that rate and is not proof of a large safety margin | Absence of a rare event in a small sample is weak evidence of being safer by a large margin; need 1B-plus miles for statistical confidence on fatalities |
| Supervised vs driverless comparison | Tesla’s safety data is supervised FSD (driver present); Waymo’s driverless data has no driver; comparing the two directly is a category error | The safety question for robotaxi is unsupervised driverless, not supervised FSD; Tesla’s supervised data is not a proxy for its driverless safety |
| At-fault vs any crash | Many “AV crashes” reported in media are crashes where a human driver hit the AV; at-fault rate is more meaningful than incident rate | Public reporting often does not distinguish; Waymo is frequently hit by human drivers in dense urban areas |
| Disengagement as safety proxy | Low disengagement rate = drivers rarely felt unsafe; does not directly measure crash probability without a driver | The critical question is: what would have happened if no driver were present? Disengagement data cannot answer this directly |
| Publication bias | Companies publish favorable data; unfavorable incidents are disclosed under regulatory requirement (NHTSA, CA DMV), not voluntarily | Safety reports should be read alongside mandatory regulatory disclosures, not as standalone marketing |
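The at-fault versus any-crash distinction is easy to sketch in code. The example below separates the two rates for a toy incident log. The `Incident` record, the four entries, and the fleet mileage are hypothetical and only illustrate the bookkeeping; they are not drawn from CA DMV or NHTSA data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    description: str
    av_at_fault: bool  # hypothetical field; real reports need case-by-case review


def per_million_miles(count: int, miles: float) -> float:
    """Incidents per one million miles of exposure."""
    return count / miles * 1_000_000


# HYPOTHETICAL incident log and mileage.
incidents = [
    Incident("rear-ended while stopped at a light", av_at_fault=False),
    Incident("struck by a red-light runner", av_at_fault=False),
    Incident("AV clipped a parked vehicle", av_at_fault=True),
    Incident("sideswiped by a lane-changing car", av_at_fault=False),
]
fleet_miles = 2_000_000

any_crash_rate = per_million_miles(len(incidents), fleet_miles)
at_fault_rate = per_million_miles(
    sum(1 for incident in incidents if incident.av_at_fault), fleet_miles
)

print(f"any-crash rate: {any_crash_rate:.2f} per million miles")
print(f"at-fault rate:  {at_fault_rate:.2f} per million miles")
```

A headline built on the first number and one built on the second describe the same fleet very differently, so a fair comparison has to apply the same definition to both the AV and the human baseline.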

The Statistical Honesty Required for Fair AV Safety Assessment

The most important methodological point in AV safety benchmarking is the difference between absence of evidence and evidence of absence. Waymo’s zero at-fault fatalities in 30M-plus commercial driverless miles is a meaningful positive result, but it is not statistical proof that Waymo is dramatically safer than human drivers on fatalities. At the US human baseline of approximately 1 fatality per 100M miles, Waymo’s 30M-mile fleet would statistically expect 0.3 fatalities — and zero observed is consistent with that expectation. The result would be more statistically meaningful at 1B miles, and decisive at 10B miles.
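The arithmetic behind this point can be checked with a simple Poisson model. The sketch assumes fatalities arrive independently at a constant per-mile rate, which is a modeling assumption rather than something the source data establishes, and it uses the article's approximate figures of 1 fatality per 100M miles and 30M fleet miles.

```python
import math

HUMAN_FATALITY_RATE = 1 / 100_000_000  # per mile, the article's approximate baseline
FLEET_MILES = 30_000_000               # the article's 30M-plus figure, rounded down

# Expected fatalities if the fleet were exactly as safe as the human baseline.
expected_fatalities = FLEET_MILES * HUMAN_FATALITY_RATE

# Poisson probability of observing zero events at that expectation.
p_zero_at_human_rate = math.exp(-expected_fatalities)

# Exact one-sided 95% upper bound on the rate given zero observed events:
# the rate at which seeing zero events would have only a 5% probability.
upper_bound_per_mile = -math.log(0.05) / FLEET_MILES
upper_bound_per_100m = upper_bound_per_mile * 100_000_000

print(f"expected fatalities at human rate: {expected_fatalities:.2f}")
print(f"P(zero fatalities | human rate):   {p_zero_at_human_rate:.2f}")
print(f"95% upper bound on rate:           {upper_bound_per_100m:.1f} per 100M miles")
```

Under these assumptions a fleet that is exactly as safe as human drivers would record zero fatalities roughly three times out of four, and the 95% upper bound on the underlying rate is about ten times the human baseline. Zero observed fatalities at this mileage is therefore compatible with a wide range of true rates.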

This is not a criticism of Waymo’s safety record — zero at-fault fatalities in commercial driverless operations is an extraordinary achievement. The point is epistemic: extraordinary claims (6.8x safer, zero fatalities) require the statistical sample to support the confidence level of the claim. The AV industry is still in the data-accumulation phase, and honest safety benchmarking requires acknowledging this limitation alongside celebrating the genuine progress.
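The same Poisson reasoning gives a rough sense of how many miles are needed. The sketch below asks how many fatality-free miles it would take for the 95% upper bound on the fatality rate to fall below a target. Applying the 6.8x factor to fatalities is an illustration only, since the published figure concerns injury crashes, and the function name is introduced here for the example.

```python
import math


def fatality_free_miles_needed(target_rate_per_mile: float,
                               confidence: float = 0.95) -> float:
    """Miles with zero fatalities needed so that the one-sided upper
    confidence bound on the per-mile rate equals `target_rate_per_mile`.
    Assumes a constant-rate Poisson process."""
    if target_rate_per_mile <= 0 or not 0 < confidence < 1:
        raise ValueError("rate must be positive and confidence in (0, 1)")
    return -math.log(1 - confidence) / target_rate_per_mile


HUMAN_RATE = 1 / 100_000_000  # per mile, the article's approximate baseline

miles_to_match_human = fatality_free_miles_needed(HUMAN_RATE)
# ILLUSTRATION ONLY: borrows the 6.8x injury-crash factor for fatalities.
miles_to_show_6_8x = fatality_free_miles_needed(HUMAN_RATE / 6.8)

print(f"bound below human rate:      {miles_to_match_human / 1e6:,.0f}M miles")
print(f"bound below 1/6.8 of human:  {miles_to_show_6_8x / 1e9:.2f}B miles")
```

On these assumptions, roughly 300M fatality-free miles are needed just to bound the rate below the human baseline, and about 2B miles to bound it below a 6.8x improvement. Any observed fatality would push the requirement higher.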


Section 5 — Safety Benchmark Scorecard

| Dimension | Waymo | Tesla FSD | Evidence quality | Notes |
|---|---|---|---|---|
| At-fault fatal crash rate | Zero in 30M-plus commercial driverless miles (Waymo disclosed) | Not applicable (supervised; supervising driver present) | Moderate: sample too small for statistical confidence | Waymo’s zero fatalities is notable but statistically expected at this mileage; not proof of human-level safety |
| Peer-reviewed safety study | 6.8x fewer injury crashes (2023, peer-reviewed, Waymo-funded) | No equivalent peer-reviewed driverless study | Moderate: methodology has operational design domain caveat | Best available peer-reviewed analysis of Waymo safety, though Waymo-funded; caveats important |
| Regulatory record (NHTSA) | No recall actions; CA DMV permit maintained continuously | Multiple investigations; OTA resolutions; no hard recalls | Even: different systems and products | NHTSA investigations are routine for any technology at scale; OTA resolution is a positive outcome |
| Unsupervised commercial safety data | 4-plus years, 30M-plus miles across 4 cities | Weeks, tens of vehicles, Austin only (2026) | Waymo decisive (volume) | Tesla’s sample will grow; meaningful comparison in 2027-2028 |
| Trend direction | Declining incident rate as markets mature (Phoenix best) | Declining disengagement rate annually (est.) | Both positive | Both improving; Waymo’s improvement is in commercial driverless context |

Overall Verdict

Waymo has the most rigorous published AV safety record in the world: zero at-fault fatalities in 30M-plus driverless commercial miles, a peer-reviewed study reporting a 6.8x safety advantage (with methodology caveats), and 4-plus years of incident disclosure to CA DMV. Tesla FSD has a strong supervised safety record with declining disengagement rates, but its unsupervised commercial safety record is at an early stage as of 2026, with too little data for comparison.

The honest conclusion: Waymo has demonstrated meaningful safety in its operational design domain; Tesla has demonstrated improving supervised safety; both need more unsupervised driverless miles to make statistically robust claims about human-level safety parity. The next 24 months — as Tesla’s Austin Robotaxi operation accumulates commercial driverless miles — will be the most important period yet for closing the comparison gap.


Note: All figures labeled “(est.)” are derived from public disclosures, industry research, analyst estimates, and reported data as of mid-2026. This article does not constitute investment advice.

