2026-06-18

Physical AI Safety Record — Waymo 6.8x Safer Claim, Tesla FSD Incidents, NHTSA Investigations, and What the Data Proves

Waymo claims 6.8x fewer injury crashes across 30M-plus driverless miles. Tesla FSD disengagement rate falls annually. Both need more unsupervised miles.

Safety is the central commercial and regulatory case for autonomous vehicles. Waymo has published peer-reviewed studies claiming 6.8x fewer injury-causing crashes than human drivers. Tesla publishes quarterly safety reports on FSD miles per intervention. NHTSA has opened and closed multiple investigations into both Autopilot and FSD. Yet the public narrative around AV safety is frequently shaped by headlines that do not distinguish between at-fault and not-at-fault crashes, supervised and unsupervised driving, or statistical significance and sample size.

This article is Article 151 in the Physical AI Benchmark Series. It benchmarks the published safety data, the methodology behind each claim, what the numbers actually prove versus suggest, and the key regulatory events that have shaped the public safety narrative. All figures labeled “(est.)” are derived from public disclosures, industry research, analyst estimates, and reported data rather than independently verified primary data. This article does not constitute investment advice.

Section 1 — Waymo’s Published Safety Record

Study / metric	Result	Methodology	Limitation
Waymo 2023 peer-reviewed study (Nature partner journal)	6.8x fewer injury-causing crashes than human drivers in comparable conditions (Waymo disclosed; published in a peer-reviewed journal)	Compared Waymo’s San Francisco driverless operations with human-baseline crash rates from NHTSA and CA DMV data; adjusted for exposure (miles driven, road type, time of day)	Comparison baseline is aggregate human driving, not matched on identical routes; Waymo’s operational design domain (good weather, mapped urban areas) is safer than average human driving conditions
Waymo fatal crash record	Zero AV-at-fault fatalities in commercial driverless operations as of mid-2026 (Waymo disclosed)	Measured across 30M-plus driverless commercial miles in SF, Phoenix, LA, Austin	Small sample: 30M miles is large for AV but small for statistical significance on rare fatality events (US human average approximately 1 fatality per 100M miles)
Property damage crashes	Waymo reported reduced property-damage crashes vs human baseline; 2023 study: 2.1x fewer reportable crashes overall	Same peer-reviewed methodology	Same operational design domain caveat
San Francisco incidents (2022-2023)	Multiple Waymo vehicles involved in crashes caused by human drivers running red lights or rear-ending stationary Waymo vehicles; Waymo’s AV was not at fault	Most SF incidents involved human driver errors, not Waymo AV errors	Public perception of “Waymo crashes” does not distinguish at-fault vs not-at-fault incidents
Waymo vs Cruise comparison	Cruise (GM-backed AV) had a serious pedestrian dragging incident in Oct 2023 (pedestrian knocked down by human-driven vehicle, then Cruise vehicle pulled forward, dragging the pedestrian); Cruise suspended operations subsequently	Different incident; Cruise vehicle had a software decision error	Cruise incident set back the AV industry broadly; Waymo has not had a comparable incident
Waymo safety report cadence	Waymo publishes annual safety reports plus incident data to CA DMV (mandatory for CA driverless permits)	CA DMV incident database is public and searchable	CA DMV data covers CA operations only; no comparable mandatory disclosure in other states

Reading Waymo’s Safety Numbers

The 6.8x figure is the most-cited data point in AV safety discourse, and it is important to understand both what it demonstrates and what it does not. The study was peer-reviewed and published in a Nature partner journal — this is a meaningful bar. The methodology adjusted for exposure by road type and time of day, which is more rigorous than a simple miles-driven comparison. The result is the strongest single piece of published evidence that a commercial AV operator has achieved meaningful safety advantages over aggregate human driving.

The limitation is structural, not methodological: Waymo operates in a self-selected operational design domain. The good-weather, well-mapped, urban San Francisco environment is inherently safer than the full range of human driving conditions (ice, fog, rural roads, impaired drivers, school zones). A more conservative reading is that the 6.8x figure compares Waymo with average human drivers who face a much wider range of conditions. That is meaningful, but it is not the same as being 6.8x safer than a human driving under identical conditions.
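The baseline effect described above can be illustrated with a small calculation. The sketch below is not Waymo’s methodology: every rate and exposure share in it is a made-up number chosen only to show the mechanism, and the condition names are hypothetical.

```python
# Illustrative only: all rates and exposure shares below are HYPOTHETICAL,
# not figures from Waymo, NHTSA, or the study discussed above.

def blended_rate(rates: dict[str, float], exposure_share: dict[str, float]) -> float:
    """Exposure-weighted crash rate (crashes per million miles)."""
    total_share = sum(exposure_share.values())
    weighted = sum(rates[cond] * share for cond, share in exposure_share.items())
    return weighted / total_share


# Hypothetical human injury-crash rates per million miles, by driving condition.
HUMAN_RATES = {"inside_odd": 2.0, "adverse_weather": 4.0, "other_outside_odd": 6.0}

# Hypothetical share of human miles driven in each condition.
ALL_DRIVING_MIX = {"inside_odd": 0.5, "adverse_weather": 0.2, "other_outside_odd": 0.3}
ODD_ONLY_MIX = {"inside_odd": 1.0}

AV_RATE = 0.5  # hypothetical AV injury-crash rate per million miles, inside its ODD

aggregate_baseline = blended_rate(HUMAN_RATES, ALL_DRIVING_MIX)
matched_baseline = blended_rate(HUMAN_RATES, ODD_ONLY_MIX)

print(f"vs aggregate baseline:   {aggregate_baseline / AV_RATE:.1f}x")
print(f"vs ODD-matched baseline: {matched_baseline / AV_RATE:.1f}x")
```

With these invented inputs the same AV crash rate looks about 7.2x better against the aggregate baseline but 4.0x better against a baseline restricted to the operational design domain. The direction of the effect is the point; the magnitudes are not estimates of anything.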

Section 2 — Tesla FSD Safety Data

Metric	Result	Methodology	Limitation
Tesla quarterly safety report	Tesla publishes quarterly: miles per critical disengagement (driver-initiated); miles per ADAS intervention	Supervised FSD: every driver takeover is logged; Tesla reports aggregate fleet statistics	Supervised only: a human driver is present and responsible in all reported miles; does not reflect unsupervised driverless safety
Tesla Q1 2026 safety report (est.)	Approximately 1 critical disengagement per 30,000-50,000 miles (est., based on disclosed trend)	Disengagement = driver takeover due to safety concern; does not include routine driver-preference takeovers	Disengagement rate is an indirect safety proxy; a low rate means drivers rarely felt unsafe, not that crashes would not occur without a driver
FSD-related crashes (NHTSA SCI)	NHTSA Standing General Order: Tesla has reported FSD-related crashes; NHTSA has opened multiple investigations	NHTSA SCI (Special Crash Investigation) program tracks advanced driver assistance system crashes	All reported crashes involved supervised FSD with a driver present; most resulted in minor damage
FSD vs Autopilot fatalities	Tesla has reported Autopilot-active fatalities; these involve Autopilot (lane-keeping assist), not FSD (the more advanced product)	Tesla reports Autopilot crash data as required by NHTSA	Autopilot and FSD are distinct systems; Autopilot is less capable; conflating them overstates FSD risk
NHTSA FSD investigations	NHTSA opened investigations into FSD behavior in specific scenarios (emergency vehicle detection, sun glare, toll booths); several closed without recall; some led to OTA software updates	Investigation does not imply defect; OTA resolution is industry-leading for scope	Multiple open investigations at any time create ongoing regulatory uncertainty
Tesla Robotaxi (unsupervised) safety record	Austin Robotaxi launch 2026: limited data; no fatalities disclosed; small sample size (tens of vehicles, weeks of operation)	Unsupervised commercial rides; no safety driver; small sample	Too early to draw statistical conclusions; most meaningful comparison point will emerge in 12-24 months

Understanding the Supervised vs Driverless Distinction

Tesla’s quarterly safety reports are among the most detailed in the industry, and the declining disengagement rate is a genuine positive signal about supervised FSD maturity. However, the critical distinction for safety benchmarking is that supervised FSD data — where a licensed driver is present and legally required to monitor the system — is not a proxy for unsupervised driverless safety.

The reason is statistical as well as technical: when a supervising driver intervenes before a crash, what the system would have done without intervention is never observed. Disengagement data tells us how often drivers felt the need to take over, which correlates with safety but does not equal it. Tesla’s Austin Robotaxi launch in 2026 represents its first real-world accumulation of unsupervised safety data, but weeks of data across tens of vehicles is too small a sample for meaningful comparison with Waymo’s 30M-plus miles.
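A short sensitivity sketch shows why a disengagement rate cannot be converted into a driverless crash rate without an extra, unobserved quantity. The function name and the `crash_fraction` values are hypothetical; the only figures taken from this article are the estimated 30,000-50,000 miles per critical disengagement. The sketch also ignores crashes that occur with no disengagement at all, so it is a rough bound on the reasoning, not a model of FSD.

```python
def implied_miles_per_crash(miles_per_disengagement: float, crash_fraction: float) -> float:
    """Miles per crash IF `crash_fraction` of disengagements would have become
    crashes with no driver present. `crash_fraction` is not observable from
    disengagement logs; it is an assumption supplied by the analyst."""
    if not 0.0 < crash_fraction <= 1.0:
        raise ValueError("crash_fraction must be in (0, 1]")
    return miles_per_disengagement / crash_fraction


# Range of the article's estimate; the fractions below are purely illustrative.
for miles in (30_000, 50_000):
    for fraction in (0.01, 0.10, 0.50):
        implied = implied_miles_per_crash(miles, fraction)
        print(f"{miles:>6,} mi/disengagement, {fraction:>4.0%} would crash "
              f"-> about {implied:,.0f} miles per crash")
```

Across these assumed fractions the implied miles per crash spans nearly two orders of magnitude (roughly 60,000 to 5,000,000), which is why only unsupervised miles can settle the question.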

Section 3 — NHTSA Regulatory Events Timeline

Date	Event	Outcome	Impact
2021	NHTSA opens Autopilot investigation into crashes with emergency vehicles	Identified pattern; Tesla deployed OTA update to address emergency vehicle detection	Investigation closed; precedent set for resolving NHTSA investigations via OTA update
2022	NHTSA expands Autopilot investigation to 830,000 vehicles	Tesla recalls Autopilot software via OTA update; NHTSA closes investigation	Very large “recall” by vehicle count; all OTA; no physical action required by owners
2023	NHTSA investigates FSD beta phantom braking (unexpected deceleration)	Tesla OTA update; investigation closed	Phantom braking = significant safety concern in real traffic; addressed by software update
2023	Cruise pedestrian dragging incident (NOT Tesla/Waymo)	Cruise suspended CA driverless operations; CA DMV revoked Cruise permit	Damaged entire AV industry’s public trust; accelerated NHTSA scrutiny of all AV operators
2024	NHTSA investigates FSD behavior in sun glare conditions	Investigation ongoing as of mid-2026 (est.)	Represents a known hard sensor case for camera-only systems
2024-2025	NHTSA monitors Tesla Austin Robotaxi launch under Standing General Order	Tesla required to report incidents; standard for all AV commercial operators	No recall actions in Austin Robotaxi as of mid-2026 (est.)
2026	Waymo CA DMV annual incident report	Continued pattern of low at-fault incidents; human-caused crashes dominate incident reports	Supports Waymo’s safety case; annual public disclosure maintains accountability

The NHTSA Investigation Lifecycle

NHTSA investigations are frequently reported as safety failures, but the investigation lifecycle tells a more nuanced story. An NHTSA investigation is opened based on a pattern of reported incidents — it does not imply that a defect has been found or that the system is unsafe. The majority of Tesla AV-related investigations have closed with OTA software updates rather than physical recalls, which is a structurally positive outcome: the safety issue was identified, addressed, and deployed to the entire fleet without owners needing to visit a dealer.

The 2022 Autopilot recall, which addressed 830,000 vehicles via OTA, was very large by vehicle count. The fact that it required no physical action from owners and was completed in days rather than years represents a new model for automotive safety remediation. This OTA resolution model is both a Tesla competitive advantage and a template that regulators are still developing frameworks around.

The Cruise October 2023 incident deserves separate treatment because it is frequently cited alongside Waymo and Tesla incidents without noting that Cruise is a different company and a different system. The Cruise incident — in which a pedestrian who had already been struck by a human driver was subsequently dragged by a Cruise AV that failed to recognize the pedestrian under the vehicle — was a genuine software failure with serious consequences. Cruise suspended operations and the CA DMV revoked its permit. This incident had no equivalent at Waymo or Tesla, but it materially damaged public and regulatory confidence in the AV sector broadly.

Section 4 — How to Read AV Safety Statistics: Methodology Matters

Issue	What it means	Why it matters
Operational design domain bias	Waymo operates in mapped urban areas in good weather; human baseline includes all conditions (ice, fog, rural, impaired drivers)	Waymo’s operational domain is inherently safer than the average human driving environment; 6.8x may be overstated if compared to similar-condition human drivers
Sample size for rare events	At 30M miles, about 0.3 fatalities would be expected at the human average of roughly 1 per 100M miles; zero observed is consistent with that baseline and is not proof of a large safety margin	Absence of a rare event in a small sample is weak evidence of a large safety advantage; 1B-plus miles are needed for statistical confidence on fatalities
Supervised vs driverless comparison	Tesla’s safety data is supervised FSD (driver present); Waymo’s driverless data has no driver; comparing the two directly is a category error	The safety question for robotaxi is unsupervised driverless, not supervised FSD; Tesla’s supervised data is not a proxy for its driverless safety
At-fault vs any crash	Many “AV crashes” reported in media are crashes where a human driver hit the AV; at-fault rate is more meaningful than incident rate	Public reporting often does not distinguish; Waymo is frequently hit by human drivers in dense urban areas
Disengagement as safety proxy	Low disengagement rate = drivers rarely felt unsafe; does not directly measure crash probability without a driver	The critical question is: what would have happened if no driver were present? Disengagement data cannot answer this directly
Publication bias	Companies publish favorable data; unfavorable incidents are disclosed under regulatory requirement (NHTSA, CA DMV), not voluntarily	Safety reports should be read alongside mandatory regulatory disclosures, not as standalone marketing
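The sample-size row in the table above can be made concrete with a zero-event confidence bound. This is a minimal sketch assuming fatalities follow a Poisson process with a constant per-mile rate, which is a simplification; the function names are hypothetical, and the 1-per-100M-mile baseline and 30M-mile figure are the approximate values used in this article.

```python
import math


def zero_event_upper_bound(miles: float, confidence: float = 0.95) -> float:
    """Upper confidence bound on the per-mile event rate when ZERO events were
    observed over `miles` (exact Poisson bound: -ln(1 - confidence) / miles)."""
    return -math.log(1.0 - confidence) / miles


def event_free_miles_needed(target_rate_per_mile: float, confidence: float = 0.95) -> float:
    """Event-free miles required before the upper bound drops to the target rate."""
    return -math.log(1.0 - confidence) / target_rate_per_mile


HUMAN_RATE = 1 / 100_000_000  # approx. baseline used in this article (per mile)

bound = zero_event_upper_bound(30_000_000)
print(f"95% upper bound after 30M clean miles: {bound * 1e8:.1f} per 100M miles")

for factor in (1.0, 2.0, 6.8):
    needed = event_free_miles_needed(HUMAN_RATE / factor)
    print(f"to show rate below baseline/{factor:g}: {needed / 1e6:,.0f}M event-free miles")
```

Under these assumptions, 30M fatality-free miles only rule out rates above roughly 10 per 100M miles, about ten times the human baseline. Showing a rate below the baseline takes about 300M event-free miles, and showing a 6.8x advantage on fatalities takes about 2B, which is consistent with the 1B-plus order of magnitude cited in the table.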

The Statistical Honesty Required for Fair AV Safety Assessment

The most important methodological point in AV safety benchmarking is the difference between absence of evidence and evidence of absence. Waymo’s zero at-fault fatalities in 30M-plus commercial driverless miles is a meaningful positive result, but it is not statistical proof that Waymo is dramatically safer than human drivers on fatalities. At the US human baseline of approximately 1 fatality per 100M miles, Waymo’s 30M-mile fleet would statistically expect 0.3 fatalities — and zero observed is consistent with that expectation. The result would be more statistically meaningful at 1B miles, and decisive at 10B miles.
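The arithmetic in the paragraph above can be checked directly. The sketch below assumes a Poisson model for fatalities, which is a common simplification and not something the cited data establishes; the function names are hypothetical, and the "2x safer" and "6.8x safer" rows are illustrative scenarios, not measured fatality rates.

```python
import math

HUMAN_FATALITY_RATE = 1 / 100_000_000  # approx. US baseline used in this article (per mile)
DRIVERLESS_MILES = 30_000_000          # driverless miles cited in this article


def expected_events(rate_per_mile: float, miles: float) -> float:
    """Expected number of events at a constant per-mile rate."""
    return rate_per_mile * miles


def prob_zero_events(rate_per_mile: float, miles: float) -> float:
    """Probability of observing zero events under a Poisson model: exp(-expected)."""
    return math.exp(-expected_events(rate_per_mile, miles))


for label, factor in (("human baseline", 1.0), ("2x safer", 2.0), ("6.8x safer", 6.8)):
    rate = HUMAN_FATALITY_RATE / factor
    print(f"{label:>14}: expected = {expected_events(rate, DRIVERLESS_MILES):.3f}, "
          f"P(zero fatalities) = {prob_zero_events(rate, DRIVERLESS_MILES):.2f}")
```

Under this model, zero fatalities in 30M miles has a probability of about 0.74 at the human baseline and about 0.96 if the fleet were 6.8x safer. Both scenarios make the observed outcome likely, so the outcome does little to distinguish between them.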

This is not a criticism of Waymo’s safety record — zero at-fault fatalities in commercial driverless operations is an extraordinary achievement. The point is epistemic: extraordinary claims (6.8x safer, zero fatalities) require the statistical sample to support the confidence level of the claim. The AV industry is still in the data-accumulation phase, and honest safety benchmarking requires acknowledging this limitation alongside celebrating the genuine progress.

Section 5 — Safety Benchmark Scorecard

Dimension	Waymo	Tesla FSD	Evidence quality	Notes
At-fault fatal crash rate	Zero in 30M-plus commercial driverless miles (Waymo disclosed)	Not applicable (supervised; driver present)	Moderate: sample too small for statistical confidence	Waymo’s zero fatalities is encouraging but statistically expected at this mileage; not proof of better-than-human fatality safety
Peer-reviewed safety study	6.8x fewer injury crashes (2023, peer-reviewed, Waymo-funded)	No equivalent peer-reviewed driverless study	Moderate: methodology has operational design domain caveat	Best available published analysis of Waymo safety; Waymo-funded, so the caveats are important
Regulatory record (NHTSA)	No recall actions; CA DMV permit maintained continuously	Multiple investigations; OTA resolutions; no hard recalls	Even: different systems and products	NHTSA investigations are routine for any technology at scale; OTA resolution is a positive outcome
Unsupervised commercial safety data	4-plus years, 30M-plus miles across 4 cities	Weeks, tens of vehicles, Austin only (2026)	Waymo decisive (volume)	Tesla’s sample will grow; meaningful comparison in 2027-2028
Trend direction	Declining incident rate as markets mature (Phoenix best)	Declining disengagement rate annually (est.)	Both positive	Both improving; Waymo’s improvement is in commercial driverless context

Overall Verdict

Waymo has the most rigorous published AV safety record in the world: zero at-fault fatalities in 30M-plus driverless commercial miles, a peer-reviewed 6.8x safety advantage study (with methodology caveats), and 4-plus years of incident disclosure to CA DMV. Tesla FSD has a strong supervised safety record with declining disengagement rates, but its unsupervised commercial record, which began in 2026, is at too early a stage and has too little data for comparison.

The honest conclusion: Waymo has demonstrated meaningful safety in its operational design domain; Tesla has demonstrated improving supervised safety; both need more unsupervised driverless miles to make statistically robust claims about human-level safety parity. The next 24 months — as Tesla’s Austin Robotaxi operation accumulates commercial driverless miles — will be the most important period yet for closing the comparison gap.

Note: All figures labeled “(est.)” are derived from public disclosures, industry research, analyst estimates, and reported data as of mid-2026. This article does not constitute investment advice.