2026-06-18
Physical AI VRU Safety 2026 — Waymo Multi-Sensor Detection vs Tesla FSD Camera-Only Night Performance: The AV Safety Benchmark
Waymo LIDAR detects pedestrians at night as well as noon. Tesla FSD camera-only uses headlights and neural nets. Night VRU safety is the key battleground.
Article 197 in the Physical AI Benchmark Series — Vulnerable Road User Safety
Pedestrians, cyclists, and other vulnerable road users (VRUs) are the highest-stakes safety problem in autonomous vehicle design. A 2-ton autonomous vehicle striking a pedestrian at 25 mph carries a substantial risk of severe or fatal injury to the VRU, and that risk rises steeply with speed. The architectural question at the heart of this benchmark is therefore not merely a performance question but a safety architecture question: does a multi-sensor approach with active LIDAR illumination provide structural safety advantages over a camera-only neural network approach in the exact conditions (darkness, rain, occlusion) where VRU collisions are most likely to occur?
Waymo’s multi-sensor fusion stack (LIDAR + radar + camera) and Tesla FSD’s camera-only neural network represent the two dominant AV design philosophies at commercial scale in 2026. This benchmark systematically compares their VRU detection capabilities across five dimensions: VRU category risk, multi-sensor vs camera-only detection, night performance (the critical battleground), cyclist prediction, and an overall safety scorecard.
Section 1 — Why VRU Safety Is the Highest-Stakes AV Challenge
VRUs (pedestrians, cyclists, motorcyclists, children, wheelchair users, e-scooter and e-bike riders) occupy the most vulnerable position in the traffic ecosystem. Unlike vehicle-to-vehicle collisions, where crumple zones, seatbelts, and airbags distribute crash energy across two protected occupant compartments, a VRU collision exposes a human body to the full kinetic energy of a 2-ton vehicle with no protective structure of its own. At 25 mph, a pedestrian impact carries approximately 6 times the kinetic energy of a 10 mph impact (kinetic energy scales with the square of speed, so the ratio is (25/10)² = 6.25), a difference that can separate a moderate injury from a fatal one.
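Because kinetic energy scales with the square of speed, the energy ratio between two impact speeds can be checked in a few lines. The sketch below is illustrative only: the 2,000 kg mass is an assumed round figure for a 2-ton vehicle, and the function names are hypothetical.

```python
MPH_TO_MS = 0.44704  # conversion factor, miles per hour to metres per second


def kinetic_energy_joules(mass_kg: float, speed_mph: float) -> float:
    """Kinetic energy E = 1/2 * m * v^2, with the speed given in mph."""
    speed_ms = speed_mph * MPH_TO_MS
    return 0.5 * mass_kg * speed_ms ** 2


def energy_ratio(speed_a_mph: float, speed_b_mph: float) -> float:
    """Ratio of kinetic energies at two speeds; the vehicle mass cancels out."""
    return (speed_a_mph / speed_b_mph) ** 2


if __name__ == "__main__":
    mass = 2000.0  # assumed vehicle mass in kg (illustrative, not a measured value)
    for mph in (10, 25, 35):
        print(f"{mph} mph: {kinetic_energy_joules(mass, mph) / 1000:.1f} kJ")
    print(f"25 mph vs 10 mph: {energy_ratio(25, 10):.2f}x")
```

The ratio depends only on the two speeds, so the same function applies to any vehicle mass.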
AV systems must detect VRUs across a challenging matrix of conditions:
- Low ambient light: pedestrian fatalities in human-driven traffic peak in late evening and nighttime hours (est. 75% of US pedestrian fatalities occur in dark conditions, per NHTSA traffic safety data); any AV system must match or exceed human-driver VRU detection in precisely the conditions humans perform worst
- Adverse weather: rain and fog reduce camera visibility significantly; a VRU in dark clothing in fog at night is a hard camera detection problem
- Long range: at 45 mph (city highway speed), safe braking requires est. 50–70 meters of detection range; at 25 mph in a residential zone, detection range requirements are lower but reaction time is still critical
- Partial occlusion: a child stepping from behind a parked car leaves est. 0.5–1.5 seconds between first possible detection and the point of conflict, which is comparable to or less than a typical human driver’s perception-reaction time; an AV system that can slow proactively at known occlusion points before any VRU is detected has a structural safety advantage
- Unpredictable behavior: children chasing balls into the street, intoxicated pedestrians reversing direction mid-crosswalk, cyclists swerving suddenly — AV systems must predict and yield to VRU behavior that does not follow normal traffic patterns
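The range and occlusion items in the list above come down to simple kinematics: distance travelled during system latency plus distance travelled while braking. The following is a minimal sketch under stated assumptions; the 0.5 s latency and 7 m/s² deceleration defaults are illustrative placeholders, not published figures for any vehicle, and the function names are hypothetical.

```python
MPH_TO_MS = 0.44704  # miles per hour to metres per second


def stopping_distance_m(speed_mph: float,
                        latency_s: float = 0.5,
                        decel_ms2: float = 7.0) -> float:
    """Latency travel plus constant-deceleration braking distance, in metres.

    latency_s and decel_ms2 are assumed values chosen for illustration.
    """
    v = speed_mph * MPH_TO_MS
    return v * latency_s + v ** 2 / (2.0 * decel_ms2)


def time_budget_s(gap_m: float, speed_mph: float) -> float:
    """Seconds until the vehicle reaches a point gap_m ahead at constant speed."""
    return gap_m / (speed_mph * MPH_TO_MS)


if __name__ == "__main__":
    for mph in (25, 35, 45):
        print(f"{mph} mph -> stop in {stopping_distance_m(mph):.1f} m")
    # A VRU emerging 12 m ahead of a vehicle travelling at 25 mph:
    print(f"time budget: {time_budget_s(12.0, 25):.2f} s")
```

Under these particular assumptions the 45 mph stopping distance comes out just under 40 m, so a detection-range requirement of 50–70 m implies additional margin for classification time, wet roads, or gentler braking.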
VRU categories and their specific detection challenges:
| VRU category | Detection challenge | Speed range | Most dangerous scenario |
|---|---|---|---|
| Pedestrians (adults) | Upright bipedal silhouette is the clearest VRU category for detection; challenge is dark clothing at night | Walking: 1.2–1.8 m/s | Crossing mid-block at night in dark clothing |
| Children | Shorter stature reduces detection range for camera systems; faster/less predictable movement | Walking: 1.0–1.5 m/s; running: 3–5 m/s | Running into street from between parked cars |
| Cyclists | Narrow profile, dynamic speed and direction changes, interaction with traffic lanes | Cycling: 4–8 m/s urban | Sudden lane swerve to avoid obstacle; at-speed intersection crossing |
| Motorcyclists | Narrow radar cross-section (hardest VRU for radar); lane-splitting in CA; high speed | 15–35 m/s urban/highway | Lane-splitting between traffic; sudden braking |
| E-scooter/e-bike riders | Faster than pedestrians, often without lights, sometimes wrong-way in bike lanes; unusual classification for older AV models | 4–8 m/s | Night riding without lights, wrong-way in bike lane |
| Wheelchair users | Low silhouette; may move in roadway when sidewalks are blocked; slower than pedestrians | 0.5–1.5 m/s | Crossing at non-designated point; in roadway due to blocked sidewalk |
The regulatory environment for VRU safety is tightening:
- NHTSA Standing General Order (2021): requires manufacturers and operators of automated driving systems and Level 2 driver-assistance systems to report qualifying crashes to NHTSA, on a short deadline for the most serious crashes (one day under the original order) and monthly for less severe ADS crashes; the reported data is publicly accessible
- California DMV AV incident reporting: requires permitted AV operators to file a collision report within 10 days of any collision involving their AV, and requires testing permit holders to file annual disengagement reports; California is the primary regulatory environment for both Waymo and Tesla AV operations
- ISO 21448 (SOTIF): Safety Of The Intended Functionality specifically catalogs VRU interaction as a Tier 1 test scenario category; SOTIF compliance is becoming a de facto requirement for commercial AV deployment in regulated markets
- EU AV Regulation (expected 2025–2026): places VRU protection as a Tier 1 safety requirement; draft regulation language emphasizes sensor redundancy for driverless operations specifically in VRU-dense urban environments
The public scrutiny context amplifies the stakes: a single AV/pedestrian incident receives disproportionate media coverage relative to equivalent human-driver incidents. One high-profile VRU incident can trigger regulatory review, fleet suspension, and substantial public confidence loss — as demonstrated by the October 2023 Cruise suspension following a pedestrian collision in San Francisco. The VRU safety record is therefore not only a safety metric but an existential commercial risk metric.
Section 2 — Waymo’s VRU Detection: Multi-Sensor Fusion Advantages
| VRU detection dimension | Waymo approach | Details | Safety implication |
|---|---|---|---|
| LIDAR-based VRU detection (day and night) | Active sensor: emits laser pulses; measures time of flight; detection accuracy is independent of ambient light | LIDAR detects pedestrian-shaped objects (upright bipedal silhouette) with centimeter-level spatial resolution at est. 100–300 meters; detection is the same in darkness and full sunlight because LIDAR does not rely on reflected ambient light | Night LIDAR VRU detection is Waymo’s structural safety advantage: at the hours when pedestrian fatalities are highest (late evening/night), LIDAR maintains full detection capability while camera-based systems face their largest performance gap |
| Radar-based VRU velocity measurement | Radar measures Doppler velocity of objects; distinguishes moving pedestrians (Doppler matches human walking speed ~1.4 m/s) from stationary objects even in zero-visibility fog | Radar penetrates rain and fog that obscures cameras; provides VRU velocity even when LIDAR and camera visibility are degraded | Radar VRU detection is particularly valuable in San Francisco’s frequent coastal fog; LIDAR (spatial) + radar (velocity) + camera (visual classification) = three independent VRU detection pathways |
| Camera-based VRU visual classification | Cameras provide semantic VRU information: body pose (facing toward or away?), pedestrian intent (looking at phone vs making eye contact?), cyclist hand signals, child vs adult recognition, wheelchair classification | Camera provides behavioral context that LIDAR point clouds and radar Doppler cannot: a pedestrian at a crosswalk looking at their phone vs one making eye contact with the driver is behaviorally different even if spatially identical | Camera is the VRU behavioral intent layer; LIDAR is the precise spatial position layer; radar is the velocity layer; three-sensor fusion enables reliable detection AND behavioral prediction simultaneously |
| Occlusion handling | Waymo’s HD map provides context for occlusion scenarios: the system knows that parked cars at a specific crosswalk create a partial occlusion zone where a pedestrian could emerge; slows proactively before any VRU is detected | Map-informed occlusion awareness allows Waymo to slow proactively before any sensor detects a VRU — it knows from the map that a pedestrian COULD emerge from behind a parked car at a specific crosswalk location | HD map + LIDAR spatial awareness + behavioral prediction = an occlusion-safety system with multiple independent safeguards; a child running from behind a parked car triggers proactive slowing before the child appears to any sensor |
| Cyclist and micro-mobility prediction | Waymo has trained cyclist behavior prediction models on years of commercial data from SF and Phoenix; prediction estimates cyclist position in 2–5 seconds based on heading, speed, and road context | Cyclist behavior is harder to predict than pedestrian behavior (cyclists move faster and interact with traffic more dynamically); Waymo’s prediction has been trained on real urban cyclist behavior across multiple cities and years | Long training history on real commercial urban cyclists is a meaningful advantage; early AV systems struggled with urban cycling because cyclists were underrepresented in training datasets |
| Safety record (VRU) | Waymo’s NHTSA SGO and CA DMV incident reports show some low-speed incidents; Waymo has published safety reports citing zero life-threatening VRU injuries or fatalities in commercial driverless operations through mid-2026 (as publicly reported) | Full incident database available via NHTSA SGO and California DMV public records; media-cited incidents include a vehicle striking a cyclist’s bike while the cyclist was uninjured, and a vehicle stopping abnormally causing a minor rear-end collision by a human driver | Waymo’s VRU safety record in commercial driverless operations is strong relative to human-driver baseline; however, the fleet is small (est. 2,500+ vehicles, est. 150,000+ rides/week) vs Tesla’s (est. 6M+ vehicles) — statistical comparison requires rate normalization, not absolute counts |
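The "three independent detection pathways" argument in the table above can be made concrete with a simple probability sketch: if each pathway misses independently, the chance that all of them miss is the product of the individual miss probabilities. The per-sensor values below are invented for illustration and are not measured performance figures for any system; real sensor failures are also partly correlated (heavy rain can degrade LIDAR and camera together), so the independence assumption makes this an optimistic estimate.

```python
from math import prod
from typing import Sequence


def miss_probability(detect_probs: Sequence[float]) -> float:
    """Probability that every pathway misses, assuming independent failures."""
    return prod(1.0 - p for p in detect_probs)


def fused_detection_probability(detect_probs: Sequence[float]) -> float:
    """Probability that at least one pathway detects the VRU."""
    return 1.0 - miss_probability(detect_probs)


if __name__ == "__main__":
    # Hypothetical per-sensor detection probabilities for one hard scenario.
    scenario = {"lidar": 0.90, "radar": 0.80, "camera": 0.50}
    fused = fused_detection_probability(list(scenario.values()))
    print(f"three pathways fused: {fused:.3f}")
    print(f"camera alone:         {scenario['camera']:.3f}")
```

The sketch only models whether any pathway detects the VRU; a production fusion stack also has to reconcile conflicting detections and reject false positives.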
Section 3 — Tesla FSD’s Camera-Only VRU Detection
| VRU detection dimension | Tesla approach | Details | Safety implication |
|---|---|---|---|
| Camera-only VRU detection | Tesla FSD relies exclusively on cameras for VRU detection (no LIDAR, no radar in recent Model 3/Y with Tesla Vision); FSD’s neural network must detect all VRU categories from camera input alone, across all lighting and weather conditions | Camera-based VRU detection is harder than LIDAR-based in low-light conditions: cameras need ambient or active light to create contrast between the VRU and the background; a pedestrian in dark clothing at night, illuminated only by vehicle headlights, is a harder detection problem than the same scene in daylight | Night VRU detection is the primary structural limitation of camera-only AV: est. 75% of US pedestrian fatalities occur in dark conditions (NHTSA data); a camera-only AV system must match LIDAR-equivalent VRU detection performance at night through neural network and headlight engineering alone |
| End-to-end VRU learning | Tesla’s end-to-end FSD neural network has been trained on est. 6 billion+ supervised miles of human driving data, including millions of VRU interaction scenarios across diverse geographies, lighting conditions, and weather | Scale advantage: Tesla’s training data includes proportionally more VRU scenario diversity than Waymo’s, simply because the fleet is vastly larger and includes consumer driving across all road types and times of day | Training data scale is a VRU scenario diversity advantage; quality limitation: human driver behavior in VRU scenarios is not always the safety gold standard — training data includes human VRU detection errors as well as correct responses |
| Night VRU detection with active headlights | Tesla uses vehicle headlights to illuminate the road ahead for camera detection; FSD cameras are designed for low-light performance with high-sensitivity image sensors; night FSD performance has improved significantly with each neural network generation | Headlight-illuminated camera detection works well for VRUs in the direct forward headlight cone; challenges remain for VRUs approaching from the periphery (side streets, driveways) and for complex ambient lighting scenarios | Tesla’s night camera performance is significantly better than standard automotive cameras; but active LIDAR illumination illuminates the scene at high spatial resolution in all directions simultaneously, while headlights illuminate primarily the forward cone |
| Cyclist prediction | FSD has been trained on billions of human-driver responses to cyclists across the US consumer fleet; cyclist prediction is an area where FSD has demonstrated strong improvement in consumer deployment | Consumer FSD users have reported both good and poor cyclist handling; Tesla does not publish systematic cyclist interaction performance data | Consumer fleet deployment means FSD encounters vastly more cyclist scenarios per week than Waymo; this scale of cyclist scenario experience is an advantage for prediction model improvement through continuous retraining |
| Known FSD VRU limitations (reported) | NHTSA investigations of Tesla driver-assistance systems have included probes into behavior near emergency vehicles (VRU-proximate environments) and highway construction zones (where workers are VRUs); a February 2023 FSD Beta recall addressed behavior at intersections, such as failing to come to a complete stop at stop signs, where pedestrians are frequently present | Each NHTSA recall or investigation flags a scenario where system behavior was deemed insufficient; Tesla addressed the reported issues through OTA updates, but the pattern of camera-only edge cases leading to recalls, set against Waymo’s multi-sensor redundancy, reflects a structural architecture difference | Camera-only VRU detection requires continuous neural network improvement to address edge cases that LIDAR would handle through independent active sensing |
| Safety record (VRU, consumer FSD) | Tesla files with NHTSA under the Standing General Order for AV crashes; Tesla reports indicate the majority of consumer FSD crashes involve rear-end collisions and lane-change errors, not VRU collisions; VRU-specific rates are not separately broken out in public reports | NHTSA SGO crash database is publicly available; VRU-specific analysis requires filtering by crash type; Tesla’s consumer fleet generates more crashes in absolute terms (vastly more vehicle-miles), but the relevant metric is crashes per million FSD-engaged miles — not separately disclosed | Tesla does not publish FSD-engaged VRU collision rate per million miles; without this rate, direct comparison to Waymo’s driverless VRU record is not methodologically valid |
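The rate-normalization point in the table above can be sketched as follows. The counts and mileages are placeholders, not reported figures for Waymo or Tesla, and the function names are hypothetical. The second function uses the statistical "rule of three": when zero events are observed over n units of exposure, an approximate 95% upper bound on the true rate is 3/n, which shows why a zero-event record on a smaller fleet still leaves a wide uncertainty band.

```python
def rate_per_million_miles(events: int, miles: float) -> float:
    """Event rate normalized to one million miles of exposure."""
    if miles <= 0:
        raise ValueError("miles must be positive")
    return events / miles * 1_000_000


def rule_of_three_upper_bound(miles: float) -> float:
    """Approximate 95% upper bound on the rate per million miles when zero
    events were observed over `miles` of exposure (the 'rule of three')."""
    if miles <= 0:
        raise ValueError("miles must be positive")
    return 3.0 / miles * 1_000_000


if __name__ == "__main__":
    # Placeholder exposure data for two hypothetical fleets.
    small_fleet_miles = 50_000_000       # zero events observed
    large_fleet_events = 40
    large_fleet_miles = 2_000_000_000

    bound = rule_of_three_upper_bound(small_fleet_miles)
    rate = rate_per_million_miles(large_fleet_events, large_fleet_miles)
    print(f"small fleet, zero events: rate below ~{bound:.3f} per million miles")
    print(f"large fleet: {rate:.3f} per million miles")
```

With these placeholder numbers the large fleet has more events in absolute terms, yet its observed rate falls inside the range that the small fleet's zero-event record cannot rule out, which is why absolute counts alone do not support a comparison.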
Section 4 — Night Safety: The Critical VRU Battleground
Night is where the architectural difference between LIDAR-based and camera-only VRU detection has its largest safety implication. NHTSA pedestrian fatality data shows that approximately three-quarters of US pedestrian fatalities occur in dark conditions — making night performance the single most important dimension of VRU safety, not merely one of many dimensions.
| Night VRU dimension | Waymo LIDAR-based | Tesla camera-based | Why this matters |
|---|---|---|---|
| Fundamental detection mechanism | LIDAR emits its own near-infrared laser pulses (est. 905 nm); detection is independent of ambient light; a pedestrian in all-black clothing at night is detected at the same range and resolution as in daylight | Camera requires reflected light (ambient streetlights or vehicle headlights); low-ambient-light environments require high-sensitivity image sensors and neural network adaptation for sensor noise | LIDAR’s night detection is physically the same as daytime detection, with no degradation; camera-based detection has a physical performance ceiling at night that LIDAR does not face |
| Detection range at night (est.) | LIDAR detects pedestrian-sized objects at est. 100–200 meters in nighttime conditions; radar detects moving VRUs at even longer range with less spatial resolution | Camera-based detection range at night is limited by headlight throw distance: est. 50–100 meters for low beam, est. 150–200 meters for high beam; VRUs outside the headlight cone may not be detected until much closer | Total stopping distance (reaction plus braking) is approximately 35 meters from 35 mph and approximately 55 meters from 45 mph; headlight range (especially low beam) may be insufficient at higher urban speeds for sudden VRU appearance |
| Partial occlusion at night | LIDAR’s 360-degree coverage detects VRU reflection from any direction simultaneously; a pedestrian stepping off a curb from the side is detected in all lighting conditions at any approach angle | Vehicle headlights illuminate primarily the forward cone; a pedestrian approaching from the side or rear is outside the headlight beam and may not be clearly visible to the cameras until they enter the illuminated forward arc | Waymo’s LIDAR provides 360-degree night-time VRU detection; Tesla’s headlight-illuminated cameras cover primarily the forward arc, a structural detection geometry difference |
| Pedestrian fatality rate context | NHTSA data: human-driver pedestrian fatalities peak in late evening/night; dark conditions (no streetlights, or insufficient headlights) are the most dangerous pedestrian collision environment; any AV system must exceed human-driver performance in this highest-risk lighting condition | Any AV VRU safety claim must specifically address night performance — the hours when pedestrian fatalities are highest are precisely where LIDAR has its largest structural performance advantage over camera-only systems | This is the single most important VRU safety dimension: the gap between LIDAR-based and camera-only night detection aligns exactly with the gap between highest-risk and lower-risk pedestrian collision hours |
| Weather degradation | Rain and fog degrade LIDAR at extreme densities; however radar penetrates both conditions reliably; the LIDAR + radar + camera combination provides redundant VRU detection even in poor weather | Rain and fog degrade camera visibility directly; heavy rain can obscure pedestrians to forward cameras at the distances needed for safe braking; neural networks trained on adverse weather improve performance but cannot overcome physical light-blocking by precipitation | Sensor redundancy means Waymo has backup VRU detection (radar velocity) even when primary sensors (LIDAR, camera) are degraded by weather; camera-only has no independent fallback sensor |
The night safety analysis converges on a single structural conclusion: LIDAR’s light-independent VRU detection operates with no performance degradation in the exact hours (late evening to midnight) when pedestrian fatalities are highest and camera-based detection faces its largest performance penalty. This is not a marginal difference — it is a safety architecture difference in the highest-consequence operating condition for VRU safety.
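The relationship between night detection range and safe travel speed can be sketched by inverting the stopping-distance formula: given a usable detection range d, system latency t, and deceleration a, the highest speed v that still allows a full stop satisfies v·t + v²/(2a) = d. The latency and deceleration defaults are illustrative assumptions, not published figures for either system, and more conservative values yield lower speeds. The sketch also assumes the VRU is detected at the full stated range, which will not hold for a VRU emerging from occlusion.

```python
from math import sqrt

MPH_TO_MS = 0.44704  # miles per hour to metres per second


def max_speed_mph(detection_range_m: float,
                  latency_s: float = 0.5,
                  decel_ms2: float = 7.0) -> float:
    """Highest speed at which latency travel plus braking fits in the range.

    Solves v*t + v^2 / (2a) = d for the positive root v.
    latency_s and decel_ms2 are assumed values chosen for illustration.
    """
    if detection_range_m < 0:
        raise ValueError("detection_range_m must be non-negative")
    a_t = decel_ms2 * latency_s
    v_ms = -a_t + sqrt(a_t ** 2 + 2.0 * decel_ms2 * detection_range_m)
    return v_ms / MPH_TO_MS


if __name__ == "__main__":
    # Detection ranges spanning the low-beam and LIDAR estimates quoted above.
    for range_m in (50, 100, 200):
        print(f"{range_m} m range -> up to {max_speed_mph(range_m):.0f} mph")
```

Longer detection range translates directly into a higher speed at which a full stop remains possible, which is the practical meaning of the range gap discussed in this section.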
Section 5 — VRU Safety Benchmark Scorecard
| VRU safety dimension | Waymo | Tesla FSD | Edge | 2028 outlook |
|---|---|---|---|---|
| Night VRU detection | High: LIDAR provides light-independent VRU detection; no degradation at night vs daytime | Moderate: camera-based detection is limited by headlight range and ambient light; neural network engineering significantly mitigates but cannot match LIDAR physics | Waymo — structural LIDAR advantage in the highest-risk lighting conditions | LIDAR cost reduction and camera neural network improvements continue; gap narrows but LIDAR retains physical night detection advantage through 2028 |
| Multi-sensor VRU redundancy | High: LIDAR (spatial) + radar (velocity) + camera (visual classification) = three independent detection pathways; any single sensor failure is compensated by two others | Low: camera-only means a single sensor failure mode (lens contamination, glare artifact, neural network edge case) has no independent fallback for VRU detection | Waymo — three independent sensor pathways vs one | Tesla without radar/LIDAR has no architectural path to sensor redundancy; this is a fundamental not an incremental difference |
| Cyclist behavior prediction | High: years of commercial driverless urban cyclist data from SF and Phoenix; prediction trained on real commercial cycling scenarios | High: billions of human-driver miles including enormous cyclist scenario diversity; scale advantage in scenario breadth | Roughly equal — different advantages: Waymo = driverless-context quality; Tesla = scale/diversity | Both improve with more data; comparison requires published per-scenario performance metrics not currently available |
| Child and micro-mobility detection | Strong: LIDAR detects all physical objects regardless of height or size; children detectable at the same range as adults; e-scooters’ small radar cross-section is compensated by LIDAR | Training-data dependent: relies on neural network trained on child/micro-mobility scenarios; children’s shorter height is a known challenge for camera-based detection at longer range | Waymo — LIDAR size-independent detection is a structural advantage for shorter VRUs | Neural network improvements help Tesla; LIDAR size-independence remains a fundamental architecture advantage through 2028 |
| Occlusion safety | Strong: HD map provides proactive crosswalk slow-down before any VRU is detected; LIDAR provides spatial context around occluding objects | Standard: FSD infers occlusion risk from visual scene context; no map-based proactive slow-down at known occlusion locations | Waymo — HD map + LIDAR enables proactive safety margins at known occlusion points | Tesla’s end-to-end model can learn occlusion-risk inference from training; map-based proactive slowing remains a Waymo-specific structural capability |
| Regulatory incident transparency | High: NHTSA SGO + California DMV incident reports publicly available; Waymo publishes annual safety reports with specific safety metrics | Moderate: NHTSA SGO reports filed; VRU-specific collision rates not separately published; FSD-engaged crash rate not disclosed | Waymo — more transparent VRU safety reporting enables external verification | Both companies face increasing regulatory requirements for VRU-specific data; mandatory VRU-rate disclosure likely by 2028 |
Overall verdict: Night VRU detection is where the multi-sensor vs camera-only architectural difference has the clearest and most consequential safety implication. LIDAR’s light-independent VRU detection is an active safety mechanism in the exact conditions where human-driver pedestrian fatalities are highest. Tesla’s camera-only approach requires the neural network to solve a fundamentally harder perception problem at night — and while FSD has improved dramatically with each version, camera-only night detection cannot match LIDAR’s physics-level night detection capability without a fundamental sensor architecture change.
For commercial driverless operations at scale, the VRU safety regulatory environment is moving toward multi-sensor redundancy requirements. The EU’s draft AV regulation and US NHTSA’s AV safety framework both emphasize sensor redundancy for fully driverless (not supervised) AV service. Tesla’s camera-only architecture may face increasing regulatory friction as VRU safety requirements tighten for driverless deployment specifically — not for supervised consumer FSD, which operates under a different regulatory tier. The structural sensor advantage belongs to Waymo in the current regulatory and safety-architecture landscape; whether Tesla’s neural network improvement trajectory can close the gap by 2028 is the key question for the next generation of this benchmark.
Sources: NHTSA Standing General Order AV crash database (nhtsa.gov); California DMV AV incident reports (dmv.ca.gov); Waymo safety report (waymo.com/safety); NHTSA pedestrian safety data (nhtsa.gov/road-safety/pedestrian-safety). All figures marked (est.) are estimates based on public disclosures, regulatory filings, and third-party reporting; they have not been independently verified.
Sources
- NHTSA AV crash reporting: Standing General Order database
- California DMV AV incident reports
- Waymo safety report: safety.waymo.com
- NHTSA pedestrian fatality data: traffic safety facts