AI-Daily-Builder

2026-06-18

AV Emergency Vehicle Interaction — How Robotaxis Handle Police, Ambulances, and Fire Trucks

Sirens, fire trucks, police hand signals — emergency vehicle interaction is among the hardest AV edge cases and has driven real regulatory action worldwide.

Article 54 in the Physical AI Benchmark Series — Sirens, Stop Arms, and Hand Signals

Emergency vehicles create a compound detection problem that exposes some of the deepest gaps between how human drivers and autonomous systems perceive the world. A human driver hears a siren, locates its source, assesses the situation, and pulls to the right — all within seconds, drawing on spatial hearing, peripheral vision, and decades of trained instinct. An autonomous vehicle must replicate that chain through microphone arrays, camera banks, neural networks, and path-planning algorithms — and do it reliably across every shift of lighting, ambient noise, and traffic geometry.

These scenarios have moved from theoretical edge cases to documented operational failures. The San Francisco Fire Department formally complained to the California Public Utilities Commission about Waymo vehicles blocking fire truck access on multiple occasions in 2022 and 2023. The October 2023 Cruise incident, which involved a pedestrian collision followed by the vehicle’s failure to stop immediately for police, became the most consequential regulatory event in the short history of commercial driverless operations: it triggered the suspension of Cruise’s driverless permit and, ultimately, the shutdown of GM’s robotaxi program.

This article maps the technical challenge, documents the real-world record, and explains what the leading AV stacks have built to address it.

All figures marked (est.) are estimates based on published research, public company disclosures, and industry reporting. Performance figures have not been independently verified under controlled test conditions.


Section 1 — Why Emergency Vehicle Interaction Is Hard

Emergency vehicle interaction is not a single problem. It is at least six distinct detection and decision-making challenges that can occur simultaneously.

| Scenario | Human driver approach | AV challenge |
| --- | --- | --- |
| Ambulance with siren | Hear siren → locate source → pull right | Audio detection + localization + pull-over decision |
| Fire truck approaching from behind | Mirror check + flashing red → move over | Multi-camera rear detection + light pattern recognition |
| Police officer hand signal | Eye contact + gesture reading → comply | Keypoint-based gesture classification in real time |
| School bus stop sign | Flashing red + extended arm → stop | Articulated arm detection + flashing lights in variable lighting |
| Emergency vehicle blocking lane | Navigate around with caution | Path planning around unpredictable, non-standard obstacle |
| Funeral procession | Recognize escort context → yield | Multi-vehicle convoy context recognition; jurisdiction-dependent rules |

Challenge 1: Multi-modal signal fusion. Emergency vehicles announce themselves through two independent channels — audio (siren) and visual (flashing lights). Human drivers fuse these automatically and unconsciously. An AV must run parallel detection pipelines, correlate their outputs, and avoid false positives from other loud noises or bright strobing lights (construction sites, nightclubs, other emergency vehicles blocks away).
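The correlation step described above can be sketched as a simple decision rule. This is a minimal illustration only: the `Detection` structure, the thresholds, and the bearing-agreement test are assumptions made for this example and do not describe any operator's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector's output for the current time window (hypothetical structure)."""
    confidence: float   # 0.0 to 1.0
    bearing_deg: float  # estimated direction of the source, 0 = straight ahead


def angular_gap(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def fuse(audio: Detection, visual: Detection,
         single_threshold: float = 0.9,
         joint_threshold: float = 0.6,
         max_bearing_gap_deg: float = 45.0) -> bool:
    """Return True if the two channels together indicate an emergency vehicle."""
    # A very confident single channel is accepted on its own, for example
    # a siren heard before the vehicle is visible. Thresholds are assumed values.
    if max(audio.confidence, visual.confidence) >= single_threshold:
        return True
    # Otherwise both channels must be moderately confident AND point in
    # roughly the same direction, which filters unrelated noise and strobes.
    both_confident = min(audio.confidence, visual.confidence) >= joint_threshold
    same_direction = angular_gap(audio.bearing_deg, visual.bearing_deg) <= max_bearing_gap_deg
    return both_confident and same_direction
```

Requiring directional agreement is one way to reject a siren several blocks away that happens to coincide with a construction strobe in a different direction; a production system would presumably also track detections over time rather than deciding per window.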

Challenge 2: Non-standard trajectory prediction. Emergency vehicles are legally permitted to run red lights, travel the wrong way, accelerate through intersections, and stop abruptly. Standard AV motion prediction models are trained on normal vehicle behavior. An AV that applies normal prediction to an emergency vehicle will generate dangerous expected-path estimates.
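One way to picture the difference is as a wider set of maneuvers the planner must treat as plausible once a tracked vehicle is classified as an emergency vehicle. The function and maneuver labels below are hypothetical and stand in for what would be a learned, probabilistic predictor in a real stack.

```python
def maneuver_hypotheses(is_emergency_vehicle: bool, light_is_red: bool) -> set[str]:
    """Maneuvers the planner should treat as plausible for a tracked vehicle."""
    # Ordinary traffic is assumed to obey the signal.
    hypotheses = {"stop_at_line"} if light_is_red else {"proceed"}
    if is_emergency_vehicle:
        # An emergency vehicle may do any of these regardless of the signal.
        hypotheses |= {"stop_at_line", "proceed", "use_oncoming_lane", "stop_abruptly"}
    return hypotheses
```

With this toy rule, an ordinary vehicle facing a red light yields only `stop_at_line`, while an emergency vehicle at the same light also yields `proceed`, so the planner cannot assume the intersection will stay clear.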

Challenge 3: Pull-over decision and execution. Knowing that an emergency vehicle is approaching is only half the problem. The AV must determine where to pull over — not at an intersection, not in a crosswalk, not in a bike lane, not blocking a driveway — and execute a smooth lane change under time pressure, potentially in dense traffic.

Challenge 4: Police hand signals. An officer directing traffic overrides all traffic signals. This is a purely visual, real-time, human gesture recognition problem. The officer may be partially obscured, wearing different uniforms, using non-standardized gestures, and working at an intersection from any of four approach angles. No standardized dataset exists for training this capability.
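To make the keypoint-based framing concrete, here is a deliberately simplistic sketch: a single geometric rule on two body keypoints, with an explicit "do not guess" result when the keypoints are unreliable. The `Keypoint` structure, the thresholds, and the rule itself are assumptions for illustration; real officer gestures are far more varied than a raised arm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Keypoint:
    x: float           # pixels, increasing to the right
    y: float           # pixels, increasing downward (image convention)
    confidence: float  # 0.0 to 1.0, as reported by a pose estimator


def classify_gesture(shoulder: Keypoint, wrist: Keypoint,
                     min_confidence: float = 0.5,
                     margin_px: float = 20.0) -> Optional[str]:
    """Toy rule: a wrist clearly above the shoulder is read as a raised-arm 'stop'."""
    if min(shoulder.confidence, wrist.confidence) < min_confidence:
        return None  # keypoints unreliable (occlusion, glare): do not guess
    if wrist.y < shoulder.y - margin_px:
        return "stop"
    return "no_signal"
```

Returning `None` rather than a best guess matters here: an unclassifiable gesture is exactly the kind of case a system might hand to a fallback behavior instead of acting on.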

Challenge 5: School bus stop arms. When a school bus activates its flashing red lights and extends its stop arm, all vehicles in both directions on an undivided road must stop in all 50 US states. The stop arm is a physical, articulated mechanical component that must be detected even in glare, rain, or low light. Detection distance matters: vehicles need time to brake smoothly.
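The link between detection distance and smooth braking follows from basic kinematics: distance covered during system latency plus distance covered while decelerating. The sketch below uses that standard formula; the default latency (0.5 s) and comfortable deceleration (2.5 m/s²) are assumed values chosen for illustration, not published figures.

```python
MPH_TO_M_S = 0.44704  # exact conversion factor


def min_detection_distance_m(speed_mph: float,
                             latency_s: float = 0.5,
                             decel_m_s2: float = 2.5) -> float:
    """Distance needed to stop: v * latency + v^2 / (2 * deceleration)."""
    if decel_m_s2 <= 0:
        raise ValueError("deceleration must be positive")
    v = speed_mph * MPH_TO_M_S  # speed in metres per second
    return v * latency_s + v * v / (2.0 * decel_m_s2)
```

Under these assumed values, a vehicle travelling at 25 mph needs about 31 m to stop smoothly, so the stop arm must be recognized reliably at or beyond that range. Because the braking term grows with the square of speed, the required range rises quickly on faster roads.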

Challenge 6: Jurisdiction-dependent rules. Funeral procession right-of-way varies by state. School bus stopping rules vary by road type and lane count. The distance required when passing a stopped emergency vehicle varies by state. An AV operating across multiple jurisdictions must carry a ruleset that differs by location.
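A location-dependent ruleset can be pictured as a lookup table keyed by jurisdiction, with a conservative fallback for unmapped areas. Everything in this sketch is hypothetical: the region names, field names, and values are placeholders and do not reflect any actual state statute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YieldRules:
    """Per-jurisdiction parameters (illustrative fields only)."""
    funeral_procession_has_right_of_way: bool
    stop_for_bus_on_divided_road: bool
    min_passing_clearance_m: float


# Fallback when the jurisdiction is unknown: the most conservative option
# for every field. All values below are placeholders, not real law.
DEFAULT_RULES = YieldRules(True, True, 3.0)

RULES_BY_JURISDICTION = {
    "REGION_A": YieldRules(True, False, 1.0),
    "REGION_B": YieldRules(False, True, 2.0),
}


def rules_for(jurisdiction: str) -> YieldRules:
    """Look up the ruleset for a jurisdiction, falling back to the default."""
    return RULES_BY_JURISDICTION.get(jurisdiction, DEFAULT_RULES)
```

Defaulting to the strictest behavior means a mapping gap makes the vehicle overly cautious rather than non-compliant.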


Section 2 — Real-World Incidents

The emergency vehicle interaction record for commercial AV deployments contains several documented incidents that directly shaped regulatory outcomes.

| Incident | Date and location | What happened | Outcome |
| --- | --- | --- | --- |
| Waymo vehicles blocking fire trucks | Multiple incidents 2022–2023, San Francisco | Waymo vehicles pulled over in locations that obstructed fire truck access to incident scenes | SF Fire Dept formally complained to CPUC; became a central factor in the CPUC driverless permit debate |
| Cruise failure to stop for police | October 2023, San Francisco | Cruise robotaxi failed to immediately stop for police during a traffic stop following a pedestrian collision; vehicle then moved to what it assessed as a safer location | CPUC suspended Cruise’s driverless permit; Cruise halted driverless operations in late October 2023, and GM ended funding for the Cruise robotaxi program in December 2024 |
| Waymo and officer-directed traffic | 2023–2024, multiple incidents | Waymo vehicles stopped unexpectedly or behaved inconsistently when officers directed traffic manually at intersections | Led Waymo to develop improved “pull over and summon remote operator” protocol for ambiguous situations |
| General school bus detection | Ongoing | Multiple AV programs have required specific training datasets for school bus stop arm detection; state-by-state passing rules vary | No AV operator has received formal certification specifically for school bus stop arm compliance (est.) |

The Cruise October 2023 incident requires specific clarification. The sequence was: a Cruise vehicle struck a pedestrian who had already been hit by another vehicle; the Cruise vehicle then failed to immediately stop when police activated lights; the vehicle subsequently moved approximately 20 feet while the pedestrian was underneath it. CPUC’s suspension of Cruise’s driverless permit cited the failure to stop for police as a specific contributing factor, alongside broader concerns about incident reporting. The regulatory response — permit suspension, followed by GM’s decision to shutter the entire Cruise robotaxi program — was the most severe consequence any AV operator has faced to date.

The Waymo fire truck incidents, while less dramatic, matter because they reveal a subtler failure mode: the AV correctly identified the need to pull over, but selected pull-over locations that blocked emergency vehicle access. Getting the pull-over decision correct requires not just detecting the emergency vehicle but reasoning about where to stop — avoiding locations that would obstruct the very vehicles being yielded to.


Section 3 — Waymo’s Emergency Vehicle Interaction System

Following the SF Fire Department incidents, Waymo disclosed and implemented a systematic multi-layer response to emergency vehicle detection.

| System component | What it does |
| --- | --- |
| Audio detection module | Dedicated microphone array plus neural network for siren detection; localizes siren direction (front/rear/left/right) to inform pull-over planning |
| Emergency vehicle light pattern recognition | Trained on footage of specific light bar patterns from major US fire, police, and ambulance vehicle fleets; distinguishes emergency patterns from construction, tow trucks, and other strobing sources |
| Pull-over location selection | When siren plus lights are detected → identify nearest safe pull-over point that does not block intersections, crosswalks, bike lanes, or driveways → execute lane change → stop |
| Remote assistance integration | Complex scenarios (officer hand signals, ambiguous situations) → flag for human remote operator who provides guidance in approximately 30 seconds (est.) |
| Police stop protocol | If police lights directed at Waymo vehicle → pull over → activate hazard lights → wait for remote operator or officer approach |
| Geofenced improvement loop | After incidents, Waymo conducts targeted edge-case data collection in affected zones to build new training examples |

The remote assistance integration is architecturally significant. Waymo has publicly described a tiered response model in which the vehicle handles routine emergency vehicle scenarios autonomously and escalates ambiguous cases — particularly officer hand signals — to a human remote operator. This means Waymo’s driverless vehicles are not fully autonomous in the hardest emergency scenarios; they are human-assisted in real time. The 30-second escalation response is an internal estimate and has not been independently audited.
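The tiered model described here can be sketched as a small routing function: routine, high-confidence scenarios stay on the vehicle, and everything else is escalated. The scenario labels, the confidence threshold, and the function itself are illustrative assumptions, not Waymo's published interface.

```python
from enum import Enum


class Response(Enum):
    HANDLE_AUTONOMOUSLY = "handle_autonomously"
    ESCALATE_TO_REMOTE_OPERATOR = "escalate_to_remote_operator"


# Scenario labels treated as routine in this sketch (hypothetical names).
ROUTINE_SCENARIOS = {"siren_approaching", "emergency_lights_behind"}


def choose_response(scenario: str, confidence: float,
                    min_confidence: float = 0.8) -> Response:
    """Route a detected scenario to on-vehicle handling or to a human."""
    if scenario in ROUTINE_SCENARIOS and confidence >= min_confidence:
        return Response.HANDLE_AUTONOMOUSLY
    # Officer hand signals, unknown scenarios, and low-confidence detections
    # all fall through to the remote operator.
    return Response.ESCALATE_TO_REMOTE_OPERATOR
```

Making escalation the default branch means a scenario nobody anticipated is sent to a human rather than silently handled by the routine path.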

The pull-over location selection improvement directly addressed the fire truck blocking incidents. The updated protocol includes explicit constraints: candidate pull-over locations are filtered against known intersection boundaries, crosswalk markings, bike lane designations, and fire hydrant proximity. The system must balance pulling over quickly against pulling over in a location that does not create a secondary obstruction.
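The filter-then-choose logic can be sketched as follows. The `Candidate` fields, the hydrant clearance value, and the "nearest allowed spot" tie-break are assumptions for this example; the article does not specify how the actual system scores candidates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """A possible stopping spot ahead of the vehicle (hypothetical structure)."""
    distance_m: float                          # travel distance to reach the spot
    in_intersection: bool = False
    in_crosswalk: bool = False
    in_bike_lane: bool = False
    hydrant_distance_m: float = float("inf")   # distance to nearest hydrant


def select_pull_over(candidates: list[Candidate],
                     min_hydrant_clearance_m: float = 5.0) -> Optional[Candidate]:
    """Drop candidates that violate a constraint, then pick the nearest one."""
    allowed = [
        c for c in candidates
        if not (c.in_intersection or c.in_crosswalk or c.in_bike_lane)
        and c.hydrant_distance_m >= min_hydrant_clearance_m
    ]
    # None signals that no compliant spot exists in the candidate set.
    return min(allowed, key=lambda c: c.distance_m, default=None)
```

Choosing the nearest candidate only after filtering captures the balance the paragraph describes: speed is optimized, but never at the cost of stopping somewhere that creates a secondary obstruction.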


Section 4 — Tesla FSD and Emergency Vehicle Handling

Tesla FSD operates under supervised conditions: in current consumer vehicles, an attentive human driver must be present and ready to take over. The Cybercab driverless program, targeting commercial deployment, must meet a higher regulatory bar for emergency scenarios.

| Dimension | Detail |
| --- | --- |
| Audio detection | Tesla vehicles have microphones primarily designed for cabin noise cancellation and voice commands; emergency vehicle siren detection as a separate capability is publicly stated as implemented in FSD (est.) |
| Visual detection | Camera-based flashing light detection; end-to-end v12 and v13 models trained on emergency vehicle clips drawn from the global fleet |
| Pull-over behavior | FSD trained to recognize emergency vehicle approach and suggest or execute pull-over; the supervising driver can override at any time |
| Police hand signals | Harder problem for camera-only system; FSD v13 included improvements to pedestrian and officer gesture recognition (est.) |
| No remote operator | Unlike Waymo, Tesla’s driverless Cybercab plan relies on the neural network alone, without a human remote operator in the loop |
| Regulatory requirement | For Cybercab driverless operation, emergency vehicle handling must meet regulatory standards currently under development and evaluation |

The absence of a remote operator backup creates a structural difference in how the two companies approach the hardest edge cases. Waymo can escalate an ambiguous officer hand-signal scenario to a human within seconds. Tesla’s driverless system must handle that same scenario through neural network inference alone. Whether end-to-end model training on fleet data can reach the reliability needed for police-gesture compliance without human backup is one of the unresolved questions in the Cybercab regulatory review.

Tesla’s camera-only architecture also affects siren localization. Waymo’s microphone array can determine whether a siren is approaching from the front, rear, or side — information that directly informs whether and where to pull over. Camera systems must infer emergency vehicle approach direction from visual cues alone, which works reliably once an emergency vehicle is visible in the camera field but provides no advance warning from sound alone.
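To illustrate why a microphone array can supply direction before anything is visible, here is the textbook far-field relation for a single pair of microphones: the arrival-time difference between them determines the angle of the source. This is a generic sketch, not a description of Waymo's implementation; the function name and parameters are assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air near 20 °C


def bearing_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Angle of arrival in degrees from the time difference between two mics.

    0 means the source is broadside to the pair (equal arrival time);
    +/-90 means it lies along the line joining the microphones.
    """
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A single pair cannot distinguish front from rear, which is one reason an array uses more than two microphones. Estimating the delay itself (for example by cross-correlating the two signals) is a separate step not shown here.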


Section 5 — Regulatory Standards and What Passing Looks Like

No universal federal standard for AV emergency vehicle interaction exists as of mid-2026. The regulatory landscape is state-by-state, with California functioning as the de facto most rigorous jurisdiction.

| Requirement | Current status |
| --- | --- |
| Pull over for emergency vehicles | Required under law in all 50 states; AVs must comply with the same statutes as human drivers |
| School bus stop arm | All 50 states require stopping under state law; no AV has received formal pass/fail certification specifically for stop arm compliance (est.) |
| Police traffic direction | Legal requirement to follow officer directions; no AV-specific technical standard; tested case-by-case in incident review |
| NHTSA FMVSS | Federal Motor Vehicle Safety Standards do not yet include specific emergency-vehicle-response performance requirements as of mid-2026 (est.) |
| CPUC California | Has become the most operationally rigorous AV regulatory body through active permit proceedings; the SF incidents drove the first major permit suspension in US history |
| What passing looks like | No agreed metric exists; industry direction is that AV must match or exceed human driver emergency vehicle response rates across a defined test set |
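The "match or exceed human response rates across a defined test set" direction can be expressed as a simple comparison. Since no agreed metric exists, the function names and the idea of a single human baseline rate are assumptions for illustration only.

```python
def compliance_rate(outcomes: list[bool]) -> float:
    """Fraction of test scenarios in which the vehicle responded correctly."""
    if not outcomes:
        raise ValueError("test set must not be empty")
    return sum(outcomes) / len(outcomes)


def meets_bar(av_outcomes: list[bool], human_baseline_rate: float) -> bool:
    """True if the AV's rate matches or exceeds the human baseline."""
    return compliance_rate(av_outcomes) >= human_baseline_rate
```

A real standard would also need to define the scenario mix and a minimum test-set size, since a single pass rate can hide poor performance on rare but critical cases such as officer hand signals.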

The CPUC’s authority over driverless permits has made California both the most attractive and the most scrutinized AV deployment market. Operators that expand geofenced operations must demonstrate emergency vehicle interaction performance to CPUC reviewers. The Cruise suspension established that a single high-profile failure can cost an operator its permit; the Waymo fire truck incidents established that persistent lower-severity failures can drive operational requirement changes.

The absence of a federal standard creates an uneven competitive landscape. Operators in California face the most rigorous requirements; operators in states with lighter-touch AV regulations face lower formal bars. As driverless deployment scales beyond California, NHTSA is expected to develop federal performance floors — likely drawing on the California record as the primary evidence base.


Sources: CPUC driverless vehicle permit proceedings (cpuc.ca.gov); SF Fire Dept complaints about AV blocking, SF Examiner coverage (sfexaminer.com); Cruise October 2023 incident, NHTSA special investigation (nhtsa.gov/vehicle-safety/automated-vehicles); Waymo emergency vehicle response updates (waymo.com/blog/). All figures marked (est.) are estimates based on published research, public operational disclosures, and industry reporting; they have not been independently verified under controlled test conditions and should be treated as directional rather than precise.

