2026-06-18

Physical AI Compute — Waymo Google Cloud TPU vs Tesla Dojo D1: Training Infrastructure Benchmark 2026

Waymo trains on Google TPU pods and runs an estimated 15B simulated miles per day. Tesla built the Dojo D1 chip for video training and runs NVIDIA H100 clusters in parallel while Dojo scales.

Overview

The AI training compute infrastructure is the engine behind each company’s ability to improve its autonomous driving models. Waymo, as an Alphabet subsidiary, uses Google Cloud TPUs — the same compute ecosystem that powers Gemini and other Google AI systems. Tesla built Dojo, a custom supercomputer using proprietary D1 chips designed specifically for training on video data at massive scale. This article benchmarks the two compute approaches — what each company has, what it costs, and what it means for AI model improvement pace. This is article 165 in the Physical AI Benchmark Series.

Section 1 — Waymo’s Compute Stack: Google Cloud + TPU Ecosystem

Waymo’s training infrastructure is inseparable from its position as an Alphabet subsidiary. Access to Google’s TPU pods, among the most advanced AI training infrastructure available, is a structural advantage that no independent AV startup could replicate.

Compute dimension	Waymo detail	Strategic significance
Primary training infrastructure	Waymo uses Google Cloud TPUs for neural network training; as an Alphabet subsidiary, Waymo has access to Google’s internal TPU pods — the same infrastructure used to train Gemini and other Google AI systems	Being an Alphabet subsidiary gives Waymo access to the world’s most advanced AI training infrastructure at marginal cost; no AV startup could afford equivalent compute independently
Google TPU v4/v5 generation	Google’s TPU v4 pods delivered approximately 1 exaFLOP of peak compute per pod; TPU v5 (announced 2023) improved performance per watt by 2x or more (est.); Waymo has access to these resources as needed	TPU v5 performance represents best-in-class training throughput for transformer and convolutional architectures, the types used in AV perception and planning
Google DeepMind synergies	Waymo has potential access to DeepMind research talent and methodology (both are Alphabet subsidiaries); DeepMind’s work on AlphaFold, Gemini, and robotics overlaps with AV AI challenges	Cross-subsidiary knowledge transfer is not guaranteed or automatic, but organizational proximity matters; DeepMind’s robotics research is directly relevant to Waymo’s prediction and planning problems
Simulation compute (Carcraft)	Waymo’s Carcraft simulation system runs 15 billion simulated miles per day (est.) across Google Cloud; simulating rare, dangerous, and novel scenarios at this scale requires massive parallel compute	15B simulated miles per day means Waymo can train on extremely rare edge cases (1-in-a-million scenarios) that real-world miles could never provide in sufficient volume; Google Cloud’s elastic scale makes this feasible
Cost structure	Waymo does not pay market rate for Google Cloud compute; as an Alphabet subsidiary, compute costs are effectively subsidized; Waymo’s training budget is not independently disclosed	This subsidy is an enormous structural advantage: an independent AV startup paying $1B or more per year for equivalent Google Cloud compute would face a capital constraint that Waymo does not
HD mapping compute	Waymo’s HD maps are generated and updated using Google Maps’ base data plus Waymo-specific centimeter-level lidar enrichment; processing the raw lidar point clouds into navigable HD maps requires substantial compute	Google Maps’ existing compute infrastructure for map rendering and processing is leveraged for Waymo’s HD map generation — another invisible subsidy from the Alphabet relationship
Compute strategy verdict	Waymo’s compute approach is depth-over-breadth: use the world’s best AI training infrastructure (Google TPUs) for a narrow, well-defined problem domain (autonomous driving perception and planning), with Google’s simulation scale for edge case coverage. The strategy works well in Waymo’s current operational envelope. The key risk: if AI architectures shift in ways that favor a different compute paradigm, Waymo is dependent on Google’s roadmap rather than its own.
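The figure of roughly 1 exaFLOP per TPU v4 pod quoted in the table can be sanity-checked with a short calculation. The sketch below assumes Google's publicly stated TPU v4 configuration of 4,096 chips per pod and about 275 peak BF16 TFLOPS per chip; neither number appears in this article, and the function name is illustrative only.

```python
# Peak-compute sanity check for a TPU v4 pod.
# Assumptions (not from this article): 4,096 chips per pod and
# ~275 peak BF16 TFLOPS per chip, per Google's public TPU v4 figures.
TPU_V4_CHIPS_PER_POD = 4_096
TPU_V4_PEAK_TFLOPS = 275

TFLOPS_PER_EFLOPS = 1_000_000


def pod_peak_eflops(chips: int, tflops_per_chip: float) -> float:
    """Return the theoretical peak compute of one pod in EFLOPS."""
    return chips * tflops_per_chip / TFLOPS_PER_EFLOPS


if __name__ == "__main__":
    peak = pod_peak_eflops(TPU_V4_CHIPS_PER_POD, TPU_V4_PEAK_TFLOPS)
    print(f"TPU v4 pod peak: {peak:.2f} EFLOPS")
```

This is a peak number. Sustained training throughput is lower and depends on model architecture and utilization, so it should be read as an upper bound rather than a measured result.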

Section 2 — Tesla’s Compute Stack: Dojo D1 + NVIDIA Clusters

Tesla’s compute strategy is the opposite of Waymo’s: rather than leveraging an existing hyperscaler’s infrastructure, Tesla built its own chip and supercomputer specifically optimized for its primary training workload — video.

Compute dimension	Tesla detail	Strategic significance
Dojo supercomputer architecture	Tesla designed the D1 chip (7nm, 362 TFLOPS BF16, 900 GB/s memory bandwidth per chip) specifically for video training; D1 chips are grouped into training tiles (25 chips per tile = 9 PFLOPS), and tiles into an ExaPOD (120 tiles = 1.1 EFLOPS per ExaPOD); ExaPODs make up the full Dojo cluster	Dojo’s architecture is optimized for Tesla’s specific training workload: large batches of video frames from millions of vehicles. The chip topology (high-bandwidth interconnects between tiles) minimizes data movement overhead for video training
Why Tesla built its own chip	Tesla’s primary training workload is video: billions of 8-camera video segments from 6M vehicles; existing GPU and TPU architectures were not optimally designed for this specific workload pattern; custom silicon allows Tesla to optimize for memory bandwidth, interconnect topology, and precision format for video	Custom silicon development costs hundreds of millions of dollars and takes 3–5 years; Tesla’s justification is that training cost savings over a 5–10 year horizon exceed the development cost — the same logic Apple applied to M-series chips
Dojo vs. NVIDIA GPU clusters	Tesla also uses NVIDIA H100 clusters for training (Dojo supplements, does not fully replace NVIDIA); NVIDIA H100 delivers approximately 1,000 TFLOPS of dense BF16 per GPU (about 2,000 TFLOPS with structured sparsity); a 10,000-GPU H100 cluster = about 10 EFLOPS dense peak (20 EFLOPS with sparsity); Tesla’s combined Dojo and NVIDIA compute is among the largest single-company AI compute deployments outside of hyperscalers (est.)	Tesla’s dual-track strategy (Dojo for video-optimized training plus NVIDIA for general AI) reflects pragmatism: H100s are available now; Dojo ramps over time. Running both allows Tesla to continuously improve FSD without waiting for Dojo to mature
Training data pipeline	Tesla’s primary compute advantage is data, not chips: 6M vehicles multiplied by average 1 hour per day FSD engaged multiplied by 8 cameras = enormous daily video volume; labeling is automated via the Data Engine (shadow mode: FSD makes a decision, a human corrects it, the correction becomes labeled training data)	The Data Engine’s compute requirement is itself massive: running shadow mode inference on millions of vehicles and processing the corrections requires significant inference and storage infrastructure, not just training compute
Dojo deployment timeline	First Dojo ExaPOD operational in 2022 (Texas Gigafactory); Musk targeted 100 EFLOPS by late 2024 (est.); actual deployment pace not fully disclosed; Tesla’s subsequent investment in NVIDIA H100 clusters suggests Dojo ramp was slower than planned (est.)	Slower-than-planned Dojo ramp is consistent with custom silicon’s typical timeline overruns; this is not a failure — it is the normal trajectory of a first-generation custom chip. NVIDIA H100 fills the gap until Dojo v2 (next-gen)
Dojo v2 and future compute	Tesla has referenced a next-generation Dojo chip; details not disclosed as of mid-2026; if Dojo v2 follows the typical 2x performance-per-generation improvement, Tesla’s training compute could reach hundreds of EFLOPS by 2027 (est.)	The trajectory matters more than current capacity: if Dojo v2 delivers and Tesla’s training compute reaches hyperscaler scale, Tesla would be the only non-hyperscaler with proprietary AI training silicon at that level
Compute strategy verdict	Tesla’s compute approach is build-vs-buy at maximum ambition: build a custom chip and supercomputer optimized for your specific training workload, while relying on NVIDIA GPUs in the interim. The strategy is high-risk (custom silicon often underdelivers) and high-reward (if Dojo works as designed, Tesla’s training cost per FSD improvement drops dramatically). The key risk: Dojo D1 may not achieve the performance and yield targets that justify the development cost relative to continued NVIDIA dependence.
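The Dojo hierarchy figures in the table roll up by simple multiplication. The sketch below reproduces them and estimates how many D1-generation ExaPODs the stated 100 EFLOPS target would imply. It uses only the per-chip and per-level figures quoted above (Tesla's own term for the 25-chip building block is a training tile); the function names are illustrative, and the numbers are theoretical peaks, not measured throughput.

```python
import math

# Figures quoted in the table above (peak BF16).
D1_TFLOPS_BF16 = 362      # per D1 chip
CHIPS_PER_TILE = 25       # the 25-chip building block
TILES_PER_EXAPOD = 120    # building blocks per ExaPOD
TARGET_EFLOPS = 100       # the late-2024 target cited in the table


def tile_pflops() -> float:
    """Peak compute of one 25-chip building block, in PFLOPS."""
    return D1_TFLOPS_BF16 * CHIPS_PER_TILE / 1_000


def exapod_eflops() -> float:
    """Peak compute of one ExaPOD, in EFLOPS."""
    return D1_TFLOPS_BF16 * CHIPS_PER_TILE * TILES_PER_EXAPOD / 1_000_000


def exapods_needed(target_eflops: float) -> int:
    """Whole ExaPODs required to reach a peak-compute target."""
    return math.ceil(target_eflops / exapod_eflops())


if __name__ == "__main__":
    chips_per_exapod = CHIPS_PER_TILE * TILES_PER_EXAPOD
    print(f"Per building block: {tile_pflops():.2f} PFLOPS")
    print(f"Per ExaPOD: {exapod_eflops():.3f} EFLOPS ({chips_per_exapod} chips)")
    print(f"ExaPODs for {TARGET_EFLOPS} EFLOPS: {exapods_needed(TARGET_EFLOPS)}")
```

At D1-generation performance the 100 EFLOPS target works out to more than ninety ExaPODs, which helps explain why the article treats the NVIDIA clusters as the interim path.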

Section 3 — Head-to-Head Compute Comparison

Dimension	Waymo / Google TPU	Tesla Dojo + NVIDIA	Edge
Training compute scale (est.)	Access to Google’s full TPU fleet, potentially hundreds of EFLOPS (est.); shared with all Google AI projects	Tesla’s combined Dojo and NVIDIA compute: tens of EFLOPS (est.); dedicated to Tesla AI workloads only	Waymo has access to more total compute; Tesla has more dedicated compute
Compute cost structure	Effectively subsidized (Alphabet subsidiary); no market-rate payments for Google TPU	Mixed: Dojo capex amortized over training lifetime; NVIDIA H100 rented/purchased at market rates; significant but finite	Waymo decisive on compute cost per training run at current scale
Chip customization for AV	TPUs optimized for Google’s workloads (not AV-specific); flexible but not specialized	Dojo D1 designed specifically for video training at AV scale	Tesla decisive on silicon fit-for-purpose; Waymo uses general-purpose AI chips
Data volume for training	Approximately 30M driverless commercial miles (est.); high-purity (fully driverless = clean labels) but lower volume	Approximately 6B supervised FSD miles (est.); lower label purity (human-supervised) but massive volume	Tesla decisive on data volume; Waymo decisive on data purity
Simulation scale	15B simulated miles per day (est.) via Carcraft on Google Cloud	Growing simulation capability via Dojo; scale not disclosed	Waymo decisive on simulation at current scale
Control over compute roadmap	Dependent on Google’s TPU roadmap (TPU v5 to v6 etc.); no independent chip design	Tesla controls its own chip roadmap; can optimize D1 to D2 for AV-specific needs	Tesla decisive on compute sovereignty and roadmap control
Overall compute verdict	Waymo’s Google Cloud / TPU advantage is structural today: more total compute, lower effective cost, best-in-class TPU performance, unmatched simulation scale. Tesla’s Dojo advantage is strategic over time: dedicated silicon optimized for the specific video training workload, an independent roadmap, and no contention for capacity (Waymo shares the TPU fleet with other Alphabet AI projects). The 2028 question is whether Dojo v2 delivers on its performance promise.
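The data-volume comparison reduces to a few multiplications. The sketch below uses the fleet figures quoted in this article (6M vehicles, about 1 hour of FSD per day, 8 cameras, 6B versus 30M miles). The frame rate is a hypothetical placeholder, not a figure from the article or from Tesla, and the function names are illustrative.

```python
# Figures quoted in the article (all estimates).
FLEET_VEHICLES = 6_000_000
FSD_HOURS_PER_VEHICLE_PER_DAY = 1
CAMERAS_PER_VEHICLE = 8
TESLA_SUPERVISED_MILES = 6_000_000_000
WAYMO_DRIVERLESS_MILES = 30_000_000

# Hypothetical assumption for illustration only.
ASSUMED_FPS = 30


def camera_hours_per_day(vehicles: int, hours: int, cameras: int) -> int:
    """Total hours of single-camera video the fleet records per day."""
    return vehicles * hours * cameras


def frames_per_day(camera_hours: int, fps: int) -> int:
    """Raw frame count per day at a given frame rate."""
    return camera_hours * 3_600 * fps


if __name__ == "__main__":
    hours = camera_hours_per_day(
        FLEET_VEHICLES, FSD_HOURS_PER_VEHICLE_PER_DAY, CAMERAS_PER_VEHICLE
    )
    ratio = TESLA_SUPERVISED_MILES / WAYMO_DRIVERLESS_MILES
    print(f"Camera-hours per day: {hours:,}")
    print(f"Frames per day at {ASSUMED_FPS} fps: {frames_per_day(hours, ASSUMED_FPS):,}")
    print(f"Tesla-to-Waymo mileage ratio: {ratio:.0f}x")
```

These are upper bounds on raw footage. Only a fraction of recorded video is uploaded and used for training, so the result indicates scale rather than actual training-set size.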

Section 4 — What Compute Determines in the AV Race

AI capability	How compute determines it	Waymo advantage	Tesla advantage
Perception accuracy	Better training data and more compute lead to lower detection error rate; perception models must train on billions of labeled frames	Driverless label purity: no human-supervised noise in training data	6B miles of video data; volume enables rare-case coverage
Prediction (other agents)	Modeling human behavior requires training on diverse real-world scenarios; simulation fills gaps that real-world data cannot cover	15B simulated miles per day covers edge cases systematically	Scale of real-world data provides behavioral diversity that simulation approximates
Planning (what to do)	Planning policy training requires simulation at scale to test edge cases safely; real-world testing is too dangerous and expensive for rare scenarios	Google Cloud simulation scale is decisive for planning policy improvement	End-to-end FSD v12 collapses perception and planning into one network — reduces the compute problem from two steps to one
Generalization (new cities)	Generalizing to new cities requires either: (a) training on data from that city, or (b) compute-intensive simulation of that city’s scenarios	HD map and simulation approach means Waymo must generate maps and simulate each new city before commercial launch	Tesla’s mapless FSD approach means no city-specific simulation required; the model generalizes from the training distribution
Model iteration speed	Faster training compute leads to more experiments per week and faster model improvement	More total TPU access means more simultaneous experiments possible	Dedicated Dojo compute means no contention for capacity, whereas Waymo shares TPUs with other Google AI projects
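The edge-case argument behind the simulation figures can be made concrete with an expected-count calculation. The sketch below assumes that a "1-in-a-million" scenario means one occurrence per million miles on average and that occurrences are independent (a Poisson model); both are simplifying assumptions for illustration, not claims from the article. The mileage figures are the article's estimates.

```python
import math

# Mileage estimates quoted in the article.
SIM_MILES_PER_DAY = 15_000_000_000
WAYMO_DRIVERLESS_MILES = 30_000_000
TESLA_SUPERVISED_MILES = 6_000_000_000


def expected_events(miles: int, one_in_n_miles: int) -> float:
    """Expected occurrences of a scenario seen once per `one_in_n_miles`."""
    return miles / one_in_n_miles


def prob_at_least_one(expected: float) -> float:
    """Poisson probability of seeing the scenario at least once."""
    return 1.0 - math.exp(-expected)


if __name__ == "__main__":
    rate = 1_000_000  # assumed: one occurrence per million miles
    for label, miles in [
        ("Simulation, one day", SIM_MILES_PER_DAY),
        ("Waymo driverless, cumulative", WAYMO_DRIVERLESS_MILES),
        ("Tesla supervised, cumulative", TESLA_SUPERVISED_MILES),
    ]:
        print(f"{label}: {expected_events(miles, rate):,.0f} expected occurrences")
```

Under these assumptions one day of simulation yields thousands of examples of a scenario that the cumulative driverless fleet has met only a few dozen times. Simulated occurrences are only as useful as the simulator's fidelity, which this arithmetic does not capture.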

Section 5 — Compute Benchmark Scorecard

Dimension	Waymo / Google	Tesla Dojo + NVIDIA	Edge	2028 outlook
Total training compute access	Decisive — Google TPU fleet is among the largest AI compute deployments on Earth	Large but not at Google scale	Waymo (current)	Tesla closes gap as Dojo scales
Compute cost efficiency	Decisive — effectively subsidized as Alphabet subsidiary	Market-rate NVIDIA plus Dojo capex	Waymo (current)	Depends on Dojo D2 delivery
Silicon fit for AV workload	General-purpose TPU (flexible but not AV-optimized)	Dojo D1 designed for video training (AV-optimized)	Tesla	Tesla’s purpose-built silicon is a long-term advantage if it delivers
Compute roadmap control	Dependent on Google TPU roadmap	Independent Dojo roadmap	Tesla	Tesla’s control over silicon roadmap is a strategic asset
Simulation scale	Decisive — 15B simulated miles per day (est.)	Growing; scale not disclosed (est.)	Waymo (current)	Both scale; Waymo head start significant
Training data quality × volume	Higher purity (driverless), lower volume	Lower purity (supervised), much higher volume	Depends on use case	Volume advantage compounds as Tesla fleet grows
Overall verdict	Waymo has the superior compute infrastructure today on three of the six dimensions scored: more total TPU access, lower effective cost, and the largest simulation scale. Tesla leads on silicon fit and roadmap control. Tesla’s bet is that Dojo, purpose-built for video training, will eventually deliver lower cost per training run than general-purpose TPUs, and that data volume (6M vehicles) will more than compensate for lower label purity. The 2028 compute race is Dojo v2 vs TPU v6: which chip roadmap better serves the specific demands of training a generalist AV policy at scale.
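As a quick cross-check of the scorecard, the Edge column can be tallied directly. The sketch below is an unweighted count under the assumption that every dimension matters equally, which the article does not claim; it is an illustration of how the rows add up, not the method used to reach the verdict.

```python
from collections import Counter

# (dimension, edge) pairs copied from the scorecard's Edge column.
SCORECARD = [
    ("Total training compute access", "Waymo"),
    ("Compute cost efficiency", "Waymo"),
    ("Silicon fit for AV workload", "Tesla"),
    ("Compute roadmap control", "Tesla"),
    ("Simulation scale", "Waymo"),
    ("Training data quality x volume", "Depends on use case"),
]


def tally_edges(rows: list[tuple[str, str]]) -> Counter:
    """Count how many dimensions each side leads, unweighted."""
    return Counter(edge for _, edge in rows)


if __name__ == "__main__":
    for side, count in tally_edges(SCORECARD).most_common():
        print(f"{side}: {count}")
```

A weighted version would need per-dimension weights, which would be a judgment call outside what the scorecard provides.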

All figures labeled (est.) are derived from public company disclosures, analyst estimates, and industry benchmarks.