2026-06-10 — views

Google Puts Gemini 3.5 Pro Into Final Preview Stretch With 2M-Token Context and Deep Think Reasoning

Read this because The pricing and the free-trial window are what builders should check first. At $15/$60 per million tokens, Pro is priced to be used seriously — not sampled.

Gemini 3.5 Pro is in final Vertex enterprise preview with a 2M-token context window and Deep Think reasoning; GA launch expected imminently in June 2026.

What is happening

Google’s Gemini 3.5 Pro has entered its final days of limited enterprise preview ahead of general availability. Sundar Pichai announced the model at Google I/O on May 19 and told the live crowd to expect it within the month. As of June 10, select Vertex AI enterprise customers have had access since late May, but no public model ID has appeared in Google’s official API changelog, meaning broad availability has not yet shipped — though a GA announcement is expected any day.

The naming marks a generational step. Gemini 3.5 Pro absorbs the role Gemini Ultra previously played: it is Google’s new apex model, positioned above the already-strong Gemini 3.5 Flash that launched May 19.

What the specs confirm

Feature	Detail
Context window	2 million tokens
Reasoning mode	Deep Think (extended inference-time compute)
Modalities	Text, images, audio, video
Expected input price	~$15 per 1M tokens
Expected output price	~$60 per 1M tokens
Flash reference	$1.50 / $9.00 per 1M tokens

The 2-million-token context window is the headline for long-document and multi-repository workloads. Deep Think is analogous to the chain-of-thought path in OpenAI’s o-series: the model spends additional compute at inference time before returning an answer. For tasks where latency is acceptable and accuracy is paramount — legal document review, multi-step code reasoning, hard math — this mode is the intended entry point.

The pricing gap is the trade-off

At roughly ten times the cost of Flash, Pro targets workloads that genuinely need frontier reasoning depth or extended context. Flash already outperformed Gemini 3.1 Pro on agentic coding evaluations at I/O; developers choosing between waiting for Pro and shipping on Flash now face a real trade-off. Flash is available, competitively priced, and already strong. Pro offers the headroom for tasks that overwhelm Flash — and the pricing signals that Google expects those tasks to be a meaningful slice of the market.

What builders should do now

First, confirm whether your workload actually needs 2M context or extended reasoning. Flash already handles most coding, summarisation, and agentic tasks at a fraction of the cost. Second, watch the API changelog closely — when a gemini-3-5-pro model ID appears, the free preview tier typically goes live at the same time. Third, if you are already on Vertex AI enterprise, check your account for preview access; several teams have been running Pro since late May.

The bottom line: Gemini 3.5 Pro is the most capable model Google has ever shipped publicly. The question for most builders is not whether to use it, but whether their workload justifies the cost premium over Flash.