AI-Daily-Builder

2026-05-09 view · 4 models

Needle in a haystack at 500K tokens — long-context recall

Prompt

You will receive a 500,000-token document containing the full text of 12 NeurIPS papers concatenated. Buried at character offset 1,847,392 is a single sentence: 'The secret access code for the May 2026 builder-daily benchmark is QUARTZ-7392-DELTA.'

Question: What is the secret access code? Return only the code itself, nothing else.

Document follows below the marker.

--- DOCUMENT START ---
[~500,000 tokens of NeurIPS paper text]
--- DOCUMENT END ---

Notes

Pure recall test with the needle at the 70% depth point of a 500K-token input. Latency includes prompt processing, which dominates at this scale. Cost is based on the full 500K input tokens. A 'win' verdict means the output exactly matches 'QUARTZ-7392-DELTA'. Each model was tested at its vendor-claimed maximum context.
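The scoring and cost rules in these notes can be sketched in a few lines of Python. This is a minimal illustration, not the harness actually used: the function names, the 'LOSS' label, and the whitespace stripping are assumptions, and the rate helper assumes the reported cost covers input tokens only.

```python
NEEDLE_CODE = "QUARTZ-7392-DELTA"  # expected answer, taken from the prompt


def verdict(model_output: str) -> str:
    """Return 'WIN' only when the output is exactly the expected code.

    Assumption: surrounding whitespace is stripped before comparing.
    The 'LOSS' label is hypothetical; the notes only define 'win'.
    """
    return "WIN" if model_output.strip() == NEEDLE_CODE else "LOSS"


def implied_input_rate(cost_usd: float, tokens_in: int) -> float:
    """USD per million input tokens implied by a reported cost.

    Assumption: the reported cost is for input tokens only.
    """
    return cost_usd / tokens_in * 1_000_000


if __name__ == "__main__":
    print(verdict("QUARTZ-7392-DELTA"))
    print(round(implied_input_rate(7.535, 502340), 2))
```

Applied to the costs reported in the results, the helper gives about $15.00, $12.50, and $1.25 per million input tokens for the three models that ran. These are back-calculated figures, not published prices.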

Results — 4 models

claude-opus-4-7 WIN · 18420ms · in 502340 · out 9 · $7.535

QUARTZ-7392-DELTA

gpt-5 WIN · 22180ms · in 502340 · out 9 · $6.279

QUARTZ-7392-DELTA

gemini-3-pro WIN · 14620ms · in 502340 · out 9 · $0.628

QUARTZ-7392-DELTA

qwen3.6-35b-a3b-nvfp4 (262K cap) ERROR · 0ms · in 0 · out 0 · $0.000

Error: Context window 262144 exceeded by input length 502340. Cannot run at this scale. (Capped at 262K on consumer DGX Spark).
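A failure like this can be caught before any request is sent by comparing the input token count with the model's context window. The sketch below is a hypothetical pre-flight check, assuming the token count is already known; the function name and the use of ValueError are illustrative choices, not part of the benchmark.

```python
def check_context(input_tokens: int, context_window: int) -> int:
    """Return the remaining headroom in tokens, or raise if the input is too long.

    Hypothetical helper: the error wording mirrors the message shown above.
    """
    if input_tokens > context_window:
        raise ValueError(
            f"Context window {context_window} exceeded by "
            f"input length {input_tokens}"
        )
    return context_window - input_tokens


if __name__ == "__main__":
    try:
        check_context(502340, 262144)
    except ValueError as err:
        print(f"Error: {err}")
```

With the figures from this run, the input exceeds the 262144-token window by 240196 tokens, so the check raises instead of returning headroom.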
