Cloudflare Infire — disaggregated LLM inference beats vLLM by 20%, Unweight cuts model size up to 22%
Cloudflare Infire (Rust) uses disaggregated prefill/decode to beat vLLM 0.10 by 20% on H100s. Unweight achieves 15–22% lossless model weight compression.
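Disaggregated serving splits the compute-bound prefill phase (one batched pass over the whole prompt) from the memory-bound decode phase (one token per step) onto separate workers, shipping the KV cache between them. A minimal Python sketch of the idea — all class and method names here are hypothetical illustrations, not Infire's actual architecture or API:

```python
from dataclasses import dataclass

@dataclass
class KVCache:
    # In a real engine this is per-layer key/value tensors on GPU;
    # here it is just the token sequence the cache covers.
    tokens: list

class PrefillWorker:
    """Compute-bound: processes the entire prompt in one batched pass."""
    def run(self, prompt_tokens):
        # A single forward pass over all prompt tokens builds the KV cache.
        return KVCache(tokens=list(prompt_tokens))

class DecodeWorker:
    """Memory-bound: generates one token at a time against the cache."""
    def run(self, cache, steps):
        out = []
        for _ in range(steps):
            nxt = len(cache.tokens)  # stand-in for a real model forward step
            cache.tokens.append(nxt)
            out.append(nxt)
        return out

prefill, decode = PrefillWorker(), DecodeWorker()
cache = prefill.run([101, 102, 103])    # prefill node builds the KV cache
generated = decode.run(cache, steps=4)  # cache is handed to a decode node
```

Separating the two phases lets each worker pool be sized and batched for its own bottleneck, which is where the claimed throughput win over a monolithic engine comes from.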
Jules (Gemini 3 Pro) is now in global public beta with a GitHub label Action and a Jules Tools CLI — the first asynchronous, GitHub-native coding agent to rival Claude Code.
Agent 365 GA at $15/user: per-agent Entra IDs, Defender MCP blocking. Agent Framework 1.0 is the open-source multi-agent baseline with A2A and MCP interop.
GR00T N1.7 VLA moves to commercial early access, removing research-only limits. Jensen Huang previewed GR00T N2 claiming 2× task success vs. current top VLAs.
Copilot in VS gains cloud agent sessions, profile-level custom agents, .claude/skills/ loading, and a Debugger agent that reproduces issues at runtime.
Cursor released a TypeScript SDK giving programmatic access to the same runtime, harness, and models powering its desktop, CLI, and web apps.
Mistral Medium 3.5: 128B dense, 256K context, 77.6% SWE-Bench Verified. Vibe gets cloud remote agents; Le Chat gets a Work Mode.
vLLM v0.20.0: 752 commits, 320 contributors. CUDA 13, PyTorch 2.11, Transformers v5, Python 3.14, FlashAttention 4 default, 2-bit KV cache.
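A 2-bit KV cache stores each cached key/value entry with only four quantization levels per group, cutting cache memory roughly 8× versus FP16. A rough NumPy sketch of per-group asymmetric 2-bit quantization — illustrative only, not vLLM's actual kernel or layout (which also packs four 2-bit values per byte):

```python
import numpy as np

def quantize_2bit(x, group=8):
    """Per-group asymmetric quantization to 4 levels (2 bits) per value."""
    x = x.reshape(-1, group)
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    scale = (hi - lo) / 3.0            # 2 bits -> integer levels 0..3
    scale[scale == 0] = 1.0            # guard against constant groups
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize_2bit(q, scale, lo):
    return q * scale + lo

x = np.linspace(-1.0, 1.0, 32)
q, scale, lo = quantize_2bit(x)
xr = dequantize_2bit(q, scale, lo).reshape(-1)
max_err = np.abs(x - xr).max()        # bounded by scale / 2 per group
```

The per-group scale and offset keep the worst-case error proportional to each group's dynamic range, which is why grouped low-bit schemes degrade quality far less than a single global scale would.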
Cursor 3.2 introduces a /multitask command that spawns parallel async subagents, expands worktrees in the Agents Window, and adds multi-root workspaces.
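Spawning parallel async subagents is, at its core, a fan-out/fan-in over independent tasks. A toy asyncio sketch of the pattern — the names are hypothetical and this is not Cursor's implementation:

```python
import asyncio

async def subagent(task_id, description):
    """Stand-in for one agent working in its own isolated worktree."""
    await asyncio.sleep(0.01 * task_id)  # simulate independent work
    return f"task {task_id}: {description} done"

async def multitask(descriptions):
    # Fan out: one subagent per task description, all run concurrently.
    jobs = [subagent(i, d) for i, d in enumerate(descriptions)]
    # Fan in: gather returns results in submission order,
    # regardless of the order in which the subagents finish.
    return await asyncio.gather(*jobs)

results = asyncio.run(multitask(["fix lint", "write tests", "update docs"]))
```

Running each subagent against its own worktree (as the Agents Window does) is what makes the fan-out safe: concurrent edits never touch the same checkout.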
GPT-5.5 (codename Spud) shipped to ChatGPT and Codex on Apr 23 and to the API on Apr 24. OpenAI cites 82.7% on Terminal-Bench 2.0 and FrontierMath gains over Opus 4.7.
Claude Design (research preview) generates prototypes, slides, and one-pagers from natural language, and reads codebases to apply design systems.
Claude Opus 4.7 ships to Claude products, API, Bedrock, Vertex, and Microsoft Foundry. Better coding, ~3× higher vision resolution, same pricing.