New frontier models, benchmark drops, and side-by-side capability comparisons.
Anthropic released Claude Opus 4.7, a new version of its Claude AI model, along with documentation outlining key changes and a migration guide for developers transitioning from earlier versions.
Anthropic released Claude Opus 4.7, its most capable generally available model, focused on software engineering, image analysis, and instruction following. The company noted Opus 4.7 does not advance its capability frontier, as the separately released Mythos Preview — currently limited to partner...
A Dev.to author published an analysis of Anthropic's Claude Opus 4.7 model, examining changes from previous versions. The article's actual technical content was not available in the retrieved text.
Anthropic released Claude Opus 4.7, which scores 87.6% on the SWE-bench coding benchmark. The release includes breaking API changes and a price increase compared to prior versions.
Anthropic released Claude Opus 4.7, an updated version of its AI model with improvements to vision capabilities, memory, and instruction-following performance.
Google released Gemini 3.1 Flash TTS, a text-to-speech model. Simon Willison published notes and a tool interface for the new model.
Google released Gemini 3.1 Flash TTS, a text-to-speech model available via the Gemini API that generates audio from text prompts and supports detailed voice direction including accents, tone, and delivery style.
Google released Gemini 3.1 Flash TTS, a text-to-speech system, across its products.
ByteDance's Seedance 2.0 video generation model is now available via Vercel's AI Gateway in Standard and Fast variants. It supports text-to-video, image-to-video, and multimodal reference-to-video generation with synchronized audio, as well as video editing.
The UK's AI Security Institute evaluated Anthropic's Claude Mythos Preview and found that it autonomously completed a 32-step corporate network takeover simulation, making it the first AI model to execute such a full multi-stage cyberattack simulation. The model showed improved performance in capture-th...
Kumo announced KumoRFM-2, a foundation model for relational databases that accepts plain-English queries. Kumo says it outperformed supervised machine learning models by 5% on Stanford's RelBench benchmark, beat AWS AutoGluon on enterprise benchmarks, and scales to over 500 billion rows of data.
OpenAI released GPT-5.4-Cyber, a model variant fine-tuned for defensive cybersecurity work. It also expanded its Trusted Access for Cyber program, which gives users reduced-friction access to security tools once they verify their identity with a government ID through Persona.
Google's Gemma 4 E2B model can transcribe audio files on macOS using MLX and mlx-vlm, run through a uv command-line recipe. In a demonstration on a 14-second voice memo, the transcript was largely accurate, with minor errors.
Google added a feature to Gemini that lets it generate interactive 3D models and simulations in response to user questions, with controls to rotate models, adjust parameters via sliders, and modify simulations in real time.
Anthropic announced Project Glasswing with twelve launch partners and revealed Claude Mythos, a frontier model it is not yet shipping to production, along with a system card detailing security findings.
Meta released Muse Spark, a hosted AI model available on meta.ai in "Instant" and "Thinking" modes, with benchmarks competitive with Opus 4.6, Gemini 3.1 Pro, and GPT 5.4 on selected tests. The model has access to 16 tools including web search, Meta platform content search, and rendering capabili...
Anthropic released a system card for Claude Mythos Preview, a new model variant, documenting its capabilities and characteristics.
Z.ai released GLM-5.1, a 754-billion-parameter open-source AI model available under the MIT license. The model demonstrated the ability to generate SVG graphics with CSS animations and to debug and fix code issues when given follow-up instructions.
Anthropic released Claude Mythos Preview with accompanying security documentation and initiated Project Glasswing to address software security in the AI era.
Z.ai's GLM-5.1 model is now available on Vercel's AI Gateway, supporting long-horizon autonomous tasks including multi-step coding workflows, conversation, creative writing, and document generation.
Anthropic released a preview of its Mythos AI model for use by select companies in defensive cybersecurity work.
Vercel made Fast Mode available for Claude Opus 4.6 on AI Gateway, offering 2.5x faster output token speeds at 6x standard pricing. The experimental feature is designed for human-in-the-loop workflows and coding tasks.
Anthropic released Claude Mythos Preview, a new frontier AI model, to about 50 select partners including Amazon, Apple, and Microsoft through Project Glasswing for defensive cybersecurity work. The model scores 83.1% on vulnerability analysis benchmarks, compared to 66.6% for the previous flagshi...
Anthropic is testing an early-access model called Claude Mythos after a March 2026 data leak confirmed its existence; the company has not publicly launched it but described it as a significant capability increase, with potential applications in code, cybersecurity, and enterprise operations.
Google Gemini 2.0, OpenAI GPT-5.3, and Anthropic Claude 4.6 were compared across coding and reasoning benchmarks, with GPT-5.3-Codex scoring 56.8% on SWE-Bench Pro, Claude Opus 4.6 at 55%, and Gemini 2.0 at 52%, while Claude led on multi-step reasoning tasks with 95.2% on GSM8K mathematical bench...
Claude Opus 4 scores higher on coding benchmarks (76.8% vs 71.8% on SWE-bench) and offers a larger 200K-token context window, while GPT-5 excels at reasoning tasks and costs less ($10-30 per 1M tokens vs $15-75). Both are available at a $20/month subscription tier.
Google DeepMind released Gemma 4, a family of four open-source language models sized at 2B, 4B, 26B, and 31B parameters with Apache 2.0 licensing, featuring native vision and audio processing capabilities.
xAI's Grok 4.20 model is now available on Vercel's AI Gateway in three variants: Reasoning, Non-Reasoning, and Multi-Agent. Developers can access the models through Vercel's AI SDK with specified model identifiers.
Vercel AI Gateway now offers GLM 5V Turbo, a multimodal model from Z.ai that converts screenshots and designs into code and can navigate graphical interfaces. The model is accessible via the AI SDK using the identifier zai/glm-5v-turbo.
Alibaba's Qwen 3.6 Plus model is now available on Vercel's AI Gateway. The model features a 1M context window and improved agentic coding capabilities, multimodal reasoning, and tool-calling performance compared to Qwen 3.5 Plus.
Inception's Mercury 2 model is now available on Vercel's AI Gateway. The model is designed for low-latency applications including agentic loops, coding assistants, and voice interfaces.
Vercel made MiniMax M2.7 available on its AI Gateway in standard and high-speed variants. The high-speed version delivers the same performance at 2x the cost, generating about 100 tokens per second.
Google's Gemma 4 models—a 26B mixture-of-experts variant and a 31B dense variant—are now available on Vercel's AI Gateway. Both models support function-calling, vision capabilities, 256K context windows, and 140+ languages.
llm-gemini version 0.30 added three new models: gemini-3.1-flash-lite-preview, gemma-4-26b-a4b-it, and gemma-4-31b-it.