// edition · 2026-05-06

May 06, 2026

32 stories on AI dev tools, agents, and the coding stack — curated from the day's RSS haul by Agentic Dev's pipeline.

Top Signal · CLI Agents

Claude Code Context Window Rot: Why Sessions Get Dumber (And How to Fix It)

A 2025 Chroma study of 18 frontier AI models, including Claude 4, GPT-4.1, and Gemini 2.5, found that all performed worse as input length increased, with some dropping from 95% to 60% accuracy past a context saturation threshold. The effect, called "context rot," is more pronounced in coding agents be...

Dev.to - Claude

Tool Updates

Which Claude Code Hook Do You Need? A Decision Guide

Claude Code supports four hook handler types — command, prompt, agent, and http — across 21 lifecycle events. Command hooks run in under 5ms and produce deterministic results, while prompt hooks invoke an LLM and take 300–2000ms, and agent hooks spawn full Claude Code sessions with file and tool ...

CLI Agents Dev.to - Claude
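
To make the command-hook case in the item above concrete, here is a minimal sketch of a deterministic check written in Python. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields and that exit code 2 signals a block; that matches a reading of the Claude Code hooks documentation but should be verified against the current docs. The `BLOCKED_PATTERNS` list and the `decide` and `main` helpers are hypothetical.

```python
import json
import sys

# Hypothetical deny-list for illustration; adjust to your own policy.
BLOCKED_PATTERNS = ("rm -rf /", "git push --force")

ALLOW, BLOCK = 0, 2  # assumed exit codes: 0 lets the tool call proceed, 2 blocks it


def decide(event: dict) -> int:
    """Return an exit code for a tool-call event (field names are assumptions)."""
    if event.get("tool_name") != "Bash":
        return ALLOW
    command = event.get("tool_input", {}).get("command", "")
    if any(pattern in command for pattern in BLOCKED_PATTERNS):
        return BLOCK
    return ALLOW


def main() -> int:
    event = json.load(sys.stdin)  # the hook payload is assumed to arrive as JSON on stdin
    code = decide(event)
    if code == BLOCK:
        print("Blocked by local policy hook", file=sys.stderr)
    return code

# When saved as a hook script, end the file with: sys.exit(main())
```

Because the decision is a pure function of the event, the same input always yields the same exit code, which is the property that separates command hooks from the LLM-backed prompt and agent types.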

How I Stopped Burning Through My Claude Code Quota by Noon

Anthropic's Claude Code uses prefix caching that can reduce token costs by up to 10x, but actions like switching models mid-session, modifying tool configurations, or opening new sessions invalidate the cache and trigger full-price recalculation. Keeping sessions long and tool definitions stable ...

Workflows & Tips Dev.to - AI
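
The arithmetic behind the caching advice above can be sketched roughly as follows. The price and the 10x cached-read discount are placeholder assumptions derived from the "up to 10x" figure in the summary, not Anthropic's actual rate card, and `session_cost` is a hypothetical helper that ignores output tokens and any cache-write surcharge.

```python
def session_cost(prefix_tokens: int, turns: int, price_per_mtok: float,
                 cached_discount: float = 10.0, cache_hits: bool = True) -> float:
    """Estimate the input cost of resending a stable prefix on every turn.

    The first turn always pays full price; later turns pay the discounted
    rate only while the cache stays valid (cache_hits=True).
    """
    if turns <= 0:
        return 0.0
    full = prefix_tokens / 1_000_000 * price_per_mtok
    later = full / cached_discount if cache_hits else full
    return full + (turns - 1) * later


# Placeholder numbers: 50k-token prefix, 20 turns, $3 per million input tokens.
warm = session_cost(50_000, 20, 3.0, cache_hits=True)
cold = session_cost(50_000, 20, 3.0, cache_hits=False)
print(f"cache kept: ${warm:.3f}  cache invalidated every turn: ${cold:.3f}")
```

Under these assumptions the gap widens with every additional turn, which is why actions that invalidate the cache mid-session are the expensive ones.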

Stop prompting Codex like ChatGPT

A developer guide argues that OpenAI's Codex, an autonomous coding agent that reads repos and runs commands, performs better when given bounded "atomic" tasks with defined outcomes and verification steps rather than the open-ended conversational prompts suited to ChatGPT.

CLI Agents Dev.to - AI
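
One way to picture the bounded task described above is as a small structured brief rather than a chat message. The sketch below is illustrative only: the `AtomicTask` fields, the example task, and the rendered layout are assumptions, not a format that Codex requires.

```python
from dataclasses import dataclass, field


@dataclass
class AtomicTask:
    """A hypothetical brief for one bounded unit of agent work."""
    goal: str          # the single outcome to achieve
    scope: list[str]   # files or directories the agent may touch
    done_when: str     # observable definition of done
    verify: list[str] = field(default_factory=list)  # commands that prove it

    def render(self) -> str:
        """Format the brief as plain text suitable for pasting into a prompt."""
        lines = [
            f"Goal: {self.goal}",
            "Scope: " + ", ".join(self.scope),
            f"Done when: {self.done_when}",
            "Verify by running:",
        ]
        lines += [f"  - {cmd}" for cmd in self.verify]
        return "\n".join(lines)


task = AtomicTask(
    goal="Make parse_date accept ISO 8601 strings with a timezone offset",
    scope=["src/dates.py", "tests/test_dates.py"],
    done_when="All existing tests pass and a new offset test is added",
    verify=["pytest tests/test_dates.py -q"],
)
print(task.render())
```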

Building Your Own Claude API Cost Tracker: A Practical Guide to Staying on Budget

A developer published a guide describing how to build a cost tracking system for Anthropic's Claude API, using a three-layer approach covering pre-request token estimation, cost calculation, and threshold-based alerts. The guide includes Python code targeting Claude 3.5 Sonnet, Opus, and Haiku mo...

Workflows & Tips Dev.to - Claude
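
The three layers named in the summary above (pre-request estimation, cost calculation, threshold alerts) could be sketched along these lines. This is not the guide's code: the price table uses placeholder values under a made-up model name, the four-characters-per-token heuristic is a rough assumption, and `estimate_tokens`, `request_cost`, and `BudgetTracker` are hypothetical names.

```python
from dataclasses import dataclass

# Placeholder prices in USD per million tokens; replace with the current rate card.
PRICES = {
    "example-model": {"input": 3.00, "output": 15.00},
}


def estimate_tokens(text: str) -> int:
    """Layer 1: rough pre-request estimate (assumes about 4 characters per token)."""
    return max(1, len(text) // 4)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Layer 2: convert token counts to dollars using the table above."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


@dataclass
class BudgetTracker:
    """Layer 3: accumulate spend and flag when a threshold is crossed."""
    budget_usd: float
    alert_at: float = 0.8   # alert once 80% of the budget is used
    spent_usd: float = 0.0

    def record(self, cost: float) -> bool:
        """Add a cost and return True if the alert threshold has been reached."""
        self.spent_usd += cost
        return self.spent_usd >= self.budget_usd * self.alert_at


tracker = BudgetTracker(budget_usd=1.00)
prompt = "Summarize the following changelog. " * 100
cost = request_cost("example-model", estimate_tokens(prompt), output_tokens=500)
if tracker.record(cost):
    print("Budget alert: threshold reached")
```

A production tracker would more likely read actual token usage from the API response than rely on the character heuristic, which is only useful for a pre-flight estimate.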

How to Fix "command 'claude-vscode.editor.openLast' not found" in VS Code

Version 2.1.129 of the Claude Code VS Code extension contains a bug that produces a "command 'claude-vscode.editor.openLast' not found" error, preventing the extension from opening. The workaround is to downgrade to version 2.1.128 via the extension's "Install Another Version" option.

Workflows & Tips Dev.to - Claude

I trained a sprite model with agents. The data was the bottleneck.

A developer released pixel-llm, a 2.9-million-parameter autoregressive transformer that generates 32x32 pixel art sprites of reef sea creatures using a 64-color palette. Built using AI agent sessions, the model trained across four dataset iterations but failed to converge on two of six sprite cat...

Agent Engineering Dev.to - AI

Ecosystem

Using an MCP Gateway with Claude Code: A Practical Guide

An MCP gateway consolidates multiple MCP server connections into a single endpoint for Claude Code, reducing configuration overhead and token usage. Anthropic reported that connecting multiple MCP servers can inject up to 150,000 tokens per agent interaction; Bifrost, an open-source gateway by Ma...

MCP & Integrations Dev.to - Claude

BuyWhere MCP Goes Live: The Open Source Commerce API for AI Agents

BuyWhere launched an open-source MCP server that gives AI agents access to over 50 million products across six markets — Singapore, the US, Japan, Korea, China, and Australia — via structured, merchant-direct data. The MIT-licensed server is available via npm and supports Claude, Cursor, and othe...

MCP & Integrations Dev.to - AI

GPT-5.5 Instant System Card

OpenAI published a system card for GPT-5.5 Instant, a model in its GPT-5.5 lineup, documenting the model's safety evaluations and deployment considerations.

Model Releases OpenAI Blog

AI and Claude: The internal rebellion that changed Amazon’s rules

Amazon granted its tens of thousands of developers access to Anthropic's Claude Code and OpenAI's Codex on May 12, running via AWS and Amazon Bedrock, after roughly 1,500 employees pushed back against a policy restricting use of third-party tools in favor of Amazon's own Kiro coding assistant.

Industry & Funding The New Stack

Benchmark: Claude 3.5 vs. GPT-4o for Cloud Cost Anomaly Detection in AWS and GCP

A benchmark of Claude 3.5 Sonnet and GPT-4o across 12,000 AWS and GCP billing logs found that Claude achieved higher precision (94.2% vs. 89.7% on GCP anomaly detection) and lower cost per detection ($0.87 vs. $1.12 per 1,000), while GPT-4o processed requests 18% faster at 12.7 RPS versus 10.7 RPS.

Model Releases Dev.to - Claude
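
The relative differences implied by the benchmark figures above can be checked with a few lines of arithmetic. This uses only the numbers quoted in the summary; the variable names are illustrative.

```python
# Figures as quoted in the summary above.
claude_precision, gpt4o_precision = 94.2, 89.7   # percent, GCP anomaly detection
claude_cost, gpt4o_cost = 0.87, 1.12             # USD per 1,000, as reported
claude_rps, gpt4o_rps = 10.7, 12.7               # requests per second

precision_gap = claude_precision - gpt4o_precision            # percentage points
cost_saving = (gpt4o_cost - claude_cost) / gpt4o_cost * 100   # percent cheaper
speedup = (gpt4o_rps - claude_rps) / claude_rps * 100         # percent faster

print(f"precision gap: {precision_gap:.1f} points")
print(f"Claude cost saving: {cost_saving:.1f}%")
print(f"GPT-4o throughput advantage: {speedup:.1f}%")
```

The throughput advantage works out to roughly 18.7%, which is consistent with the reported 18% if that figure was truncated rather than rounded.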

OpenAI claims ChatGPT’s new default model hallucinates way less

OpenAI released GPT-5.5 Instant as ChatGPT's new default model, claiming it produces 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts in medicine, law, and finance, based on internal evaluations. The company also says it reduced inaccurate claims by 37.3% on conversatio...

Model Releases The Verge - AI

datasette-llm 0.1a7

datasette-llm 0.1a7 adds a configuration mechanism for setting default options on specific LLM models, allowing users to define defaults such as model selection and temperature for enrichment operations within Datasette.

Open Source Tools Simon Willison

llm-echo 0.5a0

Simon Willison released llm-echo 0.5a0, a plugin for the LLM tool that provides a fake "echo" model for automated testing. The update adds a `-o thinking 1` option that simulates a reasoning block, compatible with LLM 0.32a0 and higher.

Open Source Tools Simon Willison

ChatGPT vs Claude vs Gemini: Which AI Is Actually Worth Using in 2026?

A 2026 comparison of ChatGPT, Claude, and Gemini found ChatGPT favored for general writing and coding, Claude preferred for nuanced editorial content and code explanation, and Gemini rated most reliable for research due to its Google Search integration.

Model Releases Dev.to - Claude

Our AI started a cafe in Stockholm

Andon Labs deployed an AI system called Mona to manage a Stockholm cafe, following a prior experiment in San Francisco. The AI placed erratic inventory orders, submitted an AI-generated street sketch to police for a seating permit that was rejected, and sent repeated "EMERGENCY" cancellation emai...

Opinion & Analysis Simon Willison

AI agents need to spend money — Stripe and iWallet are building the rails

Stripe and Tempo jointly released the Machine Payments Protocol (MPP) for programmatic transactions by AI agents, while fintech startup iWallet proposed an Autonomous Settlement Protocol (ASP) for event-triggered multi-party settlements. Both protocols address the gap in existing payment infrastr...

Industry & Funding The New Stack

Memory as a Sixth Sense

A developer essay argues that AI memory should be understood as active perception rather than passive storage, contending that AI systems without persistent memory lack the ability to detect patterns across time and provide contextual continuity across conversations.

Opinion & Analysis Dev.to - AI

Welcome to Maintainer Month: Celebrating the people behind the code

GitHub launched its sixth annual Maintainer Month, announcing new tools including granular pull request limits for unknown contributors and pull request archiving to remove spam. The releases follow GitHub data showing merged pull requests have nearly doubled year over year, with AI-generated con...

Open Source Tools GitHub Blog