// edition · 2026-04-25

April 25, 2026

28 stories on AI dev tools, agents, and the coding stack, curated from the day's RSS haul by Agentic Dev's pipeline.

Top Signal · CLI Agents

Your AI agent already writes every session to disk. Why isn't it reading its own archive?

A developer built `claude-recall`, a tool that indexes Claude Code's JSONL session archives into SQLite with FTS5 full-text search and injects relevant prior sessions into new prompts via a `UserPromptSubmit` hook. The tool optionally uses a local Ollama embedding model for semantic reranking, wi...

Dev.to - Claude
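
The indexing idea in this item can be sketched in a few lines. This is a minimal illustration, not `claude-recall`'s actual code: the record fields (`session_id`, `text`), the table layout, and the function names are assumptions, and it requires a Python build whose SQLite includes FTS5.

```python
import json
import sqlite3

# Hypothetical session records; the real Claude Code JSONL schema is not shown in the source.
SAMPLE_LINES = [
    json.dumps({"session_id": "s1", "text": "ran the database migration and fixed the failing index"}),
    json.dumps({"session_id": "s2", "text": "refactored the login form validation"}),
]

def build_index(jsonl_lines):
    """Load JSONL records into an in-memory SQLite FTS5 table."""
    db = sqlite3.connect(":memory:")
    # session_id is stored but not tokenized; only `text` is searchable.
    db.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id UNINDEXED, text)")
    for line in jsonl_lines:
        record = json.loads(line)
        db.execute(
            "INSERT INTO sessions (session_id, text) VALUES (?, ?)",
            (record["session_id"], record["text"]),
        )
    db.commit()
    return db

def recall(db, query, limit=3):
    """Return the best-matching sessions for a plain-word query, best first."""
    return db.execute(
        "SELECT session_id, text FROM sessions WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

A hook could call something like `recall(db, "migration")` and prepend the returned text to the new prompt; how the real tool formats that injection is not covered here.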

Tool Updates

Prompt Caching in 2026: Anthropic vs OpenAI vs Gemini for Production Apps

Anthropic, OpenAI, and Google Gemini each offer prompt caching with differing TTLs, pricing, and invalidation rules; Anthropic's implementation uses explicit cache_control breakpoints with 5-minute or 1-hour TTLs, reducing a 200,000-token prompt from roughly $0.60 to $0.06–$0.08 per request. At 1...

Workflows & Tips Dev.to - AI
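
The arithmetic behind those figures can be reproduced with a short sketch. The $3 per million input tokens is simply what $0.60 for 200,000 tokens implies; the 0.1x cache-read multiplier is an assumption chosen to match the lower bound, and the helper names are hypothetical. Cache-write surcharges are ignored.

```python
def cached_system_block(text, ttl=None):
    """Build a system content block carrying a cache_control breakpoint."""
    cache_control = {"type": "ephemeral"}
    if ttl is not None:
        cache_control["ttl"] = ttl  # for example "1h"; omitted means the default TTL
    return {"type": "text", "text": text, "cache_control": cache_control}

def prompt_cost(tokens, price_per_mtok, cached_fraction=0.0, read_multiplier=0.1):
    """Input cost in dollars when a fraction of the prompt is read from cache.

    read_multiplier is an assumed discount for cached tokens, not a quoted price.
    """
    cached = tokens * cached_fraction
    fresh = tokens - cached
    return (fresh * price_per_mtok + cached * price_per_mtok * read_multiplier) / 1_000_000

uncached = prompt_cost(200_000, 3.0)                          # about $0.60
fully_cached = prompt_cost(200_000, 3.0, cached_fraction=1.0)  # about $0.06
```

Under these assumptions, any uncached suffix pushes the per-request cost above the $0.06 floor, which is consistent with the quoted range.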

Four failure modes you'll hit running a local LLM in a multi-step agentic loop

A developer testing seven local LLMs across two local inference servers documented four failure modes that occur in multi-step agentic loops using MCP tool calls, including infinite tool-call repetition where models fail to recognize task completion.

Agent Engineering Dev.to - Claude
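
One way to catch the repetition failure mode is a guard in the agent loop that counts identical tool calls. This is a hypothetical mitigation sketch, not something taken from the article; the class name and the threshold of three are assumptions.

```python
import json
from collections import Counter

class RepeatGuard:
    """Flags a loop when the same tool is called with the same arguments too often."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def should_stop(self, tool_name, arguments):
        # Serialize arguments with sorted keys so key order does not hide a repeat.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        self.seen[key] += 1
        return self.seen[key] >= self.max_repeats
```

The loop would call `should_stop` before executing each tool call and break out, or ask the model to summarize, once it returns `True`.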

Stop Generating AI Slop: The Ultimate Workflow for Coding with Claude Code

A developer published a three-stage workflow for Anthropic's Claude Code that requires the model to produce written research and implementation plans in Markdown files before it generates any code. The approach separates analysis, planning, and execution to reduce unreviewed code output.

CLI Agents Dev.to - Claude

Why Claude needs a real environment to validate cloud-native code

Boris Cherny, creator of Claude Code, stated that giving Claude a way to verify its own work produces 2-3x better results, calling it more important than ever with the Opus 4.7 release. OpenAI Codex, GitHub Copilot, and Cursor have each shipped self-validation loops in the past six months as a co...

Agent Engineering The New Stack

Multi-Agent vs Single-Agent Architecture in 2026: When the Crew Beats the Soloist

A developer describes building three multi-agent LLM systems in 2024, finding two would have performed better as single-agent systems with multiple tools. The article outlines four multi-agent patterns — sequential pipeline, specialist crew, debate loop, and shared-state swarm — and argues single...

Agent Engineering Dev.to - AI

Cost-engineering an "AI Generate" button in a freemium product (from $0.08 to $0.029 per click)

A developer building a coding interview prep app called Crackly reduced the per-click API cost of an AI visualization feature from $0.08 to $0.029 by implementing tiered call paths, prompt caching, output token caps, a cheaper gatekeeper model, and a Groq fallback, while routing free-tier users t...

Workflows & Tips Dev.to - Claude

Cursor and Chainguard partner to lock down the AI agent supply chain

Cursor and Chainguard announced a partnership that integrates Chainguard's hardened container images and verified artifact catalog into Cursor's AI coding workflow. When Cursor's agents resolve dependencies, they can now pull from Chainguard's catalog instead of public registries such as PyPI, np...

Agentic IDEs The New Stack

Structured Outputs in 2026: Function Calling, JSON Mode, and the Schema Wars

As of 2026, LLM providers offer three distinct structured output methods: JSON mode (syntax validation only), function calling (soft schema constraints), and schema-constrained generation (hard token-level enforcement that prevents schema violations). OpenAI, among other providers, offers strict ...

Agent Engineering Dev.to - AI
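
The gap between syntax validation and schema enforcement is easy to show: output can parse as JSON and still break the contract. The sketch below uses a hypothetical two-field schema and a hand-rolled checker as a stand-in for real schema validation; it does not call any provider API.

```python
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"name": str, "priority": int}

def violations(obj, schema):
    """List the ways a parsed object departs from the schema."""
    problems = []
    for field, expected in schema.items():
        if field not in obj:
            problems.append(f"missing field: {field}")
        elif not isinstance(obj[field], expected):
            problems.append(f"wrong type for {field}")
    return problems

# Syntactically valid JSON, so JSON mode alone would accept it.
raw = '{"name": "fix login", "priority": "high"}'
parsed = json.loads(raw)
problems = violations(parsed, SCHEMA)  # priority is a string, not an int
```

With schema-constrained generation this post-hoc check would be redundant for type errors, since the violation could not be emitted in the first place.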

How I Stopped My AI Agent From Reinventing the Wheel

A developer built an OpenClaw plugin called "openclaw-skill-hunter" that instructs AI agents to search for existing tools before generating custom code. In a 150-task test, the developer found 40% of tasks involved reimplementing functionality already available in existing tools.

Agent Engineering Dev.to - Claude

Claude Haiku 4 API: The Budget Developer's Guide to Production-Grade AI

Anthropic's Claude Haiku 4 is priced at $1 per million input tokens and $5 per million output tokens, making it 5x cheaper than Opus 4.7. It scores 78.2% on MMLU and 72.5% on HumanEval, but trails Opus by 36 percentage points on vision benchmarks.

Pricing & Plans Dev.to - Claude

How to Build Your First AI Agent in 2026: A Practical Guide

A beginner-oriented Dev.to tutorial walks through the practical steps of building a basic AI agent in 2026.

Workflows & Tips Dev.to - AI

Free AI Tools That Replace $500/Month Subscriptions in 2025

Several free AI tools offer alternatives to paid services in 2025, including Google Gemini (1,500 requests/day), Groq (14,400 requests/day), Hugging Face (30,000+ models), Stable Diffusion, and Ollama.

Pricing & Plans Dev.to - AI

What Is Mascot Engine? A Practical System for Building Interactive AI Mascots in Real Products

Mascot Engine is a framework for embedding interactive animated mascots into Web, Flutter, and Unity applications, using Rive state machines to tie character animations to application states and AI service responses. The system combines vector character assets, state-driven animation, and integra...

Agent Engineering Dev.to - AI

Ecosystem

Your .claude/ Directory Is Now a Supply Chain Target

The @bitwarden/cli npm package version 2026.4.0, compromised on April 22, 2026, contained malware that specifically targeted AI coding tool credentials from six tools including Claude Code, Gemini CLI, and Codex CLI, according to JFrog security researchers. The malware stole authentication files ...

Industry & Funding Dev.to - AI

GPT-5.5 prompting guide

OpenAI released GPT-5.5 via its API alongside a prompting guide that advises developers to treat it as a new model family rather than a drop-in replacement for gpt-5.2 or gpt-5.4. The guide recommends starting with a minimal prompt baseline and retuning reasoning effort, verbosity, and output for...

Model Releases Simon Willison

GPT-5.5 vs Claude Opus 4.7 vs Gemini 3.1 Pro: The Frontier Model Showdown

A benchmark comparison of GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro found split results: GPT-5.5 led Terminal-Bench 2.0 at 82.7%, Opus 4.7 led SWE-Bench Pro at 64.3% and MCP-Atlas tool-use at 77.3%, and Gemini 3.1 Pro led ARC-AGI-2 abstract reasoning at 77.1%.

Model Releases Dev.to - Claude

GPT 5.5 on AI Gateway

OpenAI's GPT-5.5 and GPT-5.5 Pro models are now accessible through Vercel's AI Gateway, available via the identifiers `openai/gpt-5.5` and `openai/gpt-5.5-pro` in the AI SDK. Both variants target long-running agentic tasks and are described as more token-efficient than the previous generation.

Model Releases Vercel Blog

llm 0.31

Simon Willison released version 0.31 of his open-source `llm` CLI tool, adding support for OpenAI's GPT-5.5 model, a verbosity level option for GPT-5+ models, and an image detail level parameter for image attachments.

Open Source Tools Simon Willison

“Mythos-like hacking, open to all”: Industry reacts to OpenAI’s GPT 5.5

OpenAI released GPT-5.5 and GPT-5.5 Pro, general-purpose models with claimed improvements in coding and reasoning. Early testing by developer Simon Willison found the model performed below GPT-5.4 on default settings, improving only when given higher reasoning effort at the cost of increased toke...

Model Releases The New Stack

How I got my AI agents to communicate across repos — and shipped SAMP doing it

A developer released SAMP (Simple Agent Message Protocol) and a reference implementation called "agent-message," enabling AI coding agents to pass messages across separate repository sessions using append-only JSONL log files with no servers or daemons. The system uses content-addressed message I...

Open Source Tools Dev.to - Claude
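
The general pattern of an append-only JSONL log with content-derived identifiers can be sketched as follows. This is not SAMP's wire format: the field names, the choice of SHA-256, and the function names are all assumptions made for illustration.

```python
import hashlib
import json

def message_id(message):
    """Derive an ID from the message content via canonical JSON and SHA-256 (assumed scheme)."""
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def append_message(log_path, sender, body):
    """Append one message as a single JSON line; existing lines are never rewritten."""
    message = {"sender": sender, "body": body}
    record = {"id": message_id(message), **message}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record["id"]

def read_messages(log_path):
    """Read every message back in the order it was appended."""
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

Because the ID depends only on content, two agents that write the same message produce the same ID, which makes duplicates easy to detect without a coordinating server.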

DeepSeek previews new AI model that ‘closes the gap’ with frontier models

DeepSeek previewed new AI models it says outperform DeepSeek V3.2 in efficiency and performance, citing architectural improvements. The company claims the models have nearly matched leading open and closed models on reasoning benchmarks.

Model Releases TechCrunch - AI

Mistral’s Leanstral wants to kill off human-in-the-loop code checks, but is it blowing in the wind?

Mistral AI launched Leanstral in March, an open-source code agent that uses formal verification via the Lean 4 programming language to mathematically prove code correctness. The model uses a Mixture-of-Experts architecture with 119 billion total parameters and 6.5 billion active parameters, relea...

Open Source Tools The New Stack

Cancelled Claude AI Agent: My 4 Reasons For The Switch

A developer discontinued use of Anthropic's Claude models across production systems, citing declining output quality, higher token costs, inconsistent API latency, and reduced tool-call reliability in claude-3-sonnet-20240229. Specific degradation included a trading system's false-positive sell s...

Opinion & Analysis Dev.to - Claude

The Hidden Debt in AI-Assisted Code (And How to Stop Accumulating It)

Developers using AI coding assistants risk accumulating "AI debt" — functional but poorly understood code that becomes difficult to maintain when requirements change or bugs emerge. Proposed mitigations include narrowing request scope per session, reviewing generated code for comprehension rather...

Opinion & Analysis Dev.to - Claude

BuyWhere MCP: Give Claude Desktop Live Singapore Prices in 2 Minutes

BuyWhere launched an MCP server that connects Claude Desktop to live retail pricing data from 20+ Singapore retailers, including Harvey Norman, Shopee, and Lazada, covering over 1,000 products. The free tier allows 500 API requests per month.

MCP & Integrations Dev.to - Claude

Vectors gave us AI search, tensors are going to make it smarter

Tensors, which are multi-dimensional extensions of vectors, can improve AI search by enabling better relevance ranking and multimodal retrieval compared to standard one-dimensional vector embeddings. Unlike vectors, tensors can represent information along multiple axes, allowing search systems to...

Opinion & Analysis The New Stack

Apple’s new CEO, and why Elon Musk wants to buy Cursor for $60B

Apple CEO Tim Cook plans to step down in September, with hardware chief John Ternus set to succeed him. Separately, Elon Musk has reportedly expressed interest in acquiring AI code editor Cursor at a $60 billion valuation.

Industry & Funding TechCrunch - AI
