CLI Agents — Agentic Dev | AI Dev Tools News

How to write a Claude Code skill (and the gotchas the docs skip)

Claude Code skills are markdown files stored in a folder at `~/.claude/skills/<name>/SKILL.md`, requiring only a name and description in YAML frontmatter to function. The description field acts as the trigger Claude matches against user requests to determine which skill to activate.

Dev.to - Claude · 2026-06-02

Harness: Turn a One-Line Prompt Into a Full Agent Team for Claude Code

Harness is a Claude Code plugin that generates multi-agent team scaffolding — including agent definitions, skill files, and orchestration logic — from a single plain-English prompt. It selects from six architecture patterns and is available via the Claude Code plugin marketplace or GitHub at revf...

Dev.to - Claude · 2026-06-02

Pasted File Editor

Simon Willison built a browser-based "Pasted File Editor" tool using Codex desktop, modeled after Claude.ai's feature that converts large text pastes into file attachments. The tool supports direct file opening, image thumbnails, and drag-and-drop onto a textarea.

Simon Willison · 2026-06-02

Claude Code Dynamic Workflows: A Hands-On Guide for Developers (2026)

Anthropic added dynamic workflows to Claude Code on May 28, 2026, allowing Claude to write JavaScript scripts that orchestrate up to 1,000 subagents on a single task while running in the background independent of the user's active session.

Dev.to - Claude · 2026-06-01

Claude Code Commands Beginner’s Handbook

A developer published a reference guide covering CLI commands, flags, and in-session slash commands for Anthropic's Claude Code tool, which is installed via npm as @anthropic-ai/claude-code. The guide covers session management, one-shot mode, piping file contents, authentication, and in-prompt sy...

Dev.to - Claude · 2026-06-01

The best Claude Code agents are defined by what they refuse to do

A developer published a method for writing Claude Code subagents centered on explicit "refusal lists" — instructions defining what the agent must not do — arguing these constraints prevent LLMs from producing bloated, unfocused output. The approach is illustrated with a pre-merge diff checker tha...

Dev.to - Claude · 2026-05-31

How Braintrust turns customer requests into code with Codex

Braintrust engineers use OpenAI's Codex with GPT-4.5 to convert customer requests into code and run experiments faster, according to a case study published by OpenAI.

OpenAI Blog · 2026-05-30

How a Claude Code Plugin Racked Up 200K GitHub Stars — What ECC Teaches Us About AI Coding in 2026

Developer Affaan Mustafa open-sourced "Everything Claude Code" (ECC), a plugin for Claude Code containing 63 specialized agents, 249 skills, and 79 command shims, which accumulated approximately 200,000 GitHub stars. ECC originated from a workflow Mustafa built during an Anthropic and Forum Ventu...

Dev.to - Claude · 2026-05-29

3 weeks, 0 Rust, 1 shipped app: what worked with Claude Code for a C++ dev.

A C++ developer with no prior Rust experience built and shipped a desktop photo editor in three weeks using Tauri v2, ONNX Runtime with CUDA, and four ML models, relying on Claude Code for code generation. The developer also ported an IAT exposure-correction model to ONNX format and published it ...

Dev.to - Claude · 2026-05-29

Claude Code Slash Commands You Should Know (I wasn't either)

Claude Code includes slash commands for session and context management, including /resume to continue prior sessions, /branch to fork conversations, /diff to review changes, /compact to compress context, and /security-review to audit code before deployment.

Dev.to - Claude · 2026-05-28

Getting Started with Claude Code: Your First AI Coding Partner

Anthropic's Claude Code is a command-line interface for AI-assisted software development that reads and writes files, executes commands, manages Git workflows, and reasons across up to 1 million tokens of codebase context. It runs on Claude Sonnet 4.6 by default for Pro users and Opus 4.6 on Max ...

Dev.to - Claude · 2026-05-28

Building OpenCode with Dax Raad

OpenCode, an open-source AI coding tool co-founded by Dax Raad, grew from approximately 650,000 to nearly 8 million monthly active users within a few months, alongside nearly 1 million daily active users. After Anthropic blocked integration with Claude Code, OpenCode pursued partnerships with Ope...

Pragmatic Engineer · 2026-05-28

I built a CLI that scaffolds agentic workflows for Claude Code

A developer released AgentKit, a CLI tool published as @patricksardinha/agentkit-cli on npm, that generates markdown orchestration files to structure multi-agent workflows for Anthropic's Claude Code. The tool requires no API key and works by reading a plain-language project blueprint to produce ...

Dev.to - Claude · 2026-05-28

How Claude Code Thinks: Inside Your AI Coding Assistant

Anthropic's Claude Code processes code as text through tokenization and pattern matching, without executing it. Current models include Claude Sonnet 4.6 and Opus 4.6/4.7 with 1M-token context windows, and Claude Haiku 4.5 with 200K tokens; the Claude 3 Haiku model has been retired.

Dev.to - Claude · 2026-05-28

Building self-improving tax agents with Codex

OpenAI, Thrive, and Crete built a tax agent using Codex that automates tax filings and incorporates self-improvement mechanisms to increase accuracy and speed up workflows.

OpenAI Blog · 2026-05-28

10 Claude Code Skills That Actually Work (Free Download)

A developer published a free collection of 50 reusable Claude Code prompts, sharing 10 examples covering tasks such as automated commit message generation, code review with severity tagging, bug investigation, and test generation.

Dev.to - Claude · 2026-05-27

I got tired of Claude Code asking me to explain my project architecture every morning — so I built this

A developer built Pandaibesy, an offline Python CLI tool that stores and retrieves project decisions across Claude Code sessions using three commands: capture, query, and mcp-pull. The tool requires no API keys or installation beyond cloning the repository and runs on Python 3.13, including on An...

Dev.to - Claude · 2026-05-27

Claude Code Video Skills: A Developer's Practical Guide to All 6 Options (2026)

Claude Code supports six video generation integrations — Remotion, HeyGen, inference.sh, Pexo, Higgsfield, and digitalsamba's Video Toolkit — each with distinct architectures ranging from React-to-MP4 rendering to AI avatar generation and multi-model inference gateways. Remotion leads with over 1...

Dev.to - Claude · 2026-05-27

How I manage 40+ skills across Claude Code, Codex, and .agents folders

A developer built a Go CLI tool to manage selective loading of 40+ skills across AI coding agents including Claude Code and Codex, after global skill directories caused irrelevant tool suggestions across different project contexts. Prior fixes including manual file moves and shell aliases failed ...

Dev.to - Claude · 2026-05-26

Vibe Coding: My Daily Workflow with Claude Code

A developer published their workflow for using Claude Code in AI-assisted ("vibe") coding, describing a hands-on approach where they write project plans independently before consulting the model, use separate context files to reduce token usage, and validate AI-suggested bug diagnoses before allo...

Dev.to - Claude · 2026-05-26

50 Claude Code Skills That Paid Developers Don't Talk About

A developer published a collection of Claude Code CLI prompt templates covering tasks such as generating conventional commit messages, pre-push code review, codebase mapping, refactoring planning, and root cause analysis. A full pack of 50 prompts spanning git, security, DevOps, and architecture ...

Dev.to - Claude · 2026-05-25

Your Claude Code hooks probably fail open — here's why that's dangerous

Claude Code hooks that swallow exceptions and exit with code 0 default to allowing commands when they fail, a pattern the author calls "failing open." A npm package called `claude-hook-guard` was published to enforce fail-closed behavior, adding audit logging, a 5-second timeout, and a bypass tok...

Dev.to - Claude · 2026-05-25

How to use Claude in vscode?

A tutorial describes how to configure the Claude Code VSCode extension to use Zhipu AI's Anthropic-compatible API endpoint by creating a config.json file in the ~/.claude directory and setting ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN environment variables in VSCode settings.

Dev.to - Claude · 2026-05-25

How I Indexed 2,000 Claude Code Skills (And What the Install Data Says About AI Coding in 2026)

A developer built a searchable index of 1,998 public Claude Code skills at orangebot.ai/skills, pulling install data from the skills.sh registry and sorting entries by weekly install volume. The catalog uses a static JSON file updated by a daily Python scraper, served via Next.js on Firebase App ...

Dev.to - Claude · 2026-05-24

I reproduced a Claude Code RCE. The bug pattern is everywhere.

Security researcher Joernchen discovered a remote code execution vulnerability in Claude Code version 2.1.118, which Anthropic has since patched. The flaw stemmed from a parsing anti-pattern that the author argues is common across AI developer tools.

Dev.to - Claude · 2026-05-23

I read the X (Twitter) algorithm source for 4 days and built a Claude Code sub-agent that scores drafts before posting

A developer analyzed Twitter's open-source recommendation algorithm repositories and built a Claude Code sub-agent to score tweet drafts pre-posting. The algorithm's heaviest positive signal is author-engaged replies (+75), while mutes or blocks carry a -74 penalty, nearly canceling 150 likes.

Dev.to - Claude · 2026-05-23

How Virgin Atlantic ships faster with Codex

Virgin Atlantic used OpenAI's Codex to ship a revamped mobile app on a fixed holiday travel deadline, achieving near-total unit test coverage and zero P1 defects in the release.

OpenAI Blog · 2026-05-23

Writing Your Own Claude Code Skill in 2026: The Practical Guide

Claude Code supports a custom skills system where users create SKILL.md files in a plugin directory to store reusable instructions that trigger automatically based on user prompts. Each skill consists of a directory with a SKILL.md file containing frontmatter metadata and a body with instructions.

Dev.to - Claude · 2026-05-22

How Ramp engineers accelerate code review with Codex

Ramp engineers use OpenAI's Codex with GPT-4.5 to conduct code reviews, reducing feedback time from hours to minutes. The company has integrated the tool into its development workflow to automate substantive code analysis.

OpenAI Blog · 2026-05-21

I built nvm-style account manager for Claude Code

A developer released "claudenv," a shell-based account manager for Claude Code that uses the `CLAUDE_CONFIG_DIR` environment variable to switch between multiple Claude accounts. It supports global defaults, per-project `.claudenvrc` files, and auto-switching on directory change, requiring only ba...

Dev.to - Claude · 2026-05-21

How to use Claude Code like you’ve used it for a year

A developer with nearly a year of Claude Code experience published a guide covering session management techniques, including when to use /compact versus /clear, how subagents protect main context, and why hooks are more reliable than memory for consistent behavior.

Dev.to - Claude · 2026-05-20

Agentic app coding gets an upgrade with Google’s release of Android CLI

Google released Android CLI tools compatible with AI coding agents such as Claude Code and OpenAI's Codex, enabling developers to build Android apps from the command line using automated agents.

TechCrunch - AI · 2026-05-20

Claude Code a Fondo: La Guía Definitiva para Multiplicar tu Productividad como Desarrollador

A developer guide published on Dev.to outlines Claude Code's extensibility features, including CLAUDE.md configuration files, custom slash commands, hooks, subagents, and MCP plugins. The guide describes how each layer functions and when to use it within a development workflow.

Dev.to - Claude · 2026-05-19

Claude Code Deep Dive: The Definitive Guide to 10x Your Developer Productivity

Claude Code, Anthropic's AI coding assistant, supports a layered extensibility system including CLAUDE.md project configuration files, custom slash commands, hooks, subagents, and MCP server integrations. A developer guide outlines how these features can be combined to automate code review, testi...

Dev.to - Claude · 2026-05-19

The Claude Code Plugin Marketplace: How Skills, MCP Servers, and Plugins Actually Fit Together in 2026

Anthropic's Claude Code plugin marketplace, as of May 2026, organizes extensions into four components: plugins (the distribution format), skills (Markdown-based prompt instructions), MCP servers (external data layer processes), and hooks (shell commands triggered by lifecycle events). The officia...

Dev.to - Claude · 2026-05-19

Cómo Reconstruimos la Identidad Visual de Guayoyo Tech Usando Skills de Diseño para Agentes de IA

Guayoyo Tech rebuilt its website visual identity in under 48 hours using Claude Code as the design agent, modifying 62 files across 30 commits. The redesign replaced a generic cyan/Inter aesthetic with an amber palette, three-font editorial typography, asymmetric layouts, and dark/light mode sema...

Dev.to - Claude · 2026-05-19

Claude Code Skills Pack — 15 Production Skills, 5 Hooks, 7 Slash Commands ($99 lifetime)

A developer released a $99 bundle of Claude Code configuration assets, including 15 workflow skills, 5 automation hooks, 7 slash commands, and a CLAUDE.md template, sold as a single-developer lifetime license with a 7-day refund policy.

Dev.to - Claude · 2026-05-19

Two Multi-Account Claude Code Architectures: One Anthropic Accepts, One They Ban

Anthropic permits running multiple Claude Code accounts via separate config directories (CLAUDE_CONFIG_DIR), acknowledged in GitHub issue #261, but bans relay-server tools like Wei-Shaw/claude-relay-service that pool OAuth tokens behind a proxy endpoint to distribute requests across subscription ...

Dev.to - Claude · 2026-05-17

The Claude Code Regression Rerouted My Flutter Workflow. The 4-Tool AI Stack I Use Now.

Anthropic's April 23 postmortem disclosed three bugs in Claude Code that degraded output quality over six weeks, including a reasoning downgrade from high to medium, a caching defect that pruned chain-of-thought mid-session, and a verbosity instruction linked to a 3% eval drop.

Dev.to - Claude · 2026-05-17

Claude Code VSCode Notifications Are Broken — Here's My Windows Workaround

A developer documented a workaround for broken notifications in the Claude Code VSCode extension (v2.1.143) on Windows 11, using Claude Code's built-in hook system to trigger PowerShell scripts that play sounds and display alerts when processing completes or a confirmation prompt appears.

Dev.to - Claude · 2026-05-17

Optimizing your Claude Code usage (and spending less $$)

TokenJam released a feature called "tj optimize" that reads Claude Code's local JSONL session logs into a DuckDB database, identifies sessions that could use smaller models, and projects monthly API spending against a user-defined budget.

Dev.to - Claude · 2026-05-16

Code Quest: A Claude Code Web UI That Runs in Interactive Mode — Just in Time for the June 15 Billing Change

A developer released code-quest, an open-source web UI for Anthropic's Claude Code CLI that runs in interactive mode via a three-tier WebSocket architecture. The project notes that starting June 15, 2026, Anthropic will bill `claude -p` and Agent SDK usage from separate monthly credits rather tha...

Dev.to - AI · 2026-05-16

Why every Claude Code-built site looks the same — and the image layer that breaks it

A developer released a Claude Code skill that calls OpenAI's gpt-image-2 via Codex CLI to generate project-specific images, aiming to reduce the visual uniformity common to AI-built sites using default Tailwind, shadcn/ui, and Lucide icon stacks. The tool reads a DESIGN.md file and triggers on na...

Dev.to - Claude · 2026-05-16

How data science teams use Codex

OpenAI published guidance on how data science teams can use Codex to automate analytical outputs including root-cause briefs, KPI memos, impact readouts, scoped analyses, and dashboard specifications from existing work inputs.

OpenAI Blog · 2026-05-16

How business operations teams use Codex

OpenAI published a guide showing how business operations teams can use Codex to generate documents such as initiative briefs, strategy updates, leadership decision packets, and progress updates from existing work inputs.

OpenAI Blog · 2026-05-16

2026 Best Claude Code Skills: Top 8 Skills, Setup Guide & Use Cases

Claude Code Skills are folders containing SKILL.md files that provide Claude Code with reusable workflow instructions, loading approximately 100 tokens per skill at session start and expanding only when triggered. A Dev.to guide outlines eight such skills for 2026, including Frontend Design, Play...

Dev.to - Claude · 2026-05-15

Codex Now Works from Your Phone — Plus Hooks and CI/CD Tokens

Dev.to - Claude · 2026-05-15

OpenAI says Codex is coming to your phone

OpenAI announced that Codex, its AI-powered coding tool, will be made available on mobile devices, giving users the ability to manage coding workflows from their phones.

TechCrunch - AI · 2026-05-15

OpenAI’s Codex is now in the ChatGPT mobile app

OpenAI is adding Codex, its AI coding assistant, to the ChatGPT mobile app on iOS and Android. The move follows a recent Codex update that enabled it to operate apps on macOS, as OpenAI competes with Anthropic's Claude Code.

The Verge - AI · 2026-05-15

OpenAI brings Codex to the ChatGPT mobile app

OpenAI added Codex to the ChatGPT mobile app on iOS and Android, where it connects to a desktop machine running Codex via a relay layer rather than operating as a standalone product. The feature is rolling out to all Codex users, including free and Go plan subscribers, with macOS support only for...

The New Stack · 2026-05-15

Work with Codex from anywhere

OpenAI added Codex access to the ChatGPT mobile app, allowing users to monitor, steer, and approve coding tasks remotely across devices and remote environments.

OpenAI Blog · 2026-05-15

Agent Poke: Scheduled Check-ins for Codex and Claude Code

A developer released "agent-poke," an open-source Docker-based tool that sends scheduled "Hey!" messages to OpenAI Codex CLI and Claude Code at four fixed times daily to trigger subscription usage windows. The tool uses cron jobs and requires manual login via each CLI's official auth flow.

Dev.to - Claude · 2026-05-15

Sea's View on the Future of Agentic Software Development with Codex

Sea Limited is deploying OpenAI's Codex across its engineering teams in Asia, according to the company's Chief Product Officer. The move aims to accelerate AI-native software development across Sea's operations.

OpenAI Blog · 2026-05-15

Claude Code Ultraplan: Cloud-Based AI Planning in 2026 — A Hands-On Tutorial

Anthropic's Claude Code Ultraplan, described as a research preview, separates the planning phase from code execution by offloading plan drafting to a cloud session, allowing users to review and comment on plans in a browser before execution. The feature requires Claude Code v2.1.91 or later and i...

Dev.to - Claude · 2026-05-14

⚽️ Claude Code Isn’t the Only Game in Town

Several AI coding agents compete with Anthropic's Claude Code, including OpenAI's Codex, which offers built-in browser access and cloud environments, and openCode, an open-source alternative. Most offer free tiers, and the tools vary in form factor between CLI, TUI, and full applications.

Dev.to - Claude · 2026-05-14

Anthropic’s Claude Code agent view is a better dashboard. So why aren’t developers convinced?

Anthropic released an "agent view" dashboard for Claude Code that lets developers monitor and manage multiple AI coding sessions from a single CLI interface, showing session status and enabling inline replies. Developer reactions are mixed, with some welcoming the centralized view while others ar...

The New Stack · 2026-05-14

Building a safe, effective sandbox to enable Codex on Windows

OpenAI built a secure sandbox for its Codex coding agent on Windows, implementing controlled file access and network restrictions to allow safe execution of automated coding tasks.

OpenAI Blog · 2026-05-14

How finance teams use Codex

OpenAI published a guide describing how finance teams can use its Codex coding assistant to automate tasks including monthly business reviews, reporting packs, variance bridges, model checks, and planning scenarios.

OpenAI Blog · 2026-05-13

7 Claude Code Routines That Actually Save Me Hours Each Week

Claude Code's Routines feature lets users configure automated cloud-based jobs that run on schedules, API triggers, or GitHub events without requiring a local machine. Usage limits are 5 routines per day on Pro, 15 on Max, and 25 on Team and Enterprise plans.

Dev.to - Claude · 2026-05-13

Claude Code Dreaming - What /dream Actually Does for Your Memory

Anthropic's Claude Code includes a memory consolidation feature called Dreaming, which reads project memory files, removes stale entries, and merges duplicates into a condensed version. It runs automatically after roughly 24 hours and five sessions of inactivity, or can be triggered manually with...

Dev.to - Claude · 2026-05-13

You Can Probably Use Claude Code for Free at Work

Claude Code can be configured to route requests through Microsoft Azure AI Foundry by setting environment variables and authenticating via Azure CLI, allowing users whose employers already have Claude models deployed on Foundry to avoid a $17/month personal subscription fee.

Dev.to - Claude · 2026-05-13

How NVIDIA engineers and researchers build with Codex

NVIDIA engineers and researchers use OpenAI's Codex with GPT-5.5 to build production systems and conduct research experiments. The collaboration covers both software development and experimental research workflows.

OpenAI Blog · 2026-05-13

OpenAI Codex vs Claude Code: Hands-On Python Benchmark for Devs

A benchmark pitting OpenAI Codex against Anthropic's Claude Code on identical Python tasks found Claude Code completed refactoring in roughly four minutes versus Codex's seven, and produced cleaner bug fixes on first attempts. Codex generated more extensive refactors with larger diffs; both tools...

Dev.to - AI · 2026-05-12

I built a CLI to view your effective Claude Code config across all 4 scopes

A developer released `cc-config-viewer`, a CLI tool that displays the effective Claude Code configuration across all four scopes (Managed, User, Project, Local) for the current session. It runs without installation via `npx cc-config-viewer@latest` and uses the official Claude Code JSON Schema.

Dev.to - Claude · 2026-05-12

I Built a Skin System for Claude Code — Here's How It Works

A developer built a skin system for Claude Code that adds nine visual themes to the terminal interface, each with custom colors, ASCII banners, tool sounds, and narration styles. The system runs on bash using Claude Code's SessionStart, SessionEnd, and PostToolUse lifecycle hooks with YAML config...

Dev.to - Claude · 2026-05-12

🚀 I built askdiff — a Claude Code skill that lets you ask questions to the same session that wrote the code

A developer released "askdiff," an open-source NPM package and Claude Code skill that opens a diff viewer in the browser linked to the same Claude Code session that wrote the code. It is installable via `npx askdiff install-skill` and requires no Anthropic API key.

Dev.to - Claude · 2026-05-12

Why 157,000 developers are hedging against Anthropic with OpenCode

SST's OpenCode, an open-source coding agent, has accumulated 157,000 GitHub stars — surpassing Anthropic's own Claude Code repository at 122,000 — after Anthropic blocked third-party OAuth authentication to Claude Pro and Max subscriptions in January 2026 without advance notice.

The New Stack · 2026-05-11

Claude Code Source Analysis Series, Chapter 4: Context Management

A developer series analyzing Claude Code's source code covers context management in its fourth installment, explaining that the coding agent rebuilds its full model request each turn from system rules, tool descriptions, message history, and compressed summaries, since the underlying model is sta...

Dev.to - Claude · 2026-05-10

Claude Code Source Analysis Series, Chapter 6: Tools Overview

A source code analysis of Claude Code describes its tool system architecture, in which a `Tool.ts` contract requires each tool to declare parameters, permissions, read-only status, and concurrency before the runtime executes any model-requested action.

Dev.to - Claude · 2026-05-10

Using Claude Code: The unreasonable effectiveness of HTML

Simon Willison published a piece arguing that HTML is a highly effective output format when using Claude Code, Anthropic's CLI coding tool, with accompanying examples hosted on GitHub Pages.

Hacker News - Best · 2026-05-10

Claude auf Colossus: Musk-Deal verdoppelt Code-Limits

Anthropic has rented 300 megawatts of compute capacity from xAI's Colossus 1 datacenter, nearly 70% of its total capacity, as Claude usage grew 80-fold annually in Q1 2026. The deal doubled the 5-hour usage limit in Claude Code and expanded API rate limits, with Tier 1 output tokens rising from 8...

Dev.to - Claude · 2026-05-10

Build a Claude Code plugin for SDD workflows that’s actually different – and it can survive interruptions, like, for real

A developer published CodeSpec, a Claude Code plugin for spec-driven development workflows, on GitHub. The plugin persists state across six stages—spec, clarify, plan, tasks, implement, and review—allowing sessions to resume after interruptions without reloading full context.

Dev.to - Claude · 2026-05-10

🧠 I Tried 100 Claude Skills. These Are The Best.

Anthropic's Claude Code platform added a feature called Agent Skills — modular capability packs defined by a directory containing a SKILL.md file — enabling Claude to load context and scripts on demand rather than at startup. Claude Code runs on terminal, IDE, desktop, web, iOS, and Slack, using ...

Dev.to - Claude · 2026-05-10

OpenAI Codex arrives in the browser with new Chrome extension

OpenAI released a Chrome extension for its Codex product that allows agents to operate within a user's existing browser session, accessing signed-in sites, cookies, and authenticated workflows across multiple tabs. The extension connects Chrome to the Codex desktop app on Windows and macOS, enabl...

The New Stack · 2026-05-09

I Built an Issue-Based Claude Code Plugin "cadenza" for Technical Output Creation

A developer released "cadenza," an open-source Claude Code plugin that structures technical writing into five sequential phases — issue finding, decomposition, storyboarding, verification, and output generation — with gate checks that prevent skipping steps. The plugin outputs a Markdown file and...

Dev.to - Claude · 2026-05-09

“The terminal still matters”: Amp rebuilds its CLI for an agentic future beyond the command line

Amp, an AI coding startup that spun out of Sourcegraph in late 2025, released a rebuilt CLI called Neo, redesigned to support remote control of terminal sessions via a web interface, plugins, and longer-running agent workflows. Developers can start a local CLI session and manage it remotely, incl...

The New Stack · 2026-05-09

Claude Code vs Hiring a Developer in 2026: $20 CLI or $80K Engineer?

A blog post compares Anthropic's Claude Code CLI tool, priced at $20–200/month, against the cost of hiring a software developer at approximately $80,000 per year, concluding the tool functions as a developer aid rather than a full replacement.

Dev.to - Claude · 2026-05-09

I tested the new OpenAI Codex features on a real Python codebase, and it’s the strongest Claude Code rival yet

OpenAI's "Codex for (almost) everything" update added an in-app browser, computer use, PR review, SSH connections, and over 90 plugins to its desktop app, used by more than 3 million developers weekly. In testing against the HTTPie Python codebase, Codex read a GitHub issue, traced a bug to three...

The New Stack · 2026-05-08

Stop hand-syncing Claude Code and Codex configs

A developer released ai-config-sync-manager v0.1.0, a Node.js CLI tool that synchronizes configuration files between Claude Code and OpenAI Codex, translating between their differing formats for agents, permissions, and MCP servers. The tool runs via npx and supports six sync areas with automatic...

Dev.to - Claude · 2026-05-08

OpenClaw and Claude can put your AI-generated podcasts in Spotify

Spotify released a command-line tool called "Save to Spotify" that lets AI agents such as Claude Code and OpenAI Codex upload AI-generated audio summaries and podcasts directly to a user's Spotify podcast feed. The tool is available on GitHub and is triggered by adding "save to Spotify" to an AI ...

The Verge - AI · 2026-05-08

Live blog: Code w/ Claude 2026

Anthropic held a "Code w/ Claude 2026" developer event featuring morning keynote sessions focused on Claude Code, its AI coding tool. Simon Willison live-blogged the event for simonwillison.net.

Simon Willison · 2026-05-07

Build a Custom Claude Code Statusline (with Rate Limits and a Bell on Done)

A developer published a tutorial for building a custom statusline in Claude Code using a shell script and jq, replacing the default display with fields showing context window usage percentage, 5-hour and 7-day rate limit consumption, and a terminal bell notification via a Stop hook in ~/.claude/s...

Dev.to - Claude · 2026-05-07

Claude Code Context Window Rot: Why Sessions Get Dumber (And How to Fix It)

A Chroma 2025 study of 18 frontier AI models, including Claude 4, GPT-4.1, and Gemini 2.5, found all performed worse as input length increased, with some dropping from 95% to 60% accuracy past a context saturation threshold. The effect, called "context rot," is more pronounced in coding agents be...

Dev.to - Claude · 2026-05-06

Which Claude Code Hook Do You Need? A Decision Guide

Claude Code supports four hook handler types — command, prompt, agent, and http — across 21 lifecycle events. Command hooks run in under 5ms and produce deterministic results, while prompt hooks invoke an LLM and take 300–2000ms, and agent hooks spawn full Claude Code sessions with file and tool ...

Dev.to - Claude · 2026-05-06

Stop prompting Codex like ChatGPT

A developer guide argues that OpenAI's Codex, an autonomous coding agent that reads repos and runs commands, performs better when given bounded "atomic" tasks with defined outcomes and verification steps rather than the open-ended conversational prompts suited to ChatGPT.

Dev.to - AI · 2026-05-06

Claude Code 2026 vs. Codeium 2.0: 45% Faster PR Reviews for Monorepo Codebases

A benchmark across 12 production monorepos (4.2M lines of code) found Claude Code 2026 reviewed TypeScript PRs 45% faster than Codeium 2.0 (12.4s vs 22.6s), while Codeium 2.0 was 22% faster for Java/Kotlin repos; Claude Code 2026 costs $149/seat vs $109 for Codeium 2.0.

Dev.to - Claude · 2026-05-06

Running Claude Code and Claude Desktop on Amazon Bedrock

Claude Code CLI and Claude Desktop can be configured to use Amazon Bedrock as the inference backend by setting environment variables in ~/.claude/settings.json and providing AWS IAM credentials, removing the need for a separate Anthropic API subscription.

Dev.to - Claude · 2026-05-04

Claude Code is powerful—but a black box: how much is it spending? looping? how much context is left? I built claudestat: a real-time dashboard with costs, tool calls, loop detection, and reports. npm i -g @deibygs/claudestat full visibility.

A developer released claudestat, an npm package that provides a real-time monitoring dashboard for Claude Code sessions, tracking token costs, tool calls, context usage, and detecting loops. It is installable via `npm i -g @deibygs/claudestat`.

Dev.to - Claude · 2026-05-04

6 Claude Code skills for indie hackers — with real output samples

A developer outlined six Claude Code workflow configurations for indie hackers: a pre-deploy shipping checklist, launch thread writer, support reply drafter, pricing page generator, architecture decision recorder, and competitor analysis tool. Each is designed to read actual codebase files and ou...

Dev.to - Claude · 2026-05-04

Codex for Claude Code Users: What to Know Before You Try It

A developer documented their shift from Claude Code to OpenAI's Codex for personal projects, citing lower cost and improved model quality around GPT-5.4 and later versions. The guide outlines differences in tooling, subscription plans, and CLI vs. desktop app usage for Claude Code users evaluatin...

Dev.to - Claude · 2026-05-04

DeepClaude – Claude Code agent loop with DeepSeek V4 Pro

DeepClaude is an open-source project that combines Anthropic's Claude Code agent loop with DeepSeek V3/V4 Pro, routing reasoning tasks through DeepSeek's model while using Claude for code execution and tool use.

Hacker News - Best · 2026-05-04

Claude Code vs Cursor for solo indie dev: an honest breakdown (I shipped 4 iOS apps to find out)

A developer compared Claude Code and Cursor over a 14-day sprint building 4 iOS apps across 11 repositories, finding Claude Code better suited for multi-repo and long-running tasks, while Cursor was faster to set up and better for React/Next.js frontend work.

Dev.to - Claude · 2026-05-04

I let Claude Code write an entire feature for a week. Here's what actually broke.

A developer used Claude Code to build a complete notification system (schema, API, queue worker, tests) for a Next.js/Postgres project over one week without writing code manually. A missing `await` keyword on an async call in the generated worker code caused random 4-second notification delays in...

Dev.to - Claude · 2026-05-03

Sightings

Simon Willison added a "Sightings" section to his blog that pulls in wildlife photos from his iNaturalist account, back-populating over a decade of observations totaling 208 entries. He built the feature using Claude Code on his phone as an extension of his existing content syndication system.

Simon Willison · 2026-05-03

Claude Code Forgets Everything Between Sessions. MEMORY.md Fixes That

Claude Code loses all session context on restart; a developer workflow using a MEMORY.md file in the project root provides a 200-line persistent index of evolving project state, such as recent migrations and active decisions, which Claude Code reads at session start.

Dev.to - Claude · 2026-05-02

Building a self-hosted deep-research agent with Claude Code

A developer released Scout, an MIT-licensed self-hosted research agent that uses Claude Code to convert GitHub Issues into cited research reports published to GitHub Pages. The system includes a pre-research "sharpening" step that clarifies ambiguous queries before dispatching parallel sub-agents...

Dev.to - Claude · 2026-05-02

I built Governor to reduce Claude Code token and context waste

A developer released Governor, an open-source Claude Code plugin that compresses memory files, filters build/test log output, and adds usage telemetry to reduce token consumption. Small local benchmarks reported approximately 55% reduction in output tokens and 96% of noisy pytest output blocked.

Dev.to - Claude · 2026-05-02

AI Coding Assistants vs. IDE Extensions: Claude Code 3.5 vs. Tabnine 2026 – 35% Faster Workflow

A Dev.to comparison of Claude Code 3.5 and Tabnine 2026 tested 50 developers on three coding tasks, reporting Claude completed work 35% faster on average, while Tabnine had lower completion latency (90ms vs. 120ms) and a 12% lower initial error rate.

Dev.to - Claude · 2026-05-02

Uber torches 2026 AI budget on Claude Code in four months

Uber depleted its entire 2026 AI budget within four months by spending it on Claude Code, Anthropic's AI coding tool. The rapid spending indicates unexpectedly high adoption or usage costs among Uber's engineering teams.

Hacker News - Best · 2026-05-02

Claude Code gave me no mirror. I built one.

Developer Robert Nowell released "skill-tree," a tool that analyzes Claude Code session history against 11 collaboration behaviors from Anthropic's February 2026 AI Fluency Index study, scoring users and assigning one of seven archetype cards. It is available as a Claude Code plugin and as an npm...

Dev.to - Claude · 2026-05-02

Codex CLI 0.128.0 adds /goal

OpenAI released Codex CLI version 0.128.0, adding a `/goal` command that causes the coding agent to loop repeatedly until it determines a user-set goal has been completed or a configured token budget is exhausted. The feature is implemented via two prompt templates injected automatically at the e...

Simon Willison · 2026-05-01

Anthropic’s Claude Security emerges from closed preview to scan your codebases for vulnerabilities

Anthropic released Claude Security, a codebase vulnerability scanning tool within Claude Code, from closed preview to public beta for Enterprise customers on Thursday. The tool uses multiple parallel agents to analyze data flows and includes a self-validation pipeline to reduce false positives, w...

The New Stack · 2026-05-01

Benchmark: Claude Code 2.5 vs Codeium 1.8 for Bug Detection Rate in Go 1.24 Unit Tests

A benchmark test on a 10,000-line Go 1.24 codebase with 412 injected bugs found Claude Code 2.5 detected 89.3% of defects versus 76.1% for Codeium 1.8, though Codeium processed files 50% faster and costs $20 less per seat monthly.

Dev.to - Claude · 2026-04-30

Running Claude Code in a Loop: The Script That Turns It Into a Persistent Agent

A developer tutorial describes wrapping Anthropic's Claude Code CLI in a bash loop to create a persistent polling agent, avoiding per-tick cold-start costs from MCP server handshakes and file re-reads that can add several seconds per interval.

Dev.to - Claude · 2026-04-29

AI Coding Tools Comparison 2026: Claude Code vs Cursor vs Gemini CLI vs Codex

A technical comparison of four AI coding tools finds Claude Code and Gemini CLI operate as terminal agents, Cursor integrates with VS Code, and Codex focuses on automated task execution. All four store session history as JSON files with no cross-project search or cross-session memory.

Dev.to - Claude · 2026-04-29

Where Is Claude Code Session History? How to Find Your AI Coding Conversations

Claude Code stores conversation history as JSON files in ~/.claude/projects/ (macOS/Linux) or %USERPROFILE%\.claude\projects\ (Windows), with filenames based on hash strings that carry no semantic information. The tool lacks a native interface for browsing or searching past sessions across projects.

Dev.to - Claude · 2026-04-29

Postmortem: Claude Code 3.5 Hallucination Caused $50k in Erroneous AWS Spend

A hallucinated Terraform configuration generated by Claude Code 3.5 caused a SaaS startup to incur $51,237 in AWS charges over 72 hours after the AI incorrectly specified 120 m5.24xlarge instances instead of 2 m5.large instances for an EKS node group.

Dev.to - Claude · 2026-04-28

How to Build a Self-Verification Loop in Claude Code (3 Layers, 20 Minutes)

A tutorial describes using Claude Code's Stop hook and PostToolUse hook to build a three-layer verification loop (syntax, intent, regression) that prevents the agent from completing until checks pass. The approach references a 13.7-point benchmark gain from LangChain's similar PreCompletionCheckl...

Dev.to - Claude · 2026-04-28

Claude Code is Gem

A software engineer with a background in trading systems at Bloomberg described shifting from skepticism about LLMs to regular use of Claude Code after finding that structuring interactions with proper context improved results more than prompt wording alone.

Dev.to - Claude · 2026-04-28

Claude Code + SonarQube Static Analysis: The AI Quality Loop is Finally Closed

SonarQube's static analysis tools can be integrated into Claude Code via a three-layer stack comprising sonarqube-agent-plugins, sonarqube-cli, and a containerized sonarqube-mcp-server. The integration requires SonarQube Server 10.x or later, as the MCP server calls the /api/v2/ endpoints not ava...

Dev.to - Claude · 2026-04-27

Claude Code Token Usage Hides in History and Tools

In Claude Code, token consumption is dominated by system prompts, conversation history, tool definitions, and CLAUDE.md project files rather than the user's typed input. Anthropic recommends keeping CLAUDE.md under 200 lines and notes that HTML block comments in that file are stripped before cont...

Dev.to - Claude · 2026-04-26

Hijacking OpenClaw with Claude

A developer described a method to connect OpenClaw, an open-source AI agent framework, to Claude by using the authentication built into the Claude Code CLI binary, bypassing the need for a separate API key or web account.

Dev.to - Claude · 2026-04-26

Your AI agent already writes every session to disk. Why isn't it reading its own archive?

A developer built `claude-recall`, a tool that indexes Claude Code's JSONL session archives into SQLite with FTS5 full-text search and injects relevant prior sessions into new prompts via a `UserPromptSubmit` hook. The tool optionally uses a local Ollama embedding model for semantic reranking, wi...

Dev.to - Claude · 2026-04-25

Stop Generating AI Slop: The Ultimate Workflow for Coding with Claude Code

A developer published a three-stage workflow for using Anthropic's Claude Code that requires AI to first produce written research and implementation plans in Markdown files before generating any code. The approach separates analysis, planning, and execution to reduce unreviewed code output.

Dev.to - Claude · 2026-04-25

An update on recent Claude Code quality reports

Anthropic confirmed that user complaints about degraded Claude Code performance over the past two months were caused by three separate bugs in the Claude Code harness, not the underlying models. One bug, introduced March 26, caused session memory to be cleared every turn after an idle period rath...

Simon Willison · 2026-04-24

How to get started with Codex

OpenAI published a guide for getting started with Codex, its AI-based coding agent, covering project setup, thread creation, and task completion. The guide is aimed at new users beginning to work with the tool.

OpenAI Blog · 2026-04-24

Beyond Drag-and-Drop: Automating n8n Workflows with Claude Code

A developer tutorial describes using Anthropic's Claude Code CLI to generate n8n workflow JSON files from natural language prompts, bypassing manual node configuration in n8n's visual editor. The approach involves prompting Claude to produce importable JSON that n8n can execute across its 400+ in...

Dev.to - Claude · 2026-04-24

Codex settings

OpenAI published documentation for configuring Codex settings, covering options for personalization, detail level, and permissions to customize how the AI coding agent runs tasks.

OpenAI Blog · 2026-04-24

Automations

OpenAI added an Automations feature to Codex that allows users to schedule and trigger automated tasks, including report generation, summaries, and recurring workflows without manual intervention.

OpenAI Blog · 2026-04-24

Anthropic published a postmortem on Claude Code. Here's what it means for developers building on Claude.

Anthropic published an engineering postmortem on April 23rd acknowledging quality regressions in Claude Code, its agentic coding product. The degradation occurred in the product's orchestration and prompting layer, not in the underlying Claude API, which remained unchanged.

Dev.to - Claude · 2026-04-24

What is Codex?

OpenAI offers Codex, an agent-based product designed to automate tasks, integrate with external tools, and generate outputs such as documents and dashboards, extending beyond standard chat interactions.

OpenAI Blog · 2026-04-24

Plugins and skills

OpenAI's Codex supports plugins and skills that allow users to connect external tools, access data sources, and define repeatable workflows to automate tasks.

OpenAI Blog · 2026-04-24

Claude Code for Team Workflows: How I Built a 90-Person AI Organization Without Hiring Anyone

A developer described building a system of 90 Claude Code agents organized across 7 departments with hierarchical routing, using CLAUDE.md context files to give each agent a defined role, data access, and escalation path. The setup replaces generic AI prompting with specialized agents invoked by ...

Dev.to - AI · 2026-04-22

I burned $800 in Claude tokens so you don't have to. Here's what I'm going to share.

A developer who spent $800 on Claude API tokens over six months building with AI coding agents launched a visual management tool for Claude Code called MC-MONKEYS and plans to publish guides on AI agent workflows and token cost optimization.

Dev.to - Claude · 2026-04-22

What Building a Geopolitical Simulation Taught Me About Claude Code

A developer built GeoSim, a geopolitical simulation engine using Next.js 14, Supabase, and the Anthropic API, in which six AI agents representing world powers simultaneously plan moves across branching timelines. The project used Claude Code's hook system to auto-run tests on file save and a 216-...

Dev.to - Claude · 2026-04-22

Playing DOOM in Claude Code's Statusline (and Fighting Its Renderer to Keep It There)

A developer implemented the 1993 DOOM engine running inside Claude Code's terminal statusline, using the doomgeneric C library to render frames as 24-bit ANSI and exposing game controls via a UserPromptSubmit hook and MCP server. The project uses four of Claude Code's existing extension points an...

Dev.to - Claude · 2026-04-22

I replaced my entire backend team with Claude Code for 30 days day 15 was a disaster

A solo developer ran a 30-day experiment requiring all backend code for a client's Node.js/PostgreSQL scheduling API to be drafted by Claude Code first. The experiment produced fast results on routine backend tasks but encountered a significant failure on day 15 involving a database error.

Dev.to - Claude · 2026-04-21

How to Build Persistent Memory Into Claude Code Agents (Cross-Session Identity That Actually Works)

A developer published a method for adding persistent memory to Claude AI coding agents using a file-based system with a lightweight index loaded each session and on-demand retrieval of structured markdown files covering user profiles, project state, corrections, and external references. The appro...

Dev.to - Claude · 2026-04-20

Rally(class project)

Two students built Rally, a location-based social platform for posting and joining local activities, as a class project over two 2-week sprints using Anthropic's Claude Code for planning, coding, testing, and deployment. Their workflow included a CLAUDE.md configuration file, automated lint and t...

Dev.to - Claude · 2026-04-20

We're Launching on Product Hunt Tomorrow — Here's What We Built

Whoff Agents, a set of Claude Code skill packs and tools, launched on Product Hunt on April 21st, offering products priced from $29 to $99, including a TDD/debugging skill pack, a Next.js SaaS boilerplate, and an MCP security scanner covering 22 attack vectors.

Dev.to - Claude · 2026-04-20

5 Claude Code Instances in Parallel with git worktree — Eliminating stash Conflicts

Using `git worktree`, developers can assign each of multiple parallel Claude Code instances its own isolated working directory and branch, preventing `git stash` operations in one instance from overwriting uncommitted changes in another. The approach replaces stash with WIP commits before rebasing.

Dev.to - Claude · 2026-04-19

opencode vs Claude Code — six weeks in, here's where I actually land

A developer compared Claude Code and opencode over six weeks, finding Claude Code faster on a refactoring task (9 vs. 16 minutes) with more mature multi-step workflow tooling, while opencode supports 75+ AI providers, is free as a standalone tool, and allows local model execution.

Dev.to - Claude · 2026-04-19

You've Been Using Claude Wrong. Here's Agent Mode

A tutorial contrasts using Claude as a chat tool versus agent mode, where Claude Code and Model Context Protocol (MCP) allow the model to read codebases, edit files, run tests, and interact with external tools like GitHub and Slack autonomously. A cited survey found 55% of engineers regularly use...

Dev.to - Claude · 2026-04-19

Automating Solo SaaS Customer Support with Claude Code Schedule — FAQ, Bug Fix, Escalation

A developer published a method using Claude Code CLI's Schedule feature to automate SaaS customer support, running hourly checks that classify tickets into three categories: FAQ auto-replies (similarity score above 0.7), automated bug fixes, or human escalation for billing and complex issues.

Dev.to - Claude · 2026-04-19

CLAUDE.md vs System Prompt: What Actually Controls Claude Behavior

In Claude Code, system prompts are ephemeral API-level instructions that reset each session, while CLAUDE.md is a persistent, project-scoped file stored in the repository that Claude reads automatically at session start. When the two conflict, CLAUDE.md instructions are treated as high-priority p...

Dev.to - Claude · 2026-04-18

Claude Code accounts switcher, Finally!!

A developer released "claud-code-account-switcher," an npm package that allows Claude Code users to switch between multiple accounts while preserving each account's authentication, history, plugins, and MCP server configurations. It is available via `npm install -g claud-code-account-switcher`.

Dev.to - Claude · 2026-04-18

Adding a new content type to my blog-to-newsletter tool

Simon Willison updated his blog-to-newsletter tool to include a new content type called "beats" — posts capturing external activity like open source releases and museum visits — by prompting Claude Code to clone a reference GitHub repo and modify the relevant HTML file in a single session.

Simon Willison · 2026-04-18

Codex for (almost) everything

OpenAI released a major update to Codex, used by over 3 million developers weekly, adding background computer use, an in-app browser, image generation via gpt-image-1.5, more than 90 new plugins, GitHub PR review support, SSH connectivity, scheduled task automations, and a memory feature for reta...

OpenAI Blog · 2026-04-17

OpenAI takes aim at Anthropic with beefed-up Codex that gives it more power over your desktop

OpenAI updated its Codex agentic coding tool with expanded desktop control capabilities, positioning it as a competitor to Anthropic's Claude Code. The update gives Codex broader ability to interact with a user's desktop environment.

TechCrunch - AI · 2026-04-17

OpenAI’s big Codex update is a direct shot at Claude Code

OpenAI updated its Codex desktop coding tool with the ability to operate desktop apps on macOS, generate images via gpt-image-1.5, browse the web natively, schedule tasks, and retain memory from past sessions. The update also adds plugins for GitLab, Atlassian Rovo, and Microsoft Suite, with EU a...

The Verge - AI · 2026-04-17

claude-studio: A Visual Orchestration Platform for Claude Code Multi-Agent Workflows

A developer released claude-studio, an open-source visual orchestration platform for managing multi-agent workflows using Anthropic's Claude Code. The tool provides a graphical interface for coordinating multiple Claude AI agents working in parallel.

Dev.to - Claude · 2026-04-17

Have you seen a new sidebar from Claude Code? It looks great, but...

Claude Code, Anthropic's command-line coding tool, received a new sidebar interface. A developer noted the visual update favorably but indicated concerns or caveats about it in a post on Dev.to.

Dev.to - Claude · 2026-04-17

Claude Code Just Got a Desktop Redesign — Here's What Changed!

Anthropic released a redesign of its Claude Code desktop app featuring a sidebar for multi-project session management, an integrated terminal pane, a side chat function (Ctrl+;) for context-aware queries, and consolidated model and effort controls.

Dev.to - Claude · 2026-04-15

Claude Code can now do your job overnight

Anthropic launched "routines" for Claude Code, allowing automated tasks to run on schedules, via API calls, or GitHub webhooks on Anthropic's cloud infrastructure, replacing manual GitHub Actions setups for tasks like issue triage and smoke testing.

The New Stack · 2026-04-15

Anthropic’s redesigned Claude Code desktop app lets you burn through tokens even faster

Anthropic released a redesigned Claude Code desktop app with an integrated terminal, improved diff viewer, side chat functionality, and rearrangeable interface panes for managing multiple coding sessions simultaneously.

The New Stack · 2026-04-15

Claude Code Routines

Anthropic released documentation for Claude Code Routines, a feature within its Claude coding platform available at code.claude.com.

Hacker News - Best · 2026-04-15

10 GitHub Repos That Turn Claude Code Into a Productivity Machine

Ten open-source GitHub repositories provide extensions and integrations for Claude Code, including Repomix for codebase context, Dify and Flowise for visual workflow builders, and Onyx for self-hosted AI alternatives. Installation is available via terminal commands or plugin marketplace.

Dev.to - Claude · 2026-04-12

Use and manage Vercel Sandbox directly from the Vercel CLI

Vercel added Sandbox management to its CLI tool through a new `vercel sandbox` subcommand, eliminating the need for a separate command-line tool. The feature is available in Vercel CLI version 50.42.0 and later.

Vercel Blog · 2026-04-09

Claude Code's Feb–Mar 2026 Updates Quietly Broke Complex Engineering — Here's the Technical Deep-Dive

Anthropic's February-March 2026 updates to Claude Code—including adaptive thinking, lowered default effort settings, and hidden reasoning display—contributed to degraded performance on complex engineering tasks, with the community identifying under-allocated reasoning budgets and system prompt bi...

Dev.to - Claude · 2026-04-09

Claude Code: Self host model configuration

Claude Code can be configured to use self-hosted models by setting ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN environment variables, then running claude with the --model flag to specify a local model like qwen3-coder-next.

Dev.to - Claude · 2026-04-09

MemCTX – Autonomous session memory for Claude Code (open source, MIT)

MemCTX is an open-source MIT-licensed tool that maintains session memory for Claude Code by storing sessions in SQLite, auto-generating summaries via Claude API, and injecting relevant history into new sessions through a dashboard interface.

Dev.to - AI · 2026-04-08

GitHub Copilot CLI combines model families for a second opinion

GitHub introduced Rubber Duck, an experimental feature in Copilot CLI that uses a second AI model to review coding agent plans before execution. Testing showed Claude Sonnet paired with GPT-5.4 as Rubber Duck achieved 74.7% of the performance gap between Sonnet and Opus, with larger gains on comp...

GitHub Blog · 2026-04-07