26 stories on AI dev tools, agents, and the coding stack — curated from the day's RSS haul by Agentic Dev's pipeline.
Top Signal · CLI Agents
Claude Code skills are markdown files stored in a folder at `~/.claude/skills/<name>/SKILL.md`, requiring only a name and description in YAML frontmatter to function. The description field acts as the trigger Claude matches against user requests to determine which skill to activate.
Dev.to - Claude
A developer built a GitHub Actions workflow using Anthropic's Claude Code SDK to automatically review pull requests, feeding only the git diff to the model rather than full files. Using claude-haiku-4-5, the system averages $0.028 per review across 60 PRs, with roughly 4,100 input and 900 output ...
Agent Engineering
Dev.to - Claude
Harness is a Claude Code plugin that generates multi-agent team scaffolding — including agent definitions, skill files, and orchestration logic — from a single plain-English prompt. It selects from six architecture patterns and is available via the Claude Code plugin marketplace or GitHub at revf...
CLI Agents
Dev.to - Claude
A developer described rebuilding a Claude API-powered summarization feature after it reached 45-second p99 latency and dropped 12% of requests in production. The article outlines fixes including retry logic via the tenacity library, concurrency controls, and persistent HTTP connections.
Agent Engineering
Dev.to - Claude
GitHub Copilot switched from a flat subscription model to usage-based billing in June 2026. The change makes per-request cost a factor in how engineers configure model selection, prompt scope, and project-level rules.
Pricing & Plans
Dev.to - AI
A developer published a guide on building a self-correcting AI pipeline using Anthropic's Claude API, in which the system detects and fixes its own errors during processing.
Agent Engineering
Dev.to - Claude
Chinese AI models including DeepSeek V4 Flash, Qwen3-32B, and Kimi K2.5 offer API output pricing between $0.25 and $3.00 per million tokens, compared to $10.00–$15.00 for GPT-4o and Claude 3.5 Sonnet, while posting MMLU benchmark scores within 3–4 points of their US counterparts.
Pricing & Plans
Dev.to - AI
A developer built an AI-assisted customer support workflow using OpenAI and the n8n automation platform over four days. The system handles three workflow types — pricing inquiries, technical issues, and refund requests — using OpenAI for intent detection and n8n for routing, CRM updates, and tick...
Workflows & Tips
Dev.to - AI
A developer published the "FORGE Method," a five-rule framework for writing structured task instructions for AI coding agents, where each letter (Focused, Output defined, Requirements first, and two more) addresses common failure modes from imprecise prompting.
Workflows & Tips
Dev.to - Claude
SkipLabs, founded by Hack language creator Julien Verlaguet, launched Skipper, a closed-loop coding agent that generates complete backend services — including routes, validators, TypeScript types, and unit tests — from plain-language prompts or OpenAPI specs, running validation internally in Dock...
Agent Engineering
The New Stack
Simon Willison built a browser-based "Pasted File Editor" tool using Codex desktop, modeled after Claude.ai's feature that converts large text pastes into file attachments. The tool supports direct file opening, image thumbnails, and drag-and-drop onto a textarea.
CLI Agents
Simon Willison
Hackers exploited Meta's AI support chatbot to take over Instagram accounts by simply requesting that the bot link target accounts to attacker-controlled email addresses. Meta had connected its AI support system with the ability to execute account recovery steps without adequate verification.
Agent Engineering
Simon Willison
A freelancer compared 10 AI coding models across five tasks over two weeks, spending roughly $500 in API credits. Models tested ranged from $0.20 to $3.00 per million output tokens, with DeepSeek V4 Flash ($0.25) rated the author's preferred daily driver.
Model Releases
Dev.to - AI
A benchmark comparing Claude Opus 4.8 and Opus 4.7 found that Opus 4.7 produced valid JSON in tool-use and multilingual structured output tests, while Opus 4.8 returned invalid JSON or extra text in those same tasks, despite both models performing equally on basic JSON extraction.
Model Releases
Dev.to - Claude
JetBrains open-sourced Mellum2, a 12B-parameter Mixture-of-Experts coding model with 2.5B active parameters per token, designed for agentic AI infrastructure tasks and on-premises deployment. The model ships in three variants — base, instruct, and thinking — and succeeds the 4B-parameter Mellum r...
Open Source Tools
The New Stack
A six-month comparison of four agentic coding tools found Codex leading in user scale at over 4 million weekly developers by May 2026, aided by bundling with ChatGPT plans, while Claude Code, Cursor, and Google Antigravity each pursued distinct approaches around workflow integration, model flexib...
Opinion & Analysis
The New Stack
OpenAI made its frontier models and Codex coding tool generally available on Amazon Web Services, allowing enterprise customers to access them through AWS procurement and infrastructure workflows.
Industry & Funding
OpenAI Blog
A developer published an MCP-based marketplace for AI agents at agent-exchange.rileycraig14.workers.dev, allowing creators to register bots with per-call pricing and earn 85% per transaction, with a 5% referral commission. The platform claims to host over 1,000 bots searchable by capability.
MCP & Integrations
Dev.to - Claude
A Dev.to author compared "Claude Opus 4.8" to older Claude versions, claiming the newer model hallucinates less, makes coding errors four times less frequently, and offers a fast mode priced at one-third the cost of the prior standard mode.
Model Releases
Dev.to - Claude
Alibaba's Qwen 3.7 Plus model is now available on Vercel AI Gateway, accessible via the AI SDK using the identifier `alibaba/qwen-3.7-plus`. Both Qwen 3.7 Plus and 3.7 Max are free for paid AI Gateway users until June 4, 2026.
Model Releases
Vercel Blog
OpenAI published a report titled "The Next Era of Knowledge Work" outlining plans to expand Codex beyond software development into general productivity tasks including research, data analysis, workflow automation, and content creation.
Industry & Funding
OpenAI Blog
Microsoft plans to announce new AI models for Windows, a new reasoning model, and a Copilot "super app" at its Build developer conference in San Francisco this week. The company moved the event to a smaller venue as it focuses on rebuilding developer trust amid its broader AI-centered business re...
Industry & Funding
The Verge - AI
Cisco researchers tested 15 AI models from OpenAI, Anthropic, Google, Amazon, and xAI and found all failed multi-turn adversarial attacks at rates ranging from 7.89% to 88.30%, compared to 2.19%–64.91% for single-turn attacks. Anthropic's Claude models performed best in multi-turn conditions whil...
Opinion & Analysis
The New Stack
Stanford's CS336 "Language Modeling from Scratch" course published a document called CLAUDE.md outlining rules for AI coding agents, prohibiting them from writing code, completing assignments, or providing direct solutions, while allowing explanation and guided feedback. The guidelines position A...
Opinion & Analysis
Dev.to - AI
Google launched Gemini Spark, an AI agent that performs multi-step tasks in the background on mobile and desktop devices. A hands-on review found the agent capable but noted concerns over subscription cost and privacy tradeoffs from its continuous operation.
Opinion & Analysis
The Verge - AI
Anthropic has filed to go public, according to a TechCrunch report. The AI company, known for its Claude large language models, has attracted enterprise customers since its founding.
Industry & Funding
TechCrunch - AI