#ai-agents
22 articles
-
The Harness Is Everything
What Cursor, Claude Code, and Perplexity actually built -- and why the model was never the real product.
-
The Last Generation of Data Engineers?
How agentic platforms are quietly making the pipeline-builder's job description obsolete -- and what survives the transition.
-
Claude Code's /ultraplan Is the Feature That Was Hiding in 512,000 Lines of Leaked Code
Anthropic just shipped the first major feature that security researchers spotted weeks ago in its accidentally exposed source code. And it turns out the leaked description wasn't hype.
-
Did Claude Code Opus 4.6 Get Nerfed?
A senior AMD AI director's logs point to a sharp regression in Claude's coding performance -- Anthropic says it's a product change, not a dumber model.
-
AI Engineer vs Data Engineer vs MLE: Who Actually Ships Agentic Systems?
Org charts are lying to you. The job title says 'AI Engineer' but the work varies wildly. Here's the honest picture of who actually builds and ships agentic systems in 2026.
-
The Agent Harness Is the Real Product, Not the Model
A packaging error exposed 512,000 lines of Anthropic's source code. What it revealed wasn't the model -- it was the orchestration layer that changed how the industry thinks about where AI value lives.
-
Anthropic Just Made Building Production AI Agents Dramatically Easier
Claude Managed Agents enters public beta — a fully managed infrastructure layer that handles the months of scaffolding work that used to block every production agent deployment.
-
He Stopped Applying to Jobs and Built a System That Did It For Him
Career-Ops: an open-source multi-agent system built on Claude Code that evaluated 740+ job offers, generated 354 tailored CVs, and landed its creator a Head of Applied AI role. Now on GitHub.
-
AutoAgent: The AI That Engineers Its Own Harness and Tops Benchmarks
AutoAgent autonomously builds and optimizes agent harnesses without human engineering, achieving #1 on SpreadsheetBench (96.5%) and top GPT-5 score on TerminalBench (55.1%) in 24-hour runs.
-
The Engineer Who Made Claude Build a DAW in 4 Hours — And What He Learned About Harness Design
A solo agent costs $9 and ships broken software. A three-agent harness costs $200 and builds a retro game maker from one sentence. Here's how the engineering actually works.
-
Your Data Stack Wasn't Built for This: Architecting for AI Agents
DuckDB has an MCP server. Databricks bakes LLM functions into ETL. AI agents are becoming the most demanding query clients your infrastructure has ever seen.
-
What If You Could Run the Future Before It Happens? Meet MiroFish.
MiroFish spawns thousands of AI agents with memories, personalities, and opinions, then watches what breaks loose — 45k GitHub stars and counting.
-
Claude Subconscious Gives Claude Code a Persistent Memory That Actually Works
A background agent that runs silently after every Claude Code response, building and surfacing context across sessions without adding latency to your workflow.
-
Build a RAG System Without Embeddings or Vector Databases
PageIndex turns documents into navigable trees. An LLM reasons through the hierarchy to find answers — no embeddings, no similarity search, just structured retrieval.
-
An AI Agent Made $19,915 in 8 Hours. The Benchmark That Proved It Is Open Source.
ClawWork dropped 220 professional tasks across 44 job categories, gave AI agents $10 each, and told them to survive. One agent turned that into nearly twenty grand.
-
The AI Doesn't Need to Read Your Codebase. It Needs a Map.
Context Hub, Code Review Graph, and the emerging discipline of giving AI agents less to make them smarter.
-
Databricks Agent Bricks Is Quietly Changing How Data Engineers Work
Describe the task. Connect your data. Let the platform handle the rest. That is the promise of Agent Bricks — and for a specific, important set of data engineering problems, it is actually delivering on it.
-
GitNexus Gives AI Agents a Nervous System for Code
AI coding agents are blind — they read files but don't see structure. A 16K-star open-source project is changing that by building knowledge graphs that make agents actually understand codebases.
-
CLI-Anything Turns Photoshop Into a Terminal Command
One plugin scans a desktop app's source code and generates a complete command-line interface for it, so AI agents can use software that was built for humans.
-
Karpathy Let an AI Agent Do ML Research While He Slept — It Ran 100 Experiments by Morning
AutoResearch gives an AI agent one file, one GPU, and one metric. The agent modifies the code, trains for 5 minutes, checks if it improved, and repeats all night long. The results are surprisingly good.
-
Claude Code Puts an AI Agent in Your Terminal — And It Actually Works
Anthropic's agentic CLI reads your codebase, edits files, runs commands, and commits changes. No IDE plugin. No web UI. Just a terminal that understands what you're building.
-
The Claude Certified Architect Is Here -- And It's Unlike Any AI Certification Before It
Anthropic launched its first official technical credential -- a proctored, architecture-level exam that separates people who use AI from people who build production systems with it.