This Week in Learning - Agents Need Management Too

The most important skill in AI-assisted development isn't prompting. It's management.

I spent this week building agents, shipping code with agents, and cleaning up after agents. And the pattern that kept showing up had nothing to do with the technology. It was the same pattern I've been teaching in leadership workshops for years: the work succeeds when it's properly managed.

Here's what I learned this week:

  • Agents break when you give them deterministic work. The fix looks a lot like good delegation.
  • Agentic workflows aren't a new paradigm. They're the same good practices, compressed.
  • Letting agents run for hours produces impressive output and expensive cleanup. The balance isn't obvious yet.

Put the Deterministic Stuff in the Box

At my day job, I regularly review PRs with a specific type of change. The review is repetitive enough that I thought it should be an agent. So I built one.

The review is straightforward: pull down the changed files from a PR, pull the same folder from the target branch, run a quality rubric against the new files, check for clashes with existing files, and generate a report in markdown with any errors or recommendations.

Version 1 gave my new agent too much to do. The developer agent wrote a few PowerShell scripts for the git operations and rubric evaluation, then built commands for my new agent to run them.

It didn't work well. The new agent kept going rogue. It tried to debug the scripts instead of just running them, ignored clear instructions for how to execute things, and improvised when it should have been following a recipe. This is what current-day agents do when you give them deterministic work. They treat it like a problem to solve instead of a process to execute.

Version 2 moved all of that into MCP tools. MCP (Model Context Protocol) is basically a way to give an agent a toolbox of reliable, pre-built operations. Think of it as the difference between handing someone written instructions and hoping they follow them, versus giving them a machine that does the thing when they press the button.

So now the git work, the rubric evaluation, and the report generation all live inside MCP tools. The agent's job is just the stuff agents are actually good at: ad-hoc interaction with me, judgment calls on the artifacts, and a natural-language interface I can use to update the rubric as we discover new failure modes.

Night and day difference. The agent stopped fighting the process and started doing what it's good at.

I used BMad Quick Flow for the spec and planning, and the BMad Agent Builder to construct both versions. The tools made it easy to iterate, which mattered because the first version taught me what the second version needed to be.

The takeaway: agents are good at judgment, conversation, and handling ambiguity. They are bad at following precise scripts reliably. Design and manage for that. Put the deterministic stuff in a box the agent can call, and let the agent focus on the work that actually needs a brain.
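To make the "box" concrete, here's a minimal sketch of the idea in Python. The check names and functions are hypothetical stand-ins, not my actual rubric or MCP tooling: the point is that the deterministic evaluation and report generation live in plain code, and the agent's only move is to invoke them.

```python
# Hypothetical rubric: each named check is a pure function over file text.
# The agent never sees or edits these; it can only call run_rubric/to_markdown.
RUBRIC = {
    "has_docstring": lambda text: '"""' in text,
    "no_todo_markers": lambda text: "TODO" not in text,
}

def run_rubric(file_text: str) -> dict[str, bool]:
    """Deterministic tool: apply every rubric check, return pass/fail by name."""
    return {name: check(file_text) for name, check in RUBRIC.items()}

def to_markdown(results_by_file: dict[str, dict[str, bool]]) -> str:
    """Deterministic tool: render per-file results as a markdown report section."""
    lines = ["## Rubric results", ""]
    for path, checks in results_by_file.items():
        for name, passed in checks.items():
            lines.append(f"- {'PASS' if passed else 'FAIL'} `{path}`: {name}")
    return "\n".join(lines)
```

Exposed through an MCP server, these become buttons the agent presses rather than scripts it's tempted to debug. Adding a new failure mode is just adding an entry to the rubric, which is exactly the kind of update the conversational interface is good for.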

Agents Don't Change the Game, They Speed It Up

I keep hearing people talk about AI changing everything about software development. From where I'm sitting, it's not changing the approach — it's streamlining it.

This week I moved a project from proof-of-concept into production development. This was a complicated problem. Complicated enough that it needed a POC to prove it could work at a certain level of quality before I committed to a full build.

I didn't skip that step because I had agents. I built a specialized agent for the POC work, one with deep knowledge of the problem space that could help me explore and validate the approach. It's the same move I'd make in the real world: find a subject matter expert to help prove the concept before handing it to a delivery team.

When the POC proved out, I packaged it up and handed it off to the standard BMad development workflow — the same way I would have packaged it for a human sprint team. The architect agent evaluated the POC and existing framework. The scrum master built detailed stories with full context. The dev agents executed. My job was the same as any project lead: making sure each handoff carried enough context for the next person to succeed.

Same discipline. Same rigor. Same management. Just faster.

If you're hoping agents will let you skip the hard thinking, the scoping, the validation, the careful handoffs, they won't. The fundamentals still apply. Agents just compress the timeline.

Can Someone Show Me the 5-Hour Agent Run?

I keep hearing people talk about letting software development agents run autonomously for 5+ hours. I genuinely want to know how.

Here's what my week looked like: I went through the full BMad planning path. The result was around 24 detailed stories across the first two epics. I provided the POC code and an existing framework, both of which were evaluated and documented by the architect agent. I worked closely with the scrum master agent, giving it extra context and making sure it was referencing the POC as it built stories.

Then I let the dev agent and an adversarial code review agent run for a couple of hours.

The result was impressive. Maybe 90% of what I needed. But the 10% it got wrong was expensive. The POC had 18 well-tuned prompts that were the product of careful iteration. Instead of copying them into the project code, the developer agent stripped out most of the tuning. It kept the general shape but lost the precision that made them work. I had to have my custom agent that built the POC move things over properly.

Honest math: the agents produced what would have normally taken me a couple of weeks. Then I spent about half a day cleaning up what they got wrong.

I'm not complaining. That's a massive net win. But I'm questioning the narrative that you can just let agents run for hours and walk away. My experience says you still need to work in layers — plan, execute a bounded chunk, review, adjust, repeat. The same way you'd work yourself or manage a human team.

If you've figured out how to run development agents for 5+ hours without generating hours of cleanup, I would genuinely love to hear about it. Different tools? Different prompting techniques? A different approach to context and handoffs? Please share. I'm learning too.

What I'm Taking Into Next Week

The thread connecting all three of these is the same thread I've been pulling on for years, just from a different angle.

Managing agents well requires the same skills as managing people well. Scoping work clearly. Knowing when to check in versus when to let someone run. Reviewing output with a critical eye. Providing enough context without micromanaging. Knowing what to delegate and what to keep close.

The people who will get the most from agentic workflows aren't necessarily the best programmers; they're the ones who embrace being managers.

The question I'm sitting with heading into next week: what happens when the industry realizes the bottleneck isn't technologists — it's managers?

Enjoy the journey.