This week, I focused on using BMad (the Breakthrough Method of Agile AI-Driven Development) on a real project. BMad is a collection of 12+ agents designed to act as domain experts in software development.
BMad's agents (GitHub | Docs) are just sets of instructions stored in files. These instructions define the behaviors, goals, workflows, quality standards, etc. that the LLM must follow when responding.
The LLM "Agent" then uses these files as context while it breaks requests apart into tasks and then completes those tasks using the underlying model and tools.
These run inside the CLI (Command-Line Interface) versions of tools like Claude Code, Codex, and Copilot.
BMad's primary workflow has you start with an Analyst agent to research, refine and document your project idea. In my case, I'm working on a major feature addition to an existing edtech product. This would normally be about a 12-week project.
The Analyst prompted me to describe the product and the new feature: who the users are, what problem it solves, how it will work, and so on. I was able to give it artifacts from previous work, including existing user personas, to study.
It produced a detailed Product Brief that is used by the other agents.
I then worked with the Tech Writer agent to do a deep study of the existing codebase. It extracted architectural patterns, coding conventions, and stylistic norms, and I was able to point out things that are important but perhaps non-standard. This produced seven documents, including Architecture, Source Tree Analysis, Development Guide, and Contribution Guide. Basically everything a new developer joining the team would need.
The Project Management agent then walked me through several workflows that produced a Product Requirements document. It included 58 functional requirements across 11 categories, such as User Access, Onboarding, Learning, Engagement, Gamification, Billing, and Reporting.
The UX Designer agent then analyzed the existing system and created a Design Specifications document.
The Architecture agent then stepped in and inspected the Product Requirements document and existing architecture to explore if anything would need to change.
The Project Management agent then worked with me again to break everything down into 14 epics with 56 stories. These were stored in an Epics document.
It also walked me through an implementation readiness workflow which produced an Implementation Readiness Report.
The first three phases took around six hours. I have been thinking about this feature for a couple of months, so my ideas were mostly clear. If this had been a fresher idea, the process might have taken longer, but any added time would have been worth it.
I worked with both Claude Code and Codex during these first three phases.
The Scrum Master agent then stepped in to help me start planning each epic and write much more detailed stories. My existing codebase contains a lot of context and established patterns, and the agent did a good job of studying and matching them.
The stories it produces are very extensive and include files, code patterns, code snippets, etc. It's the kind of output I would expect from a senior developer making a detailed plan before starting their work or handing it off to a junior developer.
By default, the Scrum Master agent operates in "YOLO" mode. You give it a story number and it does the research and writes the story with no interaction with you.
That didn't work for me because there are a lot of nuances to the parts of the app it was extending or mimicking. So after it produced a very detailed story, I would have to work with it to refine it.
After a few stories, I used Codex to help me understand the agent's instructions and change them to include a prompt at the beginning. Now, instead of going straight into YOLO mode, it shares the details of the story and asks if I want to provide any context before it starts. This has helped tremendously. The agent and I can now explore possible solutions. Once we agree, it goes into YOLO mode and writes everything out.
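For illustration, the added behavior looks roughly like this. This is a paraphrased sketch of the intent, not BMad's actual agent file format or wording.

```
Before drafting a story:
1. Read the epic and the requested story number, then summarize what you plan to write:
   the goal, the areas of the codebase affected, and any open questions.
2. Ask the user: "Do you want to provide any context before I start?"
3. Only after the user confirms, switch to autonomous (YOLO) mode and write the full story.
```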
I then worked with a Test Engineering Analyst agent who writes end-to-end (E2E) tests that I set up to run in Playwright and NUnit. The agent also reviews and improves the story's requirements for unit tests the developer will write.
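For a sense of what those E2E tests look like, here is a minimal sketch using the Microsoft.Playwright.NUnit package's PageTest base class. The namespace, URL, selectors, and credentials are hypothetical placeholders, not code from my project.

```csharp
using System.Threading.Tasks;
using Microsoft.Playwright.NUnit;
using NUnit.Framework;

namespace MyApp.E2ETests;

// Minimal Playwright + NUnit E2E test sketch.
// PageTest provides a ready-to-use Page instance for each test.
public class SignInFlowTests : PageTest
{
    [Test]
    public async Task User_Can_Sign_In_And_Reach_Dashboard()
    {
        // Navigate to a hypothetical local dev URL.
        await Page.GotoAsync("https://localhost:5001/login");

        // Fill hypothetical form fields and submit.
        await Page.FillAsync("#email", "demo@example.com");
        await Page.FillAsync("#password", "P@ssw0rd!");
        await Page.ClickAsync("button[type=submit]");

        // Assert the dashboard heading appears after sign-in.
        await Expect(Page.Locator("h1")).ToContainTextAsync("Dashboard");
    }
}
```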
The Developer agent then gets the story.
Let's stop and think about everything we've done to this point and the artifacts we've created. We're not just handing off a prompt like "add two-factor authentication to this app."
We're handing off context around the project, architecture, coding standards, very detailed requirements that are more like a code plan, testing requirements, etc.
The Developer agent then goes on to do the work. In most cases it takes 10+ minutes to do everything.
It writes code and tests. It sets up databases and seeds data. Basically anything in the story, it does. It then runs the app and makes sure unit tests pass, etc.
When the Developer is done, it then goes through a code review workflow with a second Developer taking an adversarial posture.
I then step in to review the code and can work with a developer agent if I find any issues. I can run the Playwright and unit tests myself.
I've worked through around 10 stories so far. The first developer gets things right around 95% of the time. The second developer finds a few issues and gets things to 99%. I've rarely had to step in and fix anything.
The code matches my style perfectly. There are a few things I may want to go back and research and change over time, but that's the case anytime I delegate work to another developer.
Once I'm satisfied, I work with the Scrum Master agent to close the story.
Once all stories for an epic are complete, the Scrum Master offers an optional retrospective with different Agent personas. The group looks at the artifacts produced during planning and development of those stories.
The retrospectives have proven to be very useful and help shape future work. One story in Epic 4 had a couple of issues, so I asked the agents to think things through and see if we might need adjustments.
I asked the Scrum Master to summarize that part of the conversation. Here's the summary.
The Complexity Conversation
The pivotal moment came when Scott asked: "Are these stories too large? Are these chunks too large to think through all aspects?"
This sparked a team analysis that revealed story size (task count, lines of code) is less predictive of issues than story complexity. The team collaboratively built a framework:
Complexity = (Number of Things × Interdependencies × Speed of Change) + Novelty
Story 4-2 had the most tasks but also the highest interdependencies (filter coordinating with three separate page models). That's where the security gap slipped through - not because anyone was careless, but because cognitive load exceeded capacity.
From Analysis to Action
Rather than leaving this as an abstract insight, Scott asked the team to look at future stories, and the team immediately applied the framework to Epic 5. Stories 5-2 and 5-5 lit up as high-risk.
The discussion then shifted to: "Where are the natural seams to split these?"
Charlie identified architectural seams (UI vs orchestration) as clean boundaries. Alice noted that the referral feature in 5-5 is almost a mini-product that could evolve independently. The team reached consensus on specific splits before leaving the retrospective.
After the retro, the epic is closed and development on the next one can begin.
Here's what I started doing yesterday.
This project is a .NET 10 app that I have developed in Visual Studio. So I have Visual Studio open to review the code and run it locally.
I open VS Code and point it to the solution folder.
I open Obsidian and create a vault from the solution folder.
In VS Code, I open four terminals and drag them into tabs. I log into Claude Code in each terminal.
I first work in the Scrum Master tab writing and refining stories. The stories are markdown files which I can preview in VS Code, but I prefer to review and adjust them in Obsidian.
When the Scrum Master finishes a story, I switch to the Test tab and work with the Test agent to refine and improve the story. While it is working, I go back to the Scrum Master tab and start working on the next story.
When the Test agent finishes, I switch to the Dev 1 tab and give it the story the Test agent just finished. Meanwhile, I go back to the Scrum Master and Test tabs and keep them working on the next stories.
When the Dev 1 agent finishes, I switch to the Dev 2 tab and give it the story Dev 1 just completed so it can do the code review. I then give Dev 1 the next story that is ready for dev.
It's not uncommon for me to have four agents working at the same time. This worked pretty well, but the cognitive load was high. I also had to subscribe to Claude's highest tier.
I'll keep experimenting with this. I don't know if I'll keep four tabs running all of the time, but I want to experiment and see what my limits are.
(I asked AI to extract these takeaways from my raw notes and from the rest of this post, which I wrote myself)
A few things stand out after working this way for a few sessions.
First, agentic workflows reward clarity more than cleverness. The quality of the output had far less to do with model choice and far more to do with how well the artifacts constrained the work. The moments where things went sideways were almost always traceable to fuzzy requirements, hidden assumptions, or stories that bundled too much complexity together.
Second, this feels closer to delegation than prompting. Once the groundwork is laid, the work shifts from “tell the model what to do” to “review, challenge, and refine what a capable teammate produced.” That changes where my energy goes—from typing to thinking.
Third, throughput increases quickly, but cognitive load becomes the new bottleneck. Running multiple agents in parallel worked, but it required deliberate pacing and strong mental context switching. I’m not convinced more agents always equals better outcomes.
Finally, this workflow makes hidden system dynamics visible. Complexity, interdependencies, seams, and risks show up earlier and more explicitly. That continues to push how I think about story sizing and planning.
I’m still early in this experiment, but the combination of structured planning, explicit artifacts, and agent specialization feels less like automation and more like a new way of working.