I wrapped up two big initiatives this week—moving a set of services from cloud platforms to Linux VPSs, and finally launching this blog. That freed up some mental space to start experimenting with agentic, command-line coding tools.
I spent time with two of them: one from Anthropic and one from OpenAI.
These tools live in a very different place than ChatGPT in a browser or Copilot-style features embedded directly into an IDE (the programmer’s equivalent of Word, where we write and organize code). I’ve generally avoided IDE-embedded tools because the workflow never quite fit me.
CLI tools are used through a terminal window on your computer, usually from inside the project’s folder. They can inspect the entire project and perform most of the same actions you can from the command line—reading files, running builds, and modifying code.
This post is a snapshot of what I’ve seen so far: what impressed me, where friction showed up, and the questions I’m now carrying forward.
(Note: This post was generated by AI based on my notes, then edited by me for accuracy and clarity.)
Most people have seen AI through a web interface: you type a prompt, you get a response. That works well for explanations, brainstorming, and small chunks of code.
Agentic CLI coding tools go further.
They can read and navigate an entire project, run builds and other commands, react to the output, and modify files directly, often across many files at once.
Instead of asking for a small chunk of code, you can ask for a piece of a system.
For non-technical readers, think of this like writing a book: a browser chat is like asking a friend to polish a paragraph you paste in, while an agentic tool is like an editor with the whole manuscript on their desk, able to revise chapters directly and keep them consistent.
I’m testing these tools while building a small-to-mid-sized AI-assisted editorial pipeline in .NET with a Razor Pages front end. Without AI help, this would be a 6–8 week project. The goal is to reuse parts from another system while building new workflows on top.
That makes it a good test case: it’s large enough to expose limits, it reuses existing patterns, and it runs against a real database.
A few things genuinely stood out:
Direct database awareness
The tools could query my curriculum database directly. I didn’t have to paste examples or explain table relationships.
Entity Framework and migrations
They set up EF, EF Migrations, and even seeded data through migrations. That work is usually tedious and easy to get wrong.
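For readers who haven’t touched this part of EF Core, seeding through migrations usually looks something like the sketch below. This is a minimal, hypothetical version; the entity and context names are invented, not my actual schema.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity; my real schema is more involved.
public class Course
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}

public class EditorialContext : DbContext
{
    public EditorialContext(DbContextOptions<EditorialContext> options)
        : base(options) { }

    public DbSet<Course> Courses => Set<Course>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // HasData bakes seed rows into the model, so the next
        // `dotnet ef migrations add ...` emits InsertData calls
        // and `dotnet ef database update` applies them.
        modelBuilder.Entity<Course>().HasData(
            new Course { Id = 1, Title = "Algebra I" },
            new Course { Id = 2, Title = "Geometry" });
    }
}
```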
Pattern matching from existing projects
They copied core infrastructure from a source project (base classes, services, CSS, tag helpers). The tools recognized my patterns and used them in new modules.
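To make “recognized my patterns” concrete, imagine a shared base class from the source project. This is a hypothetical stand-in for the kind of infrastructure the tools copied and then built on.

```csharp
using Microsoft.AspNetCore.Mvc.RazorPages;

// Hypothetical shared base class carried over from the source project.
public abstract class AppPageModel : PageModel
{
    // A convenience every page in the source project used.
    protected void Notify(string message) =>
        TempData["StatusMessage"] = message;
}

// A page in a new module that follows the established pattern
// instead of inheriting PageModel directly.
public class LessonEditModel : AppPageModel
{
    public void OnGet() => Notify("Lesson editor loaded.");
}
```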
Running and fixing builds
When something didn’t compile, the tool ran the build, saw the error, and fixed it.
Larger, coherent chunks of work
After enough context was built up, the tools could generate full modules—multiple tables, data models, input models, Razor pages—in a single pass. I normally rely on custom code generators for that kind of work.
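Here’s a hypothetical slice of what one of those single-pass modules contains: an entity, an input model with validation, and the Razor page handler wiring them together (assuming a DbContext along the lines sketched above). None of these names are from my actual project.

```csharp
using System.ComponentModel.DataAnnotations;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.RazorPages;

// Database entity.
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
}

// Input model: what the form binds to, with validation attributes.
public class ArticleInput
{
    [Required, StringLength(200)]
    public string Title { get; set; } = "";

    [Required]
    public string Body { get; set; } = "";
}

// Razor page handler tying input model to entity to database.
public class ArticleCreateModel : PageModel
{
    private readonly EditorialContext _db;
    public ArticleCreateModel(EditorialContext db) => _db = db;

    [BindProperty]
    public ArticleInput Input { get; set; } = new();

    public async Task<IActionResult> OnPostAsync()
    {
        if (!ModelState.IsValid) return Page();

        _db.Add(new Article { Title = Input.Title, Body = Input.Body });
        await _db.SaveChangesAsync();
        return RedirectToPage("./Index");
    }
}
```

Multiply that by several tables and their list and edit pages, and you have the kind of chunk I used to reach for a custom code generator to produce.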
Codebase reorganization
Moving files, reshaping folders, and updating references worked better than I expected.
Working this way starts to feel less like “writing code” and more like directing construction.
The limitations were just as instructive:
Chunk size still matters
You can work in larger chunks than with browser-based tools, but not unlimited ones. For some of the larger chunks they built, I’m going back and rebuilding the work a smaller piece at a time.
Architecture still has to come from somewhere
The tools don’t invent good structure on their own. You have to provide or reinforce the architectural intent.
Over-scaffolding is easy
When I asked for just the UI scaffold of a core workflow, I got more than I needed. Removing the unnecessary UI pieces was harder than layering things in gradually would have been. That mirrors how most human developers already work: start small, expand intentionally.
Cost and limits are real
I burned through Claude credits quickly and would need the $100/month plan for daily use. OpenAI’s limits were less visible: I worked for five hours one day on the $20/month plan and never hit one.
Even with these tools, some responsibilities don’t move: choosing the architecture, scoping work into right-sized chunks, and validating what comes back.
The tool accelerates execution. Judgment still sits with the human.
This is where things get interesting.
Will architecture matter as much?
Architecture shapes quality, scalability, reuse, and maintainability—mostly for humans. Some of those priorities may become inputs to the AI rather than artifacts we optimize for ourselves.
Will code organization matter or change?
Today, code is organized for human discovery and maintainability. If AI becomes the primary reader and editor, does that constraint loosen or change?
How far does natural language go?
Today, the C# I write isn’t the code that actually runs. It compiles to intermediate language, then to machine code I’ve never bothered to read—and don’t need to. C# exists because it’s a form humans can reason about.
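To make that layering concrete, here’s a trivial method and, in the comments, roughly the IL the C# compiler emits for it (paraphrased, not exact compiler output).

```csharp
public static class MathDemo
{
    public static int Add(int a, int b) => a + b;

    // The compiler lowers Add to IL along these lines:
    //
    //   ldarg.0   // push a onto the evaluation stack
    //   ldarg.1   // push b
    //   add       // pop both, push a + b
    //   ret       // return the top of the stack
    //
    // The JIT then turns that IL into machine code at run time,
    // and almost nobody ever reads either layer.
}
```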
Natural-language coding feels like the next layer in that stack. We may stay aware of the code and its behavior, but we interact at a level optimized for intent rather than syntax. We step in manually only when something subtle, structural, or high-stakes requires it. And like intermediate language, this layer may exist almost entirely outside our attention.
Building professional-grade software still takes weeks or months. That hasn’t changed.
What has changed is who can plausibly do it—and how much of the work shifts from mechanical execution to intent, direction, and validation.
I’ll keep experimenting. Future TWILs will probably carry stronger opinions as patterns stabilize. For now, this feels like the early innings of a tooling shift that will eventually reach far beyond programmers.