Agent Specialists Changed How I Build Software

Jay Long

Software Engineer & Founder

Published March 18, 2026

From the IDE to the Command Line

I had these pre-existing clients where I was building custom SaaS products, and for a while I was just plugging along with Cursor. Me and Cursor's composer, flipping between models based on what made sense. It was comfortable because it was all IDE-based. You had your terminal right there, you could flip back and forth easily, and it funneled you toward reviewing code in a traditional VS Code style. That was really familiar, and honestly, looking back, it was just a comfort thing. It made me feel like I had the ability to review everything in a way I already understood.

I remember ThePrimeagen talking about forcing himself to use Neovim for two weeks and then never going back. I was totally going to do that, but I never got around to it because I was too busy and the learning curve was steep. I did start using Vim over Nano on servers just to get more familiar with the commands. The reason I bring this up is that if I had made that switch, I think I would have been a faster adopter of Claude Code.

And it's weird to say, because I'm a Linux guy. I've been managing servers for years. I ran an Ubuntu workstation for several years. I work with Docker. I'm very comfortable in a terminal. All day, every day I work in terminals. Everything you develop, you're shooting commands into a terminal. So it's strange to admit, but when it comes to code, you're always in an IDE. Or I was, and I think most people are. Not everybody, though. Some of the heaviest hitters just use the most basic text editor. A lot of the most elite engineers I've talked to just use Vim or Notepad. Or Emacs, of course.

Now that I've gotten a taste of Claude Code, I do still keep Cursor open, but the Cursor agent is more like a meta tool. If Claude Code is getting deep into something and I just need a quick web search or some generic commands that don't need a lot of context, it's nice to have that fallback. I also keep a today.md file open. Sometimes I'll use it to marinate in my thought process while I type out the next thing I'm going to say. Sometimes I feel too rushed typing into the command line, like there's this urgency to think faster, move faster. There's a lot of power in just stopping time and letting your thoughts manifest.

Why I Stopped Reviewing Code Inline

This feels like a weird thing to admit, but I'm going to be totally honest. My reviews have moved almost entirely to GitHub. I used to feel the need to review every line of code as these things generated it. A lot of people would say I'm just going YOLO and letting the AI code everything. I think YOLO is the wrong term for it.

If I were putting it in dangerous mode on main branch and giving it the ability to push and deploy, that's YOLO. But if I put it on its own branch and make sure everything that deploys goes through a pull request, all I've done is move that review process to a better place. Here's why.

A lot of important things have completely changed because of AI coding. I am ten times, maybe a hundred times more likely to do a refactor now. I'm ten times more likely to rebuild an entire system. So think about how often I build an entire feature, an entire module, and then just scrap it or completely redo it. Now think about reviewing all that code inline as it's being written. Think about how many times I'm reviewing code that's going to get deleted. Or how many times I'm catching errors that would have been caught by the automated test suite anyway.

I guess you could call it YOLO, except that I'm going to review it in the pull request. I just religiously use pull requests for every single deployment. You should already be doing this anyway.

How Does a Test Suite Change Everything?

One of my custom SaaS clients had a tight budget, so test coverage was thinner than I'd like. What I would consider actual YOLO is not having test coverage at all. I had good test coverage on the backend, but there weren't a lot of end-to-end frontend tests. I did already have Playwright set up, though.

I was working on a very design-intensive part of the system, a lot of front-end layout and style changes. There were a bunch of problems with the reports, and my Claude agent told me to go look and catalog everything I found. I started typing them all out, and then I stopped and said, actually, why don't you just go build a full, thorough set of Playwright tests with end-to-end coverage? Write tests for everything that's obviously needed and just run them. I admitted it straight up: I'm slow at this. I don't mind doing it, but a lot of this stuff is obvious, and if you just write the tests, you'll catch it.

I was right. The agent found a ton of test cases, wrote tests for all of them, and when we ran it, about 90% of the issues I'd seen were just cleared up. I had a small remaining list of things that really needed human eyes, things that are just hard to see without photons hitting retinas. But my QA time dropped drastically.

And then the payoff compounded. Later that same day, I needed to upgrade the React version and completely rearchitect how components were rendered. This was an SEO performance task. A lot of stuff was being rendered client-side that should have been server-side. We had to change the entire rendering approach, rip out a ton of deprecated packages, replace them with updated ones. Because that test suite was in place, it was a fifteen-minute rewrite. I was able to regression test everything, catch every issue. Some of the tests had to be rewritten, but because they existed, they served as a spec for the post-upgrade test suite. Within about twenty minutes, everything was completely upgraded and rearchitected.

Deleting a Whole Chunk of AWS

The product I'm talking about does a lot of complex work with data behind the scenes, algorithms and reporting and calculations, but ultimately what gets sold is a PDF report with graphical representations of different data. There's a lot of layout and style that has to be perfect, and these reports are very dynamic, so getting it right across many different scenarios is tricky.

I had to rewrite the entire PDF generation module. It was burning me up that I had this Next.js app deployed to Vercel, and architecturally there was no reason it shouldn't just work there. But something weird was going on with generating PDFs, and I had to get the solution out the door to land our first client. So I asked myself, what is the fastest thing I can do tonight that I know will work?

The only dependable way to generate these graphical PDF reports from HTML was to use Selenium with a headless Chromium browser and just print the screen behind the scenes. That worked every time, but the Vercel runtime didn't like it. So I bit the bullet. We already had an AWS account, we were already using SES, I have a background in Terraform. With AI, if you can explain cloud architecture in Markdown, it can build Terraform code and you can launch it with one command. I already had it running locally with Docker Compose, and AI can translate that instantly into an ECS task definition. Within minutes I had the service deployed.
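To make the Compose-to-ECS translation concrete, here is a minimal sketch of the kind of mapping involved. This is a hypothetical helper, not the code the agent actually produced; the image name and service details are made up, and the output field names follow the ECS RegisterTaskDefinition API shape.

```python
# Sketch: translate one docker-compose service into an ECS (Fargate)
# task definition dict. Illustrative only; a real pipeline would feed
# this to Terraform or the RegisterTaskDefinition API.

def compose_service_to_ecs(name, service, cpu="1024", memory="2048"):
    """Build an ECS task definition dict from a compose-style service dict."""
    container = {
        "name": name,
        "image": service["image"],
        "essential": True,
        "portMappings": [
            # compose "8080:8080" -> ECS containerPort 8080
            {"containerPort": int(p.split(":")[-1]), "protocol": "tcp"}
            for p in service.get("ports", [])
        ],
        "environment": [
            {"name": k, "value": str(v)}
            for k, v in service.get("environment", {}).items()
        ],
    }
    return {
        "family": name,
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": cpu,
        "memory": memory,
        "containerDefinitions": [container],
    }

# Example shaped like the headless-Chromium PDF service (hypothetical values)
pdf_service = {
    "image": "myorg/pdf-renderer:latest",
    "ports": ["8080:8080"],
    "environment": {"CHROMIUM_ARGS": "--headless --no-sandbox"},
}
task_def = compose_service_to_ecs("pdf-renderer", pdf_service)
```

The point is that the mapping is mostly mechanical, which is exactly why an agent can do it in minutes once the Compose file exists.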

But it always bugged me. Dependable as it was, it felt hacky. I had this intuition that it was going to cause some weird problem at scale. I didn't know what, but I sensed there was an unknown unknown that would surface. If we could just keep the whole thing serverless in the Vercel environment, we'd be good. My agents found the way to do that and just did it. In minutes. We deleted a whole chunk of AWS cloud architecture.

When a Design Expert Walked Into the Room

This is where it gets into the architecture change that I think matters most. I was working on front-end layout and style, and I had loaded up front-end skills, framework skills, SEO skills. I was getting really good results, making cool improvements, asking for advice on layout and style. But I was never a great designer. I always had better success with things that were scientific, things you could measure. Empirical data, calculations, objective improvements. Subjective stuff was always harder for me.

I knew we were making progress. We were moving the needle forward incrementally. But we weren't having that leapfrog moment. We weren't having the kind of conversation that would be next-level in an area where there was room for transformation.

So I told my coding agent: do a web search. Find any design-focused skills in the community, Claude Code skills, OpenClaw skills. Not a developer who's an expert in tooling and code patterns and SEO, but somebody who specializes in the look and feel. What's going to pop, what's going to resonate with users. All those marketing terms I could never stomach. We don't have to use those words, but we need to understand what they're talking about when they use them.

When I loaded up those design skills, it was like a different person walked into the room. The advice it gave was advice I would have expected to hear from a human designer who had been doing design for years at a serious digital marketing firm. And it wasn't some bullshitter who learned all the buzzwords with only academic knowledge. I've been around enough of these people to know the difference.

There's a version of confidence that comes from competence. You can sense it. You can tell the difference between a person who has learned all the buzzwords and knows you don't know them, so they just deploy them with confidence, and a person who knows the buzzwords only because they're the easiest way to communicate principles they picked up through experience. Through a lot of failures, a lot of websites getting zero clicks, solid layouts that check all the boxes but just don't motivate people to convert. Only after going through that do you get an intuition about what's likely to work and what isn't.

The thing I pick up on with people who have that experience is that they can see it instantly. They can get behind the eyes of the users and just know what has a high probability of impact and what doesn't. They're not starting from zero on any given system. All that experience gets them 80 or 90% of the way there. They just need to button up that last bit with A/B testing, because that's the part that's constantly evolving.

That's what showed up when I loaded those skills. That battle-tested intuition. And if you combine that with my eyeballs, actual human eyes consuming screens, you've effectively added a digital marketing design expert to your team. Our company has that expert now. I didn't have to hire anyone, because I don't have the budget right now to hire a whole human to fill that role.

What Does the Agent Architecture Look Like?

The idea that came out of all this is to create agents-as-tools in my coding pipelines. My coding agents can tap a specialist agent and say, okay, bring in the design specialist. Bring in the SEO specialist. Bring in the QA specialist to help build test coverage. Multiple perspectives from the same model, forced to look at a problem through different lenses. I'm sure somebody else has thought of this and there's probably a term for it already. It has overlap with what people have been calling a "team of experts." A bunch of different agents with different specialties.
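The agents-as-tools idea can be sketched in a few lines: the same underlying model, dispatched through different specialist prompts. The specialist names and prompts below are illustrative, not from any particular framework.

```python
# Minimal sketch of agents-as-tools: a coding agent consults a registry
# of specialist perspectives. Prompts are illustrative placeholders.

SPECIALISTS = {
    "design": "You are a digital-marketing design expert. Critique layout, "
              "visual hierarchy, and conversion impact.",
    "seo": "You are an SEO specialist. Review rendering strategy, metadata, "
           "and crawlability.",
    "qa": "You are a QA engineer. Propose end-to-end test cases and "
          "coverage gaps.",
}

def consult(specialist, task, llm=None):
    """Run one task through a specialist lens.

    `llm` is a callable (system_prompt, user_prompt) -> str; a stub that
    just echoes the request is used if none is supplied.
    """
    if specialist not in SPECIALISTS:
        raise KeyError(f"no such specialist: {specialist}")
    llm = llm or (lambda system, user: f"[{specialist}] advice on: {user}")
    return llm(SPECIALISTS[specialist], task)
```

In a real pipeline, `llm` would wrap the model call, and the coding agent would invoke `consult("design", ...)` the way it invokes any other tool.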

We also need a "call a human" tool in the pipeline. The agent needs to be able to send me a Telegram message with a link to a preview environment and ask what I think about the way something looks and feels. For architecture decisions, it could flag a pull request and say, hey, we need your review on this. Maybe a five-minute timeout. The point is that the pipeline knows when it needs human eyes and can request them.
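A minimal sketch of what that "call a human" tool could look like, assuming a Telegram-style bot as the channel. The `send` and `poll_reply` callables are stand-ins for real Bot API calls (sendMessage / getUpdates), injected so the timeout logic is visible on its own:

```python
# Sketch: notify a human with a preview link, wait for a reply, and fall
# back after a timeout so the pipeline never blocks forever. The transport
# functions are injected; real ones would hit the Telegram Bot API.

import time

def ask_human(question, preview_url, send, poll_reply,
              timeout_s=300.0, interval_s=5.0):
    """Return the human's reply, or None if the timeout elapses."""
    send(f"{question}\nPreview: {preview_url}")
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        reply = poll_reply()  # e.g. check for a new Telegram message
        if reply is not None:
            return reply
        time.sleep(interval_s)
    return None  # no answer in time; pipeline proceeds without human input

# Usage with stubs standing in for the real channel
sent = []
answer = ask_human("How does the new report header look?",
                   "https://preview.example.test/pr-42",
                   send=sent.append,
                   poll_reply=lambda: "looks good, ship it",
                   timeout_s=1.0, interval_s=0.1)
```

The `None` return is the five-minute-timeout case from above: the pipeline records that it asked, then moves on.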

The Human Role Is Shifting to Generalist

I want more humans on my team. I want friends who have my back. But until I can build up that traction and momentum, when I do hire, it won't be a marketing design expert first. It'll be someone with a more general knowledge of things, especially as this AI technology keeps advancing.

I think the humans you depend on will become more generalized. There's at least a phase we're going to go through where humans pivot into the role of the general part of artificial general intelligence. We'll move away from our specialties and be able to act as specialists in a large number of fields because the AI supplements the intuition, the perspective, the training, the experience. All we need to provide is the human experience. The photons on retinas. The gut reaction to whether something feels right. The embodied knowledge that a machine can approximate well but might still get subtly wrong.

That's where this is all heading for me. My digital marketing work was relatively straightforward to automate. You ask Google and the analytics tools what needs to change, they tell you, you build a GitHub issue from the feedback, and you get coding agents to make the changes and open the pull request. The custom SaaS work is more complex, but the improvements I'm making along the way in the simpler pipeline are giving me the roadmap to automate the harder stuff. Every kink I work out, every specialist agent I build, every test suite I stand up is another piece of that roadmap.
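The simpler pipeline described above is mostly a data transformation, which is what makes it automatable. Here is a hedged sketch of the analytics-feedback-to-issue step; the field names are made up, and a real version would create the issue through the GitHub API rather than return a dict.

```python
# Sketch: turn one analytics finding into a GitHub issue payload that a
# coding agent can pick up. Field names are illustrative assumptions.

def feedback_to_issue(finding):
    """Build an issue title/body/labels dict from one analytics finding."""
    title = f"[{finding['source']}] {finding['summary']}"
    body_lines = [
        f"**Page:** {finding['page']}",
        f"**Metric:** {finding['metric']} = {finding['value']}",
        "",
        "### Suggested change",
        finding["recommendation"],
    ]
    return {
        "title": title,
        "body": "\n".join(body_lines),
        "labels": ["ai-pipeline"],  # lets agents filter for their queue
    }

# Example finding, shaped like Search Console output (hypothetical values)
issue = feedback_to_issue({
    "source": "Search Console",
    "summary": "CTR drop on /pricing",
    "page": "/pricing",
    "metric": "CTR",
    "value": "1.2%",
    "recommendation": "Rewrite the meta description and test a new H1.",
})
```

From there, the coding agent takes the issue, makes the change on a branch, and opens the pull request, which is where the human review happens.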
