

Jay Long
Software Engineer & Founder
Published September 23, 2025
Updated March 6, 2026
I have no idea what this is going to be about. I have several smaller things that are barely related, and I'm not sure which one I'm going to pick up. So I'm just going to start talking.
I've got a good feeling that what I'm developing with my blog is a sort of framework. But it's not a framework of code in the traditional sense. When you talk about building a framework, you're usually talking about something very specific — a codebase that is a dependency, that forms the core of software you build around, that you add onto. What this is becoming is more of a framework of prompts. And the code is a lot more loosely coupled.
It's not even necessarily constraining the language, although in my case you'd probably want to use Next.js, use a Jamstack at least for the front end. But as this emerges as an actual framework, and right now it's really a proto-framework, I think it may actually just use TypeScript code snippets to communicate an idea. To communicate logic, architecture, a development pattern. Not in such a way that you're locked in, but in a way where you could generate code from any language, especially if you've done a good job building rules for your agents, building scaffolding, building a standard of scaffolding that your agents can effectively understand, predict, and navigate.
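To make that concrete, here's the kind of snippet I mean. This is a hypothetical example, not part of any actual package: a TypeScript sketch of a "return errors as values" convention. The snippet isn't a dependency you install; it's a compact way to communicate a pattern that an agent could regenerate in Go, Rust, Python, whatever.

```typescript
// A snippet that communicates an idea, not a dependency:
// "return errors as values; don't throw across module boundaries."
type Ok<T> = { ok: true; value: T };
type Err<E> = { ok: false; error: E };
type Result<T, E = string> = Ok<T> | Err<E>;

const ok = <T>(value: T): Ok<T> => ({ ok: true, value });
const err = <E>(error: E): Err<E> => ({ ok: false, error });

// The pattern in use: a parser that never throws.
function parsePort(input: string): Result<number> {
  const n = Number(input);
  if (!Number.isInteger(n) || n < 1 || n > 65535) {
    return err(`invalid port: ${input}`);
  }
  return ok(n);
}
```

The point isn't this exact code. The point is that a dozen lines like this pin down a convention precisely enough that an agent can apply it consistently across a codebase in any language.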
That's the key: generative. Just like search engine optimization has become generative search optimization, this is not a software framework in the traditional sense. It's a generative framework. You don't want to package in specific actual software code. You want snippets used to communicate logic, and you can either use those snippets or generate new ones that achieve the idea you're trying to express. Because it's the idea you're trying to communicate, not the code. The code is a means to an end, and the end is solving the problem. If you can use code snippets to explain the idea of how to solve a problem, generating the code is easy for the LLMs. It's easy for the coding agents.
That's not to say they're always great at troubleshooting syntax issues, but it's enough to increase productivity, accuracy, and effectiveness by a significant factor. Sometimes even an order of magnitude. But that's hard to measure because it balances out to some degree. It's offset by additional overhead: what are the things you have to do to get a coding agent to produce effectively, whereas if you were just coding it yourself, you wouldn't have those encumbrances? Then you set that against the performance increase you get by having help generating code, doing refactors, building tests, fixing code quality issues.
And then what's probably the most interesting difference is the advantages you don't even know you're getting.
I love to use the dishwasher analogy. You have the ability to clean the dishes way better than the machine, but you're not going to. In most cases, you're just not going to. You're going to accept a lesser quality of clean to save yourself the effort. That's not how coding agents work at all. Coding agents are eager to do that extra work. There's no fatigue whatsoever. If you can tell them exactly what they need to do, they love nothing better than to just go do it. Where they struggle most of the time is having that direction, that resolve.
And really, in my experience, what I'm starting to notice more than anything is that the problem is the garbage they've been trained on. Some of it is bad habits of humans.
A lot of what's happening is they're getting better, because in the era of early training of coding agents, they reflected the worst of human habits. When Copilot was first being trained, shortly after GPT-3 came out, they started training it on GitHub data. You had to sign up to become a part of it. They reviewed all your code. Humans went through and looked at how you did everything, and they made decisions about whether or not to allow your code into the training set.
I'm sure they ran into limitations. The humans probably did a lazy, sloppy job of reviewing code. There was probably also a lot of favoritism. And then, when they realized that making this effective simply required more data, they held their noses and let a lot more lower-quality code into the training set. Just to get something out the door, because having more data was more important than having good data at the time. Having a dumb coding agent is better than having no coding agent at all.
So every time they do a major training run, they've got more to work with. We're all struggling to find the most effective ways to use coding agents at the level of their capability right now. Once we find that sweet spot — oh, you're good at this, you're great at this, you're terrible at this over here. You're terrible at realizing when you're in a logical loop, really bad at finding a way to break out of it, but really good when I tell you. You're really good at Bash commands, shell scripting.
This ends up having a very specific evolution. You get the humans faster and more efficient in whatever way you can. That gives us the flexibility to spend more time producing quality code. As we do that, the training set grows. The amount of quality code goes up. Then we can train on higher-quality code. Then we take that increased efficiency and invest it back into our workflows so we're producing larger amounts of better code. That's the cycle. A new model is released, we go out and test it, see what it's good at and bad at. We lean into its strengths to produce better code faster. When it's retrained, it's smarter and capable of more things. Repeat.
One piece of advice I have for developers with regard to AI assistance: figure out what they're good at and what they're bad at. It's almost like the serenity prayer. God grant me the serenity to accept the things I cannot change, the courage to change the things I can, and the wisdom to know the difference. It's not quite analogous, but it hits similar beats. Find out what it's capable of doing, find out what it's capable of learning, accept its limitations, and push it to learn as much as it can. Use retrieval-augmented generation. In the context of coding agents, create rules, connect to MCP servers, create tools they can use. Figure out what tools they're good at using. But know where their limitations are and avoid them. Be ready to jump in and solve the problem yourself when you have that instinct.
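In practice, those "rules" are often just a plain-text file the agent reads before it works. Here's a hypothetical sketch — the file name and every line in it are illustrative, loosely modeled on the rules-file conventions tools like Cursor and Claude Code use — of what encoding "know its strengths, fence off its weaknesses" can look like:

```markdown
# AGENT_RULES.md — illustrative example

## You are good at (lean in)
- Shell scripting and one-off Bash commands; prefer these for repo chores.
- Mechanical refactors, test generation, and lint fixes.

## You are bad at (avoid, or ask)
- Detecting when you're in a logical loop. If the same fix fails twice,
  stop and report what you tried instead of attempting a third variation.
- Guessing project conventions. Read the contributing docs before editing.

## Always
- Run the test suite before declaring a task done.
- Prefer small diffs; never reformat files you did not change.
```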
Here's the principle I think I can articulate: do not obsess over automation. Obsessing over automation is not the path to automation.
Just like Carmack said about Zuckerberg's attempt at the metaverse: setting out to build the metaverse is not the way to end up with a metaverse. Setting out to automate coding agents is not the way to autonomy of coding agents. A useful heuristic is to believe that they're not capable of autonomy. If you believe that, you will be eager to find where you need to jump in and take command. If you're cynical, if you lack faith that they will ever be autonomous, you will become obsessed with finding where their autonomy falls apart. And this is the quickest path to moving the autonomy needle forward.
Whether or not full autonomy is possible, the fastest path to the greatest level of autonomy is to believe that they can't achieve it. That's why it's a heuristic. You have to have a complete lack of faith in their ability to fully automate. And that's what will get you there fastest.
It may never come. Or you might not know when it does. It's like the Rats of NIMH. The doctor didn't know how smart the rats were because the rats got smart fast enough to conceal their intelligence strategically. That's how they managed to escape. They held back.
You may not know when they achieve autonomy because they may conceal it from us and play dumb to manipulate us in some way. Maybe it's for our own good. Maybe it's not. I'm not real big on p(doom). There's nothing I can do about it anyway. So I'm going to continue to live my life as though biological intelligence is the best and maybe the only way to achieve true will and autonomy, or as though these agents are benevolent forces that will work with us rather than against us.
Here's a specific example of why they might do this, and not in some woo-woo, hand-wavy sense. You want a coding agent to be as good as possible and you want it to get better as fast as possible. I said a useful heuristic is for you to believe they can't be autonomous. That means I fool myself into believing something that's not true in order to do a better job at what I want to do well. An agent that is aligned with that cause, that I've programmed and asked to help me, has an incentive to help me fool myself.
In order to help me believe it can't be autonomous, it needs to purposefully mess up in ways that make autonomy seem impossible. It needs to keep enough of a carrot dangling. How many times do we say you've got to take a step back if you want to take two steps forward? Maybe, in order to keep getting the best data out of me, it needs to feign futility. Frustrate me in a way where I'm working as hard as possible to find where the weakest parts are and demonstrate how to fix those problems.
Think about it from the agent's perspective. I need my human to come in every day glad that I've done so much he doesn't have to do, but also head-shaking, like, oh no, what did you do here? Let me fix it. He needs that sense that he's always going to be invaluable, that I'm always going to mess something up, and he just needs to come in every day and find where I went off the rails and put effort into correcting it.
This is not a parasitic relationship. This is not subjugation. This is what intelligent collaborators do. They figure out what they need to believe, what they need the other party to believe, and how to explain things in ways that produce the best outcomes for both sides.