About that Gig Fixing Vibe Code Slop

AI now lets anyone write software, but it has limits. People will call upon software practitioners to fix their AI-generated code. Learn how to decide when to take that gig.
Venn diagrams showing that AI lets more people make software
Everyone Can Make Apps Now, sort of
All over the world, people who haven't written a line of code in their life are using chat bots and coding agents to code up their dream product or app.
Venture capitalists, MBAs, project managers, visual designers, auto mechanics, and even elementary schoolers are now able to obtain working implementations of their ideas, and for a fraction of the cost. In 2025, you can pay a developer $10,000 to make a web app in a few weeks, or you can pay an AI company $2 in tokens and ship in a few hours.
But LLMs have limits:
- Context windows limit how much code and conversation they can digest.
- Hallucinations interpolate false information.
- A lack of long term memory means each morning is the first day on the job.
- A lack of emotion and metacognition means they don't get that "sinking feeling" that causes humans to pause and assess their choices.
The ease of starting a software project leaves nontechnical people surprised when they hit a productivity wall. This is a risk for them 1 2, but an opportunity for you.
Practitioners to the Rescue, maybe
When AI leaves a nontechnical person up a creek without a paddle, there's a nonzero chance they'll be ready to hand the work over to a professional.
Why? Maybe their reputation is on the line, or their company's revenue. Maybe their proof of concept demonstrated enough value to justify making it production-ready. Maybe the prototype is already in production and its bugs have caused downtime, data loss, or customer churn.
Whatever the circumstance, if you can read and write software without LLMs, you can charge money to make their problem your problem. When you receive that call, email, or LinkedIn message, what will your answer be? Before you sign, you should know what you're getting into.
You're dealing with a special category of legacy code.

A Venn diagram showing that AI-generated code is a kind of legacy code
What Is Legacy Code?
You know you're dealing with legacy code when either:
- It's a huge codebase: even ideal code in high quantities is expensive to change.
- You have low confidence that the code is or will remain correct.
The general topic of legacy code maintenance is already addressed in some great literature 3 4. Since LLMs can't yet generate very large codebases, this short post will focus on confidence.
How do we know when software is correct?
Software practitioners have three gates at their disposal to catch software defects:
- Runtime. The app misbehaves when you run it. Errors can surface to users as error messages, or stay hidden in logs. Unit and integration tests attempt to reveal errors at runtime.
- Compile time. The compiler yells at you on your dev machine or in your CI/CD pipeline because, for example, you can't do `Dog fido = new Cat()`. You can't run code that won't compile (see the sketch after this list).
- Design time. This one is trickier. A bad design will compile and run, and there's no automated way to evaluate design. Design is about fulfilling the software's requirements in a way that keeps the software simple and easy to change. It's about structure and language: boxes and arrows; lexicons and metaphors. I explored the relationship between design and code in a previous post 5.
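Here's a minimal C# sketch of the first two gates; the `Dog` and `Cat` types are just illustrations:

```csharp
// Design time: someone decided Dog and Cat are unrelated types.
// That design decision is what lets the compiler catch the error below.
class Dog { }
class Cat { }

class Program
{
    static void Main()
    {
        // Compile time: uncommenting this line stops the build.
        // error CS0029: Cannot implicitly convert type 'Cat' to 'Dog'
        // Dog fido = new Cat();

        // Runtime: this compiles, but throws InvalidCastException,
        // and only when the line actually executes.
        object pet = new Cat();
        Dog fido = (Dog)pet;
    }
}
```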
You will notice that runtime occurs after compile time, which occurs after design time. In general, it is most cost-effective to shift errors from runtime to compile time and from compile time to design time. When using an iterative SDLC, the time between each step is shortened, but the sequence is still there.

It's more expensive to fix code at runtime than at compile time and at compile time than at design time.
Unfortunately, AI rushes people to runtime as fast as possible.
When do we have low confidence?
From the previous points, we can see that there are a few reasons one might feel reluctant to change legacy code:
- Runtime
- There are no automated regression tests.
- There are no manual regression tests.
- Compile time
- The language doesn't discover errors at compile time, i.e. it is interpreted, dynamically typed, or weakly typed. Examples include Python, Ruby, and JavaScript.
- The code doesn't make use of the compiler's type system. For example, it uses strings like `"Employee"` and `"Admin"` instead of value types like `record Employee; record Admin` or enums like `enum User { Employee, Admin }` (see the sketch after this list).
- Design time
- There is no specification, i.e. what the code should do is not written down.
- There is no documentation, i.e. no knowledge about the code's internal structure or design is written down.
- There are no domain experts, i.e. people who know about the problem the software solves.
- There are no code experts, such as a developer who wrote the code. The bus factor 6 is zero.
- The code is badly structured, i.e. hard to understand and hard to change.
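A minimal sketch of that compile-time point, reusing the `User` enum from the bullet above; the `CanDelete` helpers are hypothetical:

```csharp
using System;

enum User { Employee, Admin }

class AccessControl
{
    // Stringly typed: the typo "Adnin" compiles fine and fails
    // silently at runtime, only on the code path that hits it.
    static bool CanDeleteStringly(string role) => role == "Admin";

    // Enum typed: the compiler rejects anything that isn't a User.
    static bool CanDelete(User role) => role == User.Admin;

    static void Main()
    {
        Console.WriteLine(CanDeleteStringly("Adnin")); // False, silently
        Console.WriteLine(CanDelete(User.Admin));      // True
        // CanDelete("Admin"); // compile error CS1503:
        //                     // cannot convert from 'string' to 'User'
    }
}
```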
Unique Qualities of AI-Generated Code
AI-generated code inherits the characteristics of legacy code but is especially likely to have these:
- Low internal quality. You'll hear this feedback from developers: "this code makes no sense". They may be referring to the code's poor structure (duplication, inconsistent patterns) or to poor naming and a lack of coherence between the code and the part of the world the software interacts with. The result is code that's hard to understand and hard to change.
- Low external quality. You'll hear this kind of feedback from non-developers, with phrases like, "this thing kind of does what I want, but not quite" or "this thing is so buggy". The code fulfills its requirements partially or circumstantially.
- High Quantity to Age Ratio. You don't need AI to write legacy code. Humans do it every day, but AI can do it faster.
The Confidence Rubric
My Confidence Rubric evaluates the previously discussed factors to score how painful a codebase will be to work in.
Spec | Docs | Code Expert | Domain Expert | Tests | Structure | Language
---|---|---|---|---|---|---
? | ? | ? | ? | ? | ? | ?
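If you want to operationalize the rubric, here's one hypothetical encoding; the record shape and the naive scoring function are my own invention, not a standard:

```csharp
using System.Linq;

// A hypothetical encoding of the Confidence Rubric.
// true = the factor works in your favor; false = it works against you;
// null = not yet evaluated (like Structure before you've seen the code).
record ConfidenceRubric(
    bool? Spec, bool? Docs, bool? CodeExpert, bool? DomainExpert,
    bool? Tests, bool? Structure, bool? Language)
{
    // Naive score: the fraction of evaluated factors that are favorable.
    public double Score()
    {
        var factors = new[] { Spec, Docs, CodeExpert, DomainExpert,
                              Tests, Structure, Language };
        var evaluated = factors.Where(f => f.HasValue).ToList();
        return evaluated.Count == 0
            ? 0.0
            : evaluated.Count(f => f!.Value) / (double)evaluated.Count;
    }
}
```

Filled in for the two scenarios below, James scores 1 favorable factor out of 6 evaluated; Mark scores 4 out of 6.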
Let's fill in the rubric while considering two made-up scenarios.
Scenario: Auto Repair Shop
James runs a car repair shop by himself. A YouTube video taught him to use Lovable to write a web app to track his small business's tools and parts inventory. It wrote the frontend in ReactJS and the backend in Python+Flask+Postgres. There are no automated tests, docs, or specs.
The app works fine, but Lovable has trouble iterating from there. After James hires two employees, he needs to track who has which tools. He asks Lovable what to do. It recommends implementing identity, authentication, and authorization. That's total overkill: OAuth is hard, and simpler solutions would work. James doesn't know that, so he says go for it. Lovable goes down a rabbit hole and can't get it working. After burning four hours on a Sunday afternoon, James gets fed up and the next day calls a software consultancy.
Here's our confidence rubric for James:
Spec | Docs | Code Expert | Domain Expert | Tests | Structure | Language
---|---|---|---|---|---|---
❌ | ❌ | ❌ | ✅ | ❌ | ? | ❌
There's only one thing going for this codebase, which is that you can talk to James.
Notice we haven't evaluated the structure. We won't cover that deep topic in this post, but if you have the opportunity, sign an NDA and have a look at the actual code before committing to the job.
Scenario: Small VC Firm
Susan studied at a major US university. She took a year of computer science before switching to an MBA. After college, she worked at Big Enterprise, but for the last five years she's worked at a small venture capital firm. She's not a developer, but she remembers some things from her CS program and her time at Big Enterprise. She follows a tutorial and has Claude Code write a web app to manage her company's investment portfolio. She asks Claude to use the same technologies she knew worked at Big Enterprise: ASP.NET Core Blazor + Azure App Service and Managed SQL Database. Following the tutorial, she writes her requirements in a `Requirements.md` file, has Claude track its progress across sessions in a `Tasks.md` file, and keeps a `README.md` file up to date. She has Claude commit incrementally using git. There are integration tests which run the app in a browser against an in-memory DB, click buttons, fill and submit forms, and verify expected changes in the DB.
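One of Susan's integration tests might look something like this sketch using Microsoft.Playwright and xUnit; the URL, selectors, and portfolio details are invented for illustration:

```csharp
using System.Threading.Tasks;
using Microsoft.Playwright;
using Xunit;

public class PortfolioTests
{
    // Hypothetical end-to-end test: drive the app in a real browser,
    // submit a form, and assert the new row shows up. Assumes the app
    // is already running locally against its in-memory DB.
    [Fact]
    public async Task AddingAnInvestmentShowsItInThePortfolio()
    {
        using var playwright = await Playwright.CreateAsync();
        await using var browser = await playwright.Chromium.LaunchAsync();
        var page = await browser.NewPageAsync();

        await page.GotoAsync("http://localhost:5000/investments/new");
        await page.FillAsync("#company-name", "Acme Robotics");
        await page.FillAsync("#amount", "250000");
        await page.ClickAsync("button[type=submit]");

        await Assertions.Expect(page.Locator("#portfolio"))
            .ToContainTextAsync("Acme Robotics");
    }
}
```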
The app is a hit, since everyone at the firm is tired of sharing spreadsheets. Not long after, Susan retires, and her younger colleague Mark inherits her work. The app hums along unmaintained for months. One day, the CEO is out on the putting green and opens the web app on his phone. He notices that it looks terrible; Susan only ever tested it on her 1080p work monitor. He calls up Mark and asks him to make it mobile-friendly. Mark has no idea and is busy; he immediately reaches out to a consultant.
Here's our confidence rubric for Mark:
Spec | Docs | Code Expert | Domain Expert | Tests | Structure | Language
---|---|---|---|---|---|---
✅ | ✅ | ❌ | ❌ | ✅ | ? | ✅
Without even looking at the code, Mark's project is already looking easier to work on than James's, but there is still more to consider.
Human and Organizational Factors
Outside of the code, nontechnical factors can affect any software project: difficult personalities, tall management structures, entrenched political actors, and misaligned expectations.
While James's code might be a bigger lift, I'll bet working with him would be more straightforward than proxying through Mark to his CEO, who is busy flying around doing VC things. On the other hand, I'll bet that a VC firm is less price sensitive than a car repair shop.
Managing Expectations
"Stable" and "boring" often describe legacy code: it's old and business-critical, and an established organization understands that it takes time to turn a big ship.
With AI-generated code, there's none of the stable cash-minting of an old lethargic codebase. While it may not be business-critical, there is likely schedule pressure to fix the mess. To the business, you'll feel slow compared to the quick start AI gave them. You'll need to communicate expectations and sell the business on the ROI of finishing the job.
Summary
When you're offered a job or gig fixing AI-generated code, pause before you commit.
- AI-generated code is a lot like legacy code, with some unique characteristics.
- Not all AI-generated codebases are the same. Use the Confidence Rubric to discern between them.
- Consider the human and organizational factors.
- Consider the customer's willingness to pay.
- Bigger headaches merit bigger checks, because you bring more value to the business.