Mid-term Consequences of AI-Assisted Coding

2024 and 2025 have seen a surge in available coding assistants that rely on LLM technology. There's even a new term for that: 'vibe coding'.

Well, looking back at roughly 30 years of coding and leading software engineering teams, I have my doubts about this. I tried it myself and was rather underwhelmed. But this is not what this post is about.

I'm concerned about the mid- to long-term repercussions continued use of LLM-assisted coding may have on the software industry as a whole.

It's Theft

Let's all be honest about this. Not a single LLM vendor out there cares about copyright, copyleft, content licensing, content ownership or anything like that. And here, "doesn't care" is a very kind wording that avoids the much stronger words I'd like to use. They're just hungry for data. Any data.

In my perception, this is theft. They're stealing content they're not entitled to use, training their models on it, and creating commercial products out of it. In addition, they don't really give anything back to the communities they stole from. Not even attribution.

And we're not even getting into the complicated issue of copylefted material here, and whether token generation by an LLM can violate the GPL.

Industry Built on FOSS

Back in the 90s, people would silently smile at the thought of Free and Open Source Software (FOSS). Clearly, proprietary software was superior. Right?! If I look at how successful proprietary software is created today, it absolutely relies on huge amounts of FOSS. I've seen software projects with literally thousands of open source dependencies. The Java and JavaScript worlds in particular make super heavy use of it.

And everybody wins.

We, who create products, find well-written and well-maintained libraries and services for important functionality out there. A mistake is found and fixed, you update your dependencies (You do, right? You scan your deps for known vulnerabilities.), and whoop, problem fixed. Maybe a security issue was found and fixed, or a gain in efficiency helps every user of the library to be a tiny bit more sustainable. Other examples exist. In summary, our products get better through the work of tens of thousands of people out there.

The initial parties who open-sourced the code also get benefits:

The whole software engineering industry is built on top of a huge number of open source libraries and services. If you're working in that industry and are not at NASA, you know this.

Nobody Wins Anymore

Now, with code generation by LLMs, nobody wins anymore:

Consequences

One of the foundations of today's software industry will be severely damaged. Some will argue that this might be a good thing because it will be replaced by something better. I, personally, don't see it that way.

Remaking Mistakes and Security Issues

Companies that use generated solutions instead of FOSS solutions built and maintained by an invested group of people and companies will recreate the same mistakes over and over again. Some of those mistakes will turn into security issues in their solutions, and no community will be there to fix them. This will become even worse with the code rot mentioned below.

This will increase the cost of software development, even though the introduction of LLM-based code generation was supposed to save money. And it will create more risk for all of us who use software every day and would like to trust it.

Code Rot

There will be less FOSS code out there because the publishers lose their benefit. That will lead to the LLMs no longer getting trained on modern code. Already today, the code generated by an LLM tends towards old and outdated solutions. When working on an Android project, I had to constantly remind the AI to generate up-to-date solutions, and even then, it didn't. APIs and other interfaces will evolve, just as programming languages and maybe even paradigms will, but the machines will not know about it. Even if we see good code being generated in the near future (which I don't really believe), it will get worse over time.

It's somewhat ironic, but I really think that by mistreating copyright and terms of use introduced by licenses, the LLM vendors hurt themselves in the long run.

Don't Feed the Machine

As a rather seasoned individual working in the software industry, I don't want to feed the machine. I prefer not to pay those vendors until they stop stealing code.

Oh, and I don't feel like open-sourcing anything I might be working on privately.

It has started.

What Else?

I firmly believe that LLM vendors should obey copyright laws, copyleft and other licenses when collecting material for their training. And when generating output, they should give credit to the authors of permissively licensed FOSS. That would be a huge step forward in my eyes.

Old Guy Disclaimer

Yeah, I know. Maybe I'm just too old. Could be. Only time will tell.