Move fast, break everything forever?
Computer scientists are worried that thanks to ChatGPT, the next generation of large AI models may be in danger of never achieving liftoff.
For roughly the last two decades, the Silicon Valley model has been to move fast and break things. This is a good approach when building the umpteenth photo sharing or burrito delivery app you want to scale to millions of users because the faster it breaks and shows stress points, the faster you can fix it and rethink your architecture. It’s not quite as good when it comes to medical testing, or dealing with political tensions, or a pandemic, or democracy, or, as it turns out, unleashing AI on the internet.
You see, the web has been dying for a while now, costing us over $100 billion in spam, ad fraud, scams, and bot traffic that’s set to overtake human activity in less than two years at its current rate. LLMs like ChatGPT have precipitously accelerated the rate of decline by flooding the web with terabytes of slop and fake images every day. Which is, ironically, making it far more difficult to build new generations of LLMs and AIs that require vast quantities of data in their training sets.
This is a problem known in AI circles as model collapse, and it happens because math is a thing. As we’ve already discussed, there are limitations on what LLMs can do and it’s inevitable that they’re going to make wrong associations, lacking any conception of reality, fact, or causality. All those errors are now out in the wild, impossible to avoid when you’re doing the equivalent of sea bed trawling the web, and you’re now training your next model on erroneous information.
Unless you can effectively and efficiently sort AI-generated data from human created content, you’re going to keep recursively adding more and more errors. It’s kind of like teaching kids that 2 + 2 = 4.375913 again and again, then wondering why they tell you that 4 + 4 = 8.8. As you keep adjusting the values to something even more wrong, the answers keep getting worse and worse, only on a web-wide scale.
In a sense, what we’re talking about is the AI analog to inbreeding. By building not on a wide diversity of information that survives fact checks and peer review and is regularly updated with the latest verified knowledge, but on whatever other AIs spat out, you’re exacerbating the underlying errors until all you’re bringing in is garbage, and all you’re getting out is, predictably, also garbage.
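If you want to watch the inbreeding happen, here is a toy sketch in Python. To be clear, this is not how anyone actually trains an LLM, and every name and number in it (`simulate`, `next_generation`, the 50-token vocabulary, the sample size of 200) is made up for illustration. It assumes a "model" is nothing more than a table of token frequencies, and that each new generation is estimated only from a finite sample of what the previous generation produced. Under those assumptions, rare tokens that happen to miss a sample are gone for good, because nothing downstream can ever produce them again.

```python
import random
from collections import Counter


def next_generation(dist, sample_size, rng):
    """Draw a finite sample from dist, then re-estimate frequencies from it."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    sample = rng.choices(tokens, weights=weights, k=sample_size)
    counts = Counter(sample)
    # Tokens that were never drawn get no entry, so they can never come back.
    return {t: c / sample_size for t, c in counts.items()}


def simulate(vocab_size=50, sample_size=200, generations=10, seed=0):
    """Return the number of surviving tokens at each generation."""
    rng = random.Random(seed)
    # Hypothetical "human" data: a few common tokens and a long tail of rare ones.
    raw = [1 / rank for rank in range(1, vocab_size + 1)]
    total = sum(raw)
    dist = {f"tok{i}": w / total for i, w in enumerate(raw)}

    support_sizes = [len(dist)]
    for _ in range(generations):
        dist = next_generation(dist, sample_size, rng)
        support_sizes.append(len(dist))
    return support_sizes


if __name__ == "__main__":
    print(simulate())
```

Run it with different `seed` values and the exact numbers change, but the count of surviving tokens can only hold steady or drop from one generation to the next. It never recovers, which is the garbage in, garbage out loop in miniature.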
AI slop can be generated millions of times faster, and blasted across the web millions of times further, than high quality human created and curated work. A lot of computer scientists who work for AI startups put on a brave face and say they can figure this out, but the reality is that their bosses should’ve thought about this before reciting the mantra of Silicon Valley: “fuck it, ship it.” Now it’s too late. The web is kind of fucked for good unless we wipe out everything created after November 2022 other than some legacy news sites and Wikipedia, and start over.
I do have to admit, it’s been very interesting to see how the aspirations of Nick Bostrom and so much of the Singularitarian movement met cold, hard reality. In their world, as AI ingests the sum total of the world’s knowledge, it will become an error-proof oracle which knows more than every human on the planet combined. Back on Earth, when AI ingested the sum total of all the knowledge we can offer it, it spat out mutilated parts and pieces of it like a toddler who spends a week with their edgy aunt and uncle, then returns with a fascinating new vocabulary.
Ultimately, the lesson in all this is that sometimes, the most innovative and paradigm-shifting course of action in today’s tech industry is to say “fuck no, don’t ship that” so you can take the time to consider the consequences of your plans first. ChatGPT was meant to generate immense hype and sell The Next Big Thing to everyone who has a credit card and internet access. It was extremely successful in that. And all it cost us was the future of AI training, the quality of the web we once used to enjoy, and utter and complete chaos in the job market. But fuck did they ship it, right?