The Impact of AI “Model Collapse” on Human Expression
The Rise of AI-Generated Text
I've noticed an increasing amount of AI-generated content online, including on platforms like Medium. Recently, I've received comments on my articles that seem strangely detached. While they are grammatically correct, they often lack depth and personality. Typically, they consist of generic praise (“this is great”) followed by a bland summary of the article’s main points (“this piece discusses [X, Y, and Z]”). These responses lack the unique insights and humor one would expect, coming across instead as dull recaps.
I haven’t investigated these comments thoroughly, but they suggest that someone may be using a large language model to generate comments on Medium automatically.
What could be the motivation behind this? It could be a programmer experimenting or perhaps someone creating a network of bot accounts designed to simulate normal online interactions for ulterior motives—an approach often seen in social media manipulation.
This phenomenon contributes to the growing presence of monotonous AI-generated language on the internet. A simple search for phrases like “as an AI language model” or “regenerate response” reveals a plethora of blog posts, tweets, and user reviews that exhibit these signs. Many bloggers openly confess to using AI to generate content for SEO purposes, while Reddit users have utilized it to formulate comments.
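To make that kind of search concrete, here is a minimal Python sketch that flags text containing those giveaway phrases. The function name, the phrase list, and the sample comments are all made up for illustration, and a match only shows that someone pasted in boilerplate, not that the whole text is machine-written.

```python
# Hypothetical list of giveaway phrases, taken from the two examples above.
TELLTALE_PHRASES = ("as an ai language model", "regenerate response")


def find_telltale_phrases(text, phrases=TELLTALE_PHRASES):
    """Return the giveaway phrases that appear in `text`, ignoring case."""
    lowered = text.lower()
    return [phrase for phrase in phrases if phrase in lowered]


# Invented sample comments, not real data.
comments = [
    "As an AI language model, I cannot have opinions on this article.",
    "Loved the bit about bot networks, it made me laugh.",
]

# Keep only the comments that contain at least one giveaway phrase.
flagged = [comment for comment in comments if find_telltale_phrases(comment)]
print(flagged)
```

A plain substring check like this is deliberately crude. It catches only the laziest copy-and-paste cases and says nothing about text that has been lightly edited.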
Understanding “Model Collapse”
Recently, I came across a thought-provoking academic paper discussing the concept of “model collapse.” This term refers to the complete degradation of AI language models when they are trained on outputs from other models.
Historically, companies such as OpenAI and Google have trained their AI systems using content created by actual humans. Although they haven’t fully disclosed their training materials, it is likely that their datasets include a wide range of internet content, including Wikipedia, Reddit posts, books, and manuals. Crucially, they also rely on text generated by teams of human trainers during a vital reinforcement-learning phase. The goal is to capture the intricate patterns of human language.
However, as the paper indicates, problems arise when the internet becomes saturated with AI-generated content. This situation could lead to future AIs being trained at least partially on the outputs of previous AIs.
Such a scenario raises significant concerns, as newer models may carry forward the biases and flaws of their predecessors. The researchers found that even a small percentage—around 10%—of AI-generated data in the training set could lead to bizarre and incoherent outputs.
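The basic feedback loop can be imitated with a toy example. The sketch below is my own illustration, not the paper's experiment: the "model" is just a bell curve fitted to ten numbers, and each new generation is trained only on samples drawn from the previous generation's fit. The function name, sample size, generation count, and seed are all arbitrary assumptions.

```python
import random
import statistics


def simulate_generations(num_generations=500, sample_size=10, seed=0):
    """Repeatedly fit a Gaussian to data, then replace the data with
    samples drawn from that fit. Returns the fitted spread per generation."""
    rng = random.Random(seed)
    # Generation 0: stand-in "human" data from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(num_generations):
        # "Train" this generation's model: estimate mean and spread.
        mean = statistics.fmean(data)
        spread = statistics.stdev(data)
        spreads.append(spread)
        # The next generation sees only this model's output.
        data = [rng.gauss(mean, spread) for _ in range(sample_size)]
    return spreads


spreads = simulate_generations()
print(f"first fitted spread: {spreads[0]:.3f}")
print(f"last fitted spread:  {spreads[-1]:.3g}")
```

In this toy setting the fitted spread tends to shrink generation after generation, because each refit loses a little of the variety in the data and nothing ever restores it. A real language model is vastly more complicated, so treat this as an intuition pump rather than evidence.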
Preventing Model Collapse
To mitigate the risk of “model collapse,” it seems that AI companies are already contemplating various strategies. One approach involves maintaining a high-quality, human-produced dataset that remains untainted by AI-generated content. This dataset could then be used for periodic retraining or complete refreshes of the model.
Another approach is to keep feeding fresh, human-authored data into the training process, which should improve response quality and reduce unwanted errors.
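Here is a rough sketch of why fresh human data might help, using a toy setup of my own rather than anything an AI company has described. The "model" is a bell curve refitted every generation on ten numbers, and `human_fraction` (a hypothetical parameter) sets how many of those numbers come from the original "human" distribution instead of from the previous model.

```python
import random
import statistics


def run_training_loop(human_fraction, num_generations=500, sample_size=10, seed=0):
    """Refit a Gaussian each generation on a mix of fresh "human" samples
    and samples from the previous generation's fitted model."""
    rng = random.Random(seed)
    num_human = round(human_fraction * sample_size)
    # The starting model matches the "human" distribution: mean 0, spread 1.
    mean, spread = 0.0, 1.0
    spreads = []
    for _ in range(num_generations):
        # Fresh data from the unchanged "human" distribution.
        human = [rng.gauss(0.0, 1.0) for _ in range(num_human)]
        # Data produced by the previous generation's model.
        synthetic = [rng.gauss(mean, spread) for _ in range(sample_size - num_human)]
        data = human + synthetic
        mean = statistics.fmean(data)
        spread = statistics.stdev(data)
        spreads.append(spread)
    return spreads


synthetic_only = run_training_loop(human_fraction=0.0)
half_human = run_training_loop(human_fraction=0.5)
print(f"final spread, synthetic only: {synthetic_only[-1]:.3g}")
print(f"final spread, half human:     {half_human[-1]:.3g}")
```

With no human data the fitted spread tends to dwindle toward zero, while the half-human run keeps getting pulled back toward the original distribution. How much fresh data a real system would need is a separate question that this toy cannot answer.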
The implications of model collapse highlight the enduring value of authentic human-generated content.
The Value of Human Creativity
The emergence of large language models has sparked numerous discussions about the future of human creativity. Concerns often revolve around two main issues: the fear that AI could shrink the market for creative work (who would pay for writing if a bot can do it for mere cents?) and the anxiety that AI might dampen the motivation to write (why bother, if an AI can deliver similar results effortlessly?).
While there are many other concerns regarding AI—ranging from ethical considerations to its role in misinformation—let's focus on these two aspects for now.
As for the market for original human expression, model collapse suggests that genuine human writing will continue to hold substantial value, at least from an industrial perspective. That doesn't mean I can promise a financial windfall for human creators. Today's monopolized markets often favor established players, so everyday individuals might not see their contributions adequately rewarded. This pattern has persisted for thousands of years: human creativity is essential for societal flourishing, yet it is often economically overlooked.
Ultimately, it appears that even AI systems rely on human creativity to function effectively.
(If you appreciated this essay, crafted entirely by a living human being, feel free to hit that “clap” button! You can express your support up to 50 times per article!)
Author Bio
I am a contributing writer for the New York Times Magazine, a columnist for Wired and Smithsonian magazines, and a regular contributor to Mother Jones. My works include "Coders: The Making of a New Tribe and the Remaking of the World" and "Smarter Than You Think: How Technology is Changing our Minds for the Better." You can find me on Twitter and Instagram as @pomeranian99 and on Mastodon at @[email protected].