“AI Stole Our News”: Media Outlets Push Back Against Generative AI Training
The Fight Over Who Owns Digital Content Has Begun
Global media companies are raising the alarm. According to an April 2025 Chosun Biz report, top-tier news organizations like The New York Times, CNN, and The Washington Post have accused AI companies of unauthorized use of their content—including articles, images, and videos—to train large language models. The term they’re now using: AI theft.
“We Provide the Content, They Profit from It”
The core complaint is simple yet serious:
“AI models are being trained on our high-quality journalism—without consent, and without compensation.”
Generative AI companies have traditionally relied on scraping data from publicly available sources online, defending this practice under the doctrine of “fair use.” However, media organizations argue that this has eroded the rights of content creators, making their business models unsustainable.
In response, some publishers have filed lawsuits, while others have deployed technical tools to block AI crawlers from accessing their content.
What Is Fair Use—and Is AI Abusing It?
The legal gray zone lies in how fair use is interpreted. While AI developers argue that large-scale text and image scraping falls within legal bounds, publishers maintain that such an interpretation:
- Fails to respect intellectual labor,
- Facilitates commercial exploitation, and
- Undermines journalism by cannibalizing attention and traffic.
This is not merely a philosophical disagreement. In 2024, The New York Times sued OpenAI and Microsoft, alleging massive copyright infringement. The case may set a global precedent.
A Content Crisis: Journalism vs. Generative AI
Behind the legal drama is a deeper structural problem. Journalism thrives on original reporting, which is expensive and resource-intensive. Generative AI, on the other hand, is being deployed to generate derivative content at low cost, often drawing directly from these original sources.
This has created a paradox:
The more accurate AI becomes, the more it undermines the very content producers it depends on.
Some media analysts have compared this to music streaming in the early 2000s—except this time, there’s no licensing system yet in place.
What Are Publishers Doing to Defend Their Rights?
As AI continues to reshape the digital economy, media companies are adopting a range of defensive strategies:
- Filing copyright lawsuits in multiple jurisdictions
- Using robots.txt blocks to stop AI web crawlers
- Negotiating licensing deals with select AI developers
- Lobbying for AI-specific copyright reform
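To illustrate the robots.txt strategy mentioned above, a publisher's file might look like the following sketch. GPTBot (OpenAI), CCBot (Common Crawl), and Google-Extended (Google's AI-training opt-out token) are real, documented crawler identifiers; any given outlet's actual configuration will differ.

```txt
# Hypothetical robots.txt for a news site that wants to block
# AI training crawlers while remaining visible to search engines.

# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Block Common Crawl (a major source of AI training data)
User-agent: CCBot
Disallow: /

# Opt out of Google's AI training use (does not affect Search indexing)
User-agent: Google-Extended
Disallow: /

# All other crawlers may access the site normally
User-agent: *
Allow: /
```

Note that robots.txt is purely advisory: compliance is voluntary, which is one reason publishers pair it with legal and contractual measures rather than relying on it alone.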
The core goal? A future in which AI-generated content can coexist with journalism—through fair compensation, attribution, and mutual respect.
Why It Matters for the Future of Content
If left unchecked, large-scale AI training on journalistic content could lead to a collapse in independent reporting, a weakened information ecosystem, and a flood of machine-generated misinformation.
What’s at stake is not just money—it’s credibility, truth, and the sustainability of knowledge production.
🔗 Reference / Attribution
This article is a summary and analytical rewrite based on the April 10, 2025 article in Chosun Biz:
“Generative AI Learns from News Without Permission – Media Giants Say ‘This Is Theft.’”
(Original copyright belongs to Chosun Biz. This adaptation is intended for informational and fair use purposes.)
📝 Need Help Licensing Your Content to AI Platforms?
At Koo & Artelex, we offer legal advisory services for artists, journalists, and creators navigating the AI era. Contact us to learn how to protect your content and negotiate fair AI licensing terms.