New York Times Sues OpenAI and Microsoft for Copyright Infringement
Summary
The New York Times filed a landmark copyright lawsuit against OpenAI and Microsoft, alleging that millions of its articles were used without permission to train AI models including GPT-4. The suit, seeking billions in damages, became the most significant legal challenge to the AI industry's use of copyrighted training data.
What Happened
On December 27, 2023, The New York Times filed suit against OpenAI and Microsoft in the U.S. District Court for the Southern District of New York. The complaint alleged that the defendants had used millions of Times articles — acquired through web scraping — to train their AI systems, including ChatGPT and Microsoft's Copilot, without authorization or compensation.
The lawsuit presented evidence that ChatGPT could reproduce Times articles nearly verbatim when prompted, and argued that AI-generated summaries of news articles were already substituting for visits to the Times website, threatening the company's subscription-based business model.
The Times sought statutory damages of up to $150,000 per willfully infringed work across "millions of works," which could amount to billions of dollars. It also demanded the destruction of all AI models and training data that incorporated Times content.
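The arithmetic behind the "billions" figure is straightforward. A rough sketch, using the per-work statutory ranges from 17 U.S.C. § 504(c); the work counts here are purely illustrative assumptions, since the complaint says only "millions of works":

```python
# Back-of-envelope statutory-damage exposure.
# Per-work figures are from 17 U.S.C. 504(c); work counts are
# illustrative assumptions, not numbers from the complaint.
STATUTORY_MIN = 750              # minimum per infringed work
STATUTORY_MAX_WILLFUL = 150_000  # maximum per work, willful infringement

for works in (1_000_000, 3_000_000):
    low = works * STATUTORY_MIN
    high = works * STATUTORY_MAX_WILLFUL
    print(f"{works:,} works: ${low / 1e9:.2f}B to ${high / 1e9:,.0f}B")
```

Even at the statutory minimum, a million infringed works would imply hundreds of millions of dollars; at the willful-infringement ceiling, the theoretical exposure runs well into the hundreds of billions.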
The suit came after licensing negotiations between the Times and OpenAI broke down without a deal. Other publishers, including the Associated Press and Axel Springer, had reached licensing agreements with OpenAI, making the Times' litigation route a notable exception.
Why It Matters
The Times lawsuit became the flagship case in a growing wave of copyright litigation against AI companies, and it crystallized the fundamental legal question underlying the entire generative AI industry: does training an AI model on copyrighted material constitute fair use?
The case was significant for several reasons. The Times was by far the most prominent and well-resourced plaintiff to date. Its evidence of near-verbatim reproduction was more compelling than the abstract "style"-copying claims raised in earlier suits by authors and artists. And the potential damages, if the court ruled against OpenAI, were large enough to threaten the economic foundations of the train-on-everything approach that powered most frontier AI development.
The lawsuit also exposed a tension in OpenAI's own position: the company simultaneously argued that using copyrighted data for AI training was fair use while also signing licensing deals with some publishers — implicitly acknowledging that the content had economic value worth paying for.
The outcome of this case, still pending as of early 2026, has implications far beyond the parties involved. A ruling against OpenAI could force a fundamental restructuring of how AI models are trained; a ruling in OpenAI's favor could effectively authorize the appropriation of the internet's creative output without compensation.