A historic legal showdown in the AI arena

Encyclopædia Britannica and Merriam-Webster Sue OpenAI: Giant Clash Over AI Training Data

In a groundbreaking legal move, Encyclopædia Britannica and Merriam‑Webster have filed a lawsuit against OpenAI for allegedly scraping and using nearly 100,000 of their articles without permission to train ChatGPT. The lawsuit accuses OpenAI of copyright and trademark infringement, arguing that the AI‑generated outputs harm both their brands' reputation and revenue. As the tech community debates the implications, this case could set the stage for future AI data use and digital rights.

Introduction to the Britannica Lawsuit Against OpenAI

In a significant move within the intellectual property and artificial intelligence domains, Encyclopædia Britannica, Inc. and Merriam‑Webster, Inc. have initiated legal proceedings against OpenAI. The lawsuit, filed in the US District Court for the Southern District of New York on March 13, 2026, marks a pivotal moment as the plaintiffs accuse OpenAI of copyright and trademark infringement for allegedly using a large volume of their proprietary content to train its language model, ChatGPT, without authorization. This legal confrontation not only underscores the ongoing tensions between content creators and AI developers but also raises crucial questions about the ethics of data usage in technology development.
The allegations center on claims that OpenAI illegally scraped almost 100,000 online articles, encyclopedia entries, and dictionary materials from Britannica and Merriam‑Webster, and that ChatGPT reproduces them. According to the lawsuit, this unauthorized use of copyrighted content not only diminishes the plaintiffs' online traffic but also threatens their revenue and brand integrity. The plaintiffs are also concerned that the inaccuracies AI occasionally produces, often termed "AI hallucinations," could tarnish a reputation for accuracy built over more than two and a half centuries.
      Besides copyright transgressions, the lawsuit also addresses issues of trademark infringement. Britannica and Merriam‑Webster argue that the manner in which ChatGPT generates and presents its responses could mislead users into falsely associating the outputs with their prestigious brands. Such implications of false endorsement pose a risk to their brand's credibility and consumer trust. The case follows a similar suit filed against Perplexity AI, illustrating a growing pattern of legal challenges aimed at protecting intellectual property rights in the context of rapidly advancing AI technologies.
The legal action seeks not only to halt OpenAI's use of the plaintiffs' content but also to make the company pay for any damages incurred, including the recovery of "illicit profits" gained from such practices. The outcome could have far‑reaching consequences, potentially setting new legal precedents for how AI technologies may interact with and use licensed or copyrighted content. OpenAI has not yet commented publicly on the lawsuit.

Background: Key Details of the Lawsuit

On March 13, 2026, Encyclopædia Britannica, Inc. and Merriam‑Webster, Inc. filed a high‑profile lawsuit against OpenAI in the US District Court for the Southern District of New York. The lawsuit alleges that OpenAI used nearly 100,000 articles and entries from the plaintiffs' online encyclopedia and dictionary resources to train its language model, ChatGPT, without obtaining permission. The action underscores concerns about the unauthorized use of copyrighted content and its implications for intellectual property rights in the era of artificial intelligence.
The core of the lawsuit is an allegation of copyright infringement under the Copyright Act of 1976. Britannica and Merriam‑Webster claim OpenAI scraped their websites to gather protected content for AI training. They assert that this data was used to produce outputs that closely mimic or summarize the original works, redirecting audience traffic away from the authorized providers and toward AI‑generated content. That shift, they argue, compounds the financial and reputational harm to Britannica and Merriam‑Webster, whose centuries‑old reputation for accuracy is now threatened by inaccuracies or omissions from AI "hallucinations."
Apart from copyright infringement, the lawsuit also accuses OpenAI of trademark violations. It alleges that ChatGPT's responses may mislead users into believing the plaintiffs' brands endorse or are partnered with the AI model's outputs. This misrepresentation could not only confuse users but also dilute the distinctive character of Britannica's and Merriam‑Webster's trademarks, infringing on their rights and diminishing brand trust. Notably, the lawsuit mirrors a prior case filed by the same plaintiffs against Perplexity AI in September 2025, underlining a consistent legal strategy to protect their intellectual property against unlicensed AI training.
Britannica and Merriam‑Webster are demanding that OpenAI immediately halt any use of their copyrighted materials, and they are seeking monetary compensation for losses incurred and unjust profits gained by OpenAI. These demands reflect a broader intent to safeguard public trust in reliable information sources and emphasize the need for ethical data usage in AI development. In their complaint, the plaintiffs stress that holding emerging AI technologies accountable is vital to preserving the integrity of content creators worldwide. Beyond the plaintiffs' immediate grievances, the case could set a precedent for future dealings between AI developers and content owners.
More broadly, this lawsuit could influence future dealings between AI companies and content creators, especially regarding the boundaries of "fair use" for AI training datasets. OpenAI had not responded publicly as of March 16, 2026, and industry stakeholders are closely watching how the confrontation unfolds. The case could have widespread implications for the AI industry, calling into question existing and future practices for sourcing training data.

Allegations of Copyright and Trademark Infringement

                    In a significant legal move, Encyclopædia Britannica, Inc. and Merriam‑Webster, Inc. have launched a high‑stakes lawsuit against OpenAI, marking another chapter in the tussle over AI and copyright infringement. The lawsuit, filed in the US District Court for the Southern District of New York, accuses OpenAI of unlawfully using content from nearly 100,000 of Britannica’s online articles, encyclopedic material, and Merriam‑Webster’s proprietary dictionary entries. These materials are alleged to have been scraped and utilized without permission to train OpenAI's ChatGPT, resulting in outputs that echo, sometimes verbatim, the original protected works. Such actions, according to the plaintiffs, have not only hurt their revenue but also their longstanding reputation built over centuries on the accuracy and reliability of their content.
The alleged harm is threefold. First, OpenAI is accused of breaching the Copyright Act of 1976 by harvesting content from the websites of Britannica and Merriam‑Webster for AI training purposes. Second, ChatGPT allegedly generates outputs that amount to reproductions or abridgments of the original works, reducing traffic to the source material. This, the lawsuit argues, devalues the plaintiffs' core intellectual property and jeopardizes a business model based on subscriptions and licensing. Third, their intellectual property has become entangled in AI "hallucinations," potentially tarnishing the brands with inaccuracies or omissions. These impacts extend beyond financial losses, compromising the trust audiences place in Britannica and Merriam‑Webster.
Brand integrity is also at issue, with Britannica and Merriam‑Webster asserting that ChatGPT's outputs falsely imply an endorsement by, or at the very least an association with, these trusted brands. The plaintiffs argue that this alleged trademark infringement deceives users, leaving them with the inaccurate impression that the AI‑generated information has the backing of these institutions. The case also mirrors the prior legal action they brought against Perplexity AI, showing a consistent stance by Britannica and Merriam‑Webster against the unauthorized use of their content by AI developers.
                          While OpenAI has chosen a more reserved stance, with no public comment as of mid‑March 2026, the implications of this lawsuit stretch well beyond the immediate allegations. As various industries grapple with the rapid advancements in AI capabilities, the outcomes of such cases could set crucial precedents. They might define the boundaries of 'fair use' when it comes to AI training data, potentially forcing entities like OpenAI to pursue licensing agreements or face similar litigations in the future. This lawsuit, in particular, seeks not only to stop further unapproved usage of their content but also demands financial redress for the losses incurred and for "illicit profits" gained by OpenAI from such use.
The ramifications extend to all content creators and AI companies. Should Britannica and Merriam‑Webster succeed, the ruling could pave the way for stronger protections for content creators, perhaps leading to an era in which licensing becomes standard practice for access to training datasets. As ongoing cases such as The New York Times v. OpenAI illustrate, the question of how AI technologies acquire and use vast amounts of data sits at a contentious intersection of legality and innovation, with courts poised to weigh technological advancement against respect for intellectual property rights.

Comparative Analysis: Similar Lawsuits in Industry

                              The lawsuit filed by Encyclopædia Britannica, Inc. and Merriam‑Webster, Inc. against OpenAI, alleging copyright and trademark infringement, reflects a broader trend in the tech industry regarding AI and content usage. This case mirrors prior legal actions, such as the September 2025 lawsuit by the same plaintiffs against Perplexity AI, illustrating a pattern where traditional publishers seek to protect their intellectual property from AI companies utilizing large datasets for training without explicit permission. Such cases highlight the ongoing tension between content creators and the evolving realm of AI technology.
                                The legal landscape surrounding AI and copyrighted content has been shaped by several high‑profile lawsuits, signaling potential shifts in industry practices. The Encyclopædia Britannica lawsuit is reminiscent of cases like The New York Times v. OpenAI, where similar claims were made about unauthorized data usage for AI training purposes. Additionally, the involvement of other industries, such as music with lawsuits from labels like Universal Music Group against AI‑generated music platforms, demonstrates a widespread push against the unlicensed use of copyrighted material. These lawsuits collectively underline a growing demand for legally sanctioned data usage practices to safeguard intellectual property rights.
                                  Comparative analysis suggests these lawsuits serve a dual role: they are not only reactive measures to perceived violations but also strategic efforts to influence future AI content licensing norms. As evident from past and ongoing cases, such as those involving Getty Images, the outcomes of these lawsuits could define the boundaries of fair use in AI training, potentially leading to mandatory licensing agreements and influencing how AI companies source, manage, and utilize content. The implications of these legal battles may extend beyond immediate financial settlements, potentially reshaping how AI models are developed and the ethical considerations surrounding their data inputs.

Public Reactions: Support, Criticism, and Neutral Views

The legal battle between Encyclopædia Britannica and OpenAI has sparked a wide array of public reactions, reflecting the broader divide over how artificial intelligence should fit within existing intellectual property law. Supporters of Britannica argue that the lawsuit is an essential step in protecting intellectual property and ensuring the survival of traditional sources of knowledge. They often stress that platforms like Britannica have been the bedrock of reliable information for centuries and that this legacy must be preserved amid modern technological advances. On social media platforms such as X (formerly Twitter), many users describe Britannica's legal action as a justified stand against what they label "digital robbery." One popular post encapsulated this view, praising the lawsuit for challenging what it described as AI systems exploiting established knowledge bases without giving due credit or compensation.
Conversely, critics see the lawsuit as an impediment to innovation, contending that it misguidedly challenges the transformative nature of AI. Many technology enthusiasts argue that restricting AI through traditional intellectual property frameworks would stifle potential breakthroughs in AI applications. In tech‑centric forums and discussions, such as those on TechCrunch and Hacker News, a prevailing belief is that the use of public web data for training by OpenAI and similar companies constitutes fair use and should not incur legal penalties. Proponents of this view point to prior landmark cases in which courts sided with innovation over rigid interpretations of copyright law.

Implications for AI and Publishing Industry

                                        The lawsuit filed by Encyclopædia Britannica against OpenAI marks a pivotal moment that could redefine the intersection of artificial intelligence and the publishing industry. If the courts side with Britannica, it may lead to a significant overhaul of what is currently considered 'fair use' in the context of AI training. As reported in PYMNTS, OpenAI's alleged unlicensed usage of nearly 100,000 Britannica articles raises pressing ethical and legal questions about data scraping and AI training. A court decision favoring Britannica could necessitate that AI companies negotiate licensing deals with content creators, which may dramatically increase operational costs, potentially by billions annually.
                                          A favorable ruling for Britannica may also encourage other content publishers to file similar lawsuits, leading to a surge in legal actions that advocate for the establishment of new licensing norms. This could create a paradigm shift where AI companies might be compelled to adopt synthetic data or enter into explicit partnerships to avoid legal entanglements. According to Times of India, such licensing deals may become an industry standard, reshaping how AI tools access and utilize data.
                                            The potential financial implications for AI companies could be profound, as they might be required to pay significant royalties, diverting funds from innovation to legal compliance. On the flip side, content creators, particularly smaller entities, risk being sidelined in negotiations dominated by larger players. However, this could also lead to the growth of a robust market for AI data licensing agreements, contributing to new revenue streams for publishers as highlighted in TechCrunch.
                                              For the publishing industry, this scenario presents both challenges and opportunities. Traditional models might suffer from reduced web traffic as AI‑generated summaries replace visits to original sources, but they could also benefit from a new collaborative model with AI companies if licensing and revenue‑sharing mechanisms are established. Additionally, successful lawsuits could enhance publishers' revenue through damages or settlements, contributing to a sustainable economic model that values intellectual property more robustly, as suggested by industry analyses in sources like ABC News.

Future Legal and Regulatory Landscape

                                                The ongoing lawsuit between Encyclopædia Britannica, Inc., Merriam‑Webster, Inc., and OpenAI presents a crucial turning point in the legal and regulatory framework governing artificial intelligence (AI). As of early 2026, we're witnessing an escalation in legal actions against AI companies, focusing on unauthorized use of copyrighted content for AI model training. The allegations from Britannica and Merriam‑Webster underscore a critical legal issue: whether the unlicensed use of textual data by AI systems like OpenAI's ChatGPT violates established copyright laws, specifically the Copyright Act of 1976.
                                                  The outcomes of such lawsuits are likely to influence the future legal landscape significantly. If the courts rule against OpenAI, it could set a precedent for requiring licensing agreements to protect intellectual property used in AI training. This could increase operational costs for AI companies as they may be forced to enter into expensive licensing deals to access high‑quality training data. Such a shift may slow innovation as it restricts access to vast datasets previously considered open for AI model development.
                                                    Regulatory bodies in the U.S. and the EU are likely to monitor these cases closely, possibly leading to new legislation that clarifies the boundaries of 'fair use' in the context of AI. Legal experts predict that this case could influence international data‑sourcing norms and enforcement, particularly in jurisdictions like the UK and India where similar tensions over AI data use are emerging. This legal environment underscores a growing demand for ethical AI development frameworks that balance innovation with creators' rights.
