It’s well known that generative AI systems are trained on massive amounts of existing data. Without such data, AI wouldn’t be able to create anything, including text, photographic images, or even dialogues. The Internet offers tremendous volumes of training materials for AIs to learn from, but what if access to this data were suddenly stripped away? Generative AI legal problems are mounting one lawsuit at a time, and OpenAI and the New York Times are now clashing in the courtroom. How this case and others are resolved will play a significant role in the future of generative AI.
In recent months, OpenAI and others have been sued for alleged copyright and intellectual property violations. These lawsuits previously involved various authors, publishers, artists, and photographers. But now, major news media companies are throwing their hats into the ring against generative AI. As a result, questions about generative AI and copyright law are moving to the forefront. It remains to be seen whether high-level judicial rulings will be required or whether the disputes will be settled out of court. Either way, it’s clear that generative AI legal issues are mounting quickly.
“Copyright will be one of the key points that shapes the generative A.I. industry.” – Fred Havemeyer, Financial Analyst, Macquarie Research
The Basics of Copyright Law
To understand generative AI legal issues, it’s necessary to appreciate copyright and intellectual property protections. Such protections exist to ensure that creators across industries benefit from their efforts. That doesn’t mean that protected content cannot be used at all, but there are safeguards in place to prevent someone from usurping someone else’s materials. Under the fair use doctrine, new creators can build upon existing copyrighted materials without permission in some circumstances. Courts weigh several factors in deciding whether a use qualifies, including whether the new work substantially transforms the original and whether it harms the market for the original. A new work that merely competes with the original in the same field is less likely to qualify. This is where generative AI and copyright law enter a gray area.
As noted, generative AI utilizes an abundance of copyrighted materials during its training. But because these materials are so vast, it’s unclear whether any given piece of generated content can be traced to a single rights holder. Generated content is also likely to be significantly altered from its sources, which weighs in favor of fair use on the question of transformation. However, generative AI content often does compete in the same markets as the original works. Ultimately, these generative AI legal issues will likely boil down to whether access to copyrighted material for training is permissible. That question extends beyond basic intellectual property protections, which is why a legal resolution may be required to settle how copyright law applies to generative AI.
“There isn’t a clear answer to whether or not in the United States that is copyright infringement or whether it’s fair use.” – Ryan Abbott, Attorney, Brown Neri Smith & Khan
Major News Media in the Mix
One consideration under the fair use doctrine is whether the original content has been substantively changed. Interestingly, a recent lawsuit filed against OpenAI and Microsoft, maker of Bing Chat, alleges this didn’t occur. The New York Times claims that content produced by generative AI is nearly identical to its own published content. As a result, the Times claims that OpenAI and Microsoft are violating its intellectual property rights. The use of the Times’ substantive products and materials without permission or payment lies at the heart of the dispute. This is the first time a major news media organization has pursued generative AI copyright claims in court, and it marks an escalation in an already contentious area.
Some suggest that the New York Times may simply be pushing the envelope in an effort to gain leverage. Other media outlets have recently secured data licensing agreements. For example, the Associated Press as well as Politico and Business Insider have reached terms with OpenAI. By filing a lawsuit claiming generative AI copyright violations, the Times could be positioning itself for a better deal. If judicial rulings move in the media’s favor, then the price tag for licensing data from the Times will increase. Thus, the lawsuit could be serving as a bargaining chip rather than reflecting a desire to clarify copyright law.
“Ultimately, whether or not this lawsuit ends up shaping copyright law will be determined by whether the suit is really about the future of fair use and copyright, or whether it’s a salvo in a negotiation.” – Jane Ginsburg, Professor, Columbia Law School
Looking Ahead at Generative AI Legal Issues
Determining when generative AI legal issues might be resolved is difficult. In fact, some suggest it could be more than a decade before a formal high-level ruling occurs. Several years may pass while generative AI and copyright protections are debated in lower courts, and reaching the U.S. Supreme Court would take even longer. Given how rapidly generative AI is developing, such a ruling may come too late to be of much use. Nonetheless, these are important issues because data is critical to AI training. If access to massive amounts of data is limited, this could have a big impact on how the generative AI landscape unfolds.
With this in mind, legal rulings that find generative AI training violates copyright could hinder competition. Limited access could give those with existing large data pools a major advantage in developing AI tools. Meta and Google, for example, would certainly benefit in this regard. At the same time, those with existing rights to use large data pools would also enjoy an advantage. Adobe and Bloomberg are examples of companies in this category. In all likelihood, however, AI developers will pursue licensing agreements should push come to shove. Current generative AI legal issues are concerning and are driving an increasing number of lawsuits. But ultimately, OpenAI and others aren’t going to stop just because a court ruling creates data barriers. Workarounds are always available, and the billions if not trillions of dollars at stake are enough to encourage creative solutions.