The Copyright Tussle AI Companies are Sure to Win 

Is AI Copyright Really Necessary?

Last week, more than 8,000 published authors signed an open letter to the CEOs of the big tech companies behind generative AI platforms, calling for fair compensation for the use of their copyrighted works to train AI models.

This clash is yet another intellectual property issue plaguing AI development. In another notable case, a U.S. District Judge for the Northern District of California dismissed a majority of the claims in a lawsuit filed by a group of artists, asking for more evidence of the alleged copyright infringement from Stability AI’s relevant source code.

Here lies the problem

The problem of defining copyright did not arrive with generative AI. Historically, the advent of any new technology has brought a massive overhaul of copyright law.

For generative AI, Professor Pamela Samuelson, an esteemed scholar of Law and Information at UC Berkeley, raises three pivotal questions. First, does making copies of works as training data for generative AI systems infringe copyright? Second, when are AI-generated outputs infringing derivative works? Lastly, are the outputs of computer programs copyright subject matter, and if so, who, if anybody, owns that copyright?

These questions address fundamental legal and ethical complexities surrounding generative AI, impacting the rights of original creators and the expanding domain of AI-driven innovations.

AI vs. Artists

In January 2006, in Field v. Google, a court held that a search engine copying copyrighted internet content for the purpose of indexing it was fair use and not infringement. The later Authors Guild v. Google lawsuit was important because the court held that digitising millions of copyrighted books from research libraries and serving up snippets in response to search queries, along with other computational uses, was fair use.

Google wasn’t exploiting the expression of the work; it was just giving you a gist of the work when asked about it. Stability AI and other defendants use these cases to claim ‘fair use’, arguing that crawling the web to make copies of content for training purposes is fair use and not infringement.

“We figured these large tech firms were preparing larger sets of data they could use,” said John Degen, the Canadian organisation’s executive director. “We were right.” In practice, there is a distinction between making it easier for people to find copyright owners’ works and training an AI model on copyrighted material to generate new content. The artificially created images or text compete with the original works, and the authors of those works didn’t consent to that use.

According to the open letter, the authors think it is only fair that they be compensated, because the value of the ingested material is what makes generative AI turn out good output. Otherwise, generative AI systems would produce garbage. Their argument is a fairly strong one. From this perspective, carefully curated works of authorship are things that somebody should get paid for.

However, copyright law only cares about protecting the original expression in a work. That should be a factor, because the people ingesting the works as training data don’t think about them in terms of their expression. They think of them as raw material for computational uses, and text and data mining has been widely thought to be fair use in the past.

Generative AI systems enable the creation of new works, and the constitutional purpose of copyright is to promote the progress of science, by which the founders meant knowledge, and of the useful arts. Generative AI systems can certainly be said to advance that purpose. Fair use provides a little breathing room for new creation, and that may also bear on whether the ingestion is fair use or not.

The Japanese route

In April this year, Japan confirmed that its existing laws allow the use of data collected on the internet for both non-commercial and commercial purposes. No new policy was rolled out; rather, copyright holders are bound by the provisions of Japan’s existing text-and-data-mining exception from 2018.

When content is taken from illegal sites, compensation claims, injunctive relief and criminal penalties apply, but it is very hard to prove that language models have indeed scraped these sites without a confession from the companies themselves. Israel follows a similar principle with a few exceptions. Language models can’t be trained on specific datasets to compete with the original works. For example, you can’t train an AI model on the Game of Thrones series to derive works very similar to the original.

Germany has also sided with the technology, refusing to address the concerns of more than 140,000 authors and performers. The government sees no need for stricter regulations. Maximilian Funke-Kaiser, spokesman for digital policy, said, “Publishers and media companies also benefit from this technology, for example through AI-supported text generation.”

The European Union, on the other hand, requires companies deploying generative AI tools to disclose any copyrighted materials used to develop their systems. This is the bare minimum and is considered fair, but as mentioned earlier, proving what a model was trained on is arduous and close to impossible. According to Pamela Samuelson, however, this aligns with fair use under copyright law. Training AI on copyrighted material and using it to create a non-derivative, original character is in the clear.

While you cannot copyright the style of anime, a unique style of drawing and storytelling is entirely owned by its creator and cannot be exactly regenerated. So what exactly are the claims on which companies are sued for copyright infringement? And why is it so hard to get them to share details of their training data instead?

The post The Copyright Tussle AI Companies are Sure to Win appeared first on Analytics India Magazine.
