Dear Commons Community,
A number of lawsuits in a New York federal court will test the future of ChatGPT and other artificial intelligence products, and their ability to access huge troves of copyrighted human works.
The question will come down to whether AI chatbots, such as those developed by OpenAI and its business partner Microsoft, are breaking copyright and fair competition laws, as reported by The Associated Press and the Financial Times.
“I would like to be optimistic on behalf of the authors, but I’m not. I just think they have an uphill battle here,” said copyright attorney Ashima Aggarwal, who used to work for academic publishing giant John Wiley & Sons.
One lawsuit comes from The New York Times. Another from a group of well-known novelists such as John Grisham, Jodi Picoult and George R.R. Martin. A third from bestselling nonfiction writers, including an author of the Pulitzer Prize-winning biography on which the hit movie “Oppenheimer” was based.
THE LAWSUITS
Each of the lawsuits makes different allegations, but they all center on the San Francisco-based company OpenAI “building this product on the back of other peoples’ intellectual property,” said attorney Justin Nelson, who is representing the nonfiction writers and whose law firm is also representing The Times.
“What OpenAI is saying is that they have a free ride to take anybody else’s intellectual property really since the dawn of time, as long as it’s been on the internet,” Nelson said.
The Times sued in December, arguing that ChatGPT and Microsoft’s Copilot are competing with the same outlets they are trained on and diverting web traffic away from the newspaper and other copyright holders who depend on advertising revenue generated from their sites to keep producing their journalism. It also provided evidence of the chatbots spitting out Times articles word-for-word. At other times the chatbots falsely attributed misinformation to the paper in a way it said damaged its reputation.
One senior federal judge is so far presiding over all three cases, as well as a fourth from two more nonfiction authors who filed another lawsuit last week. U.S. District Judge Sidney H. Stein has been at the Manhattan-based court since 1995, when he was nominated by then-President Bill Clinton.
THE RESPONSE
OpenAI and Microsoft haven’t yet filed formal counter-arguments on the New York cases, but OpenAI made a public statement this week describing The Times lawsuit as “without merit” and saying that the chatbot’s ability to regurgitate some articles verbatim was a “rare bug.”
The examples put forward by The Times are from old articles that have been published on a number of third-party sites, according to OpenAI. “It seems [the Times] intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate.” “Our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts,” OpenAI wrote.
“The blog concedes that OpenAI used The Times’s work, along with the work of many others, to build ChatGPT,” said Ian Crosby, a partner at Susman Godfrey, which is representing The New York Times. OpenAI had sought “to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” according to the newspaper’s lawsuit. “That’s not fair use by any measure,” Crosby added.
OpenAI and other AI companies have argued that processing reams of publicly available data from the internet constitutes protected “fair use” under US copyright law.
WHO’S GOING TO WIN?
Much of the AI industry’s argument rests on the “fair use” doctrine of U.S. copyright law that allows for limited uses of copyrighted materials such as for teaching, research or transforming the copyrighted work into something different.
In response, the legal team representing The Times wrote Tuesday that what OpenAI and Microsoft are doing is “not fair use by any measure” because they’re taking from the newspaper’s investment in its journalism “to build substitutive products without permission or payment.”
So far, courts have largely sided with tech companies in interpreting how copyright laws should treat AI systems. In a defeat for visual artists, a federal judge in San Francisco last year dismissed much of the first big lawsuit against AI image-generators. Another California judge shot down comedian Sarah Silverman’s arguments that Facebook parent Meta infringed on the text of her memoir to build its AI model.
The most recent lawsuits have brought more detailed evidence of alleged harms, but Aggarwal said when it comes to using copyrighted content to train AI systems that deliver a “small portion of that to users, the courts just don’t seem inclined to find that to be copyright infringement.”
Tech companies cite as precedent Google’s success in beating back legal challenges to its online book library. The U.S. Supreme Court in 2016 let stand lower court rulings that rejected authors’ claim that Google’s digitizing of millions of books and showing snippets of them to the public amounted to copyright infringement.
But judges interpret fair use arguments on a case-by-case basis and it is “actually very fact-dependent,” depending on economic impact and other factors, said Cathy Wolfe, an executive at the Dutch firm Wolters Kluwer who also sits on the board of the Copyright Clearance Center, which helps negotiate print and digital media licenses in the U.S.
“Just because something is free on the internet, on a website, doesn’t mean you can copy it and email it, let alone use it to conduct commercial business,” Wolfe said. “Who’s going to win, I don’t know, but I’m certainly a proponent for protecting copyright for all of us. It drives innovation.”
BEYOND THE COURTS
Some media outlets and other content creators are looking beyond the courts and calling for lawmakers or the U.S. Copyright Office to strengthen copyright protections for the AI era. A panel of the U.S. Senate Judiciary Committee will hear testimony Wednesday from media executives and advocates in a hearing dedicated to AI’s effect on journalism.
Roger Lynch, chief executive of the Conde Nast magazine chain, plans to tell senators that generative AI companies “are using our stolen intellectual property to build tools of replacement.”
“We believe that a legislative fix can be simple — clarifying that the use of copyrighted content in conjunction with commercial Gen AI is not fair use and requires a license,” says a copy of Lynch’s prepared remarks.
These lawsuits will be followed closely in the AI and publishing worlds. They will likely lead to lengthy deliberations.
Tony