OpenAI and Microsoft being sued by The New York Times over Copilot and ChatGPT copyright infringement

A nighttime shot of The New York Times HQ. (Image credit: Shutterstock)

Joining many other ongoing lawsuits over generative AI, The New York Times is now taking Microsoft and OpenAI to court over their use of NYT content in Copilot and ChatGPT [h/t The Verge].

Specifically, The New York Times is seeking "billions of dollars in statutory and actual damages" from both AI providers for using "millions" of NYT articles by either directly quoting them or closely imitating them.

As The New York Times says in its lawsuit, "Through Microsoft's Bing Chat (recently rebranded as 'Copilot') and OpenAI's ChatGPT, Defendants seek to free-ride on The Times' massive investment in its journalism by using it to build substitutive products without permission or payment."

These are some pretty scathing claims and, unfortunately for Microsoft and OpenAI, seemingly credible ones, since the Times' lawsuit includes examples of OpenAI copying Times content. One way or another, the overwhelming majority of AI "training data" is other people's work, whether written, photographed, illustrated, or otherwise.

Taking this training data and using it for profit, without regard for publisher permission or author consent, could certainly satisfy the definition of copyright infringement, considering the nature of generative AI. With the evidence that the NYT has produced in this case, one can only hope that it gets its money's worth and that the resulting precedent helps cut down on the rampant copyright infringement in the generative AI space.

While AI exploiting writers, artists, and other workers is a real and currently very under-litigated issue, there are still ways to fight it outside of the courtroom.

For example, some artists have started using a tool called "Nightshade" that poisons AI models trained on work scraped without their permission.

As truly fascinating as advancements in artificial intelligence and PC hardware are, it's important to remember that generative AI can do nothing without being trained on the work of real human beings. In the case of The New York Times, though, this training data was not given with permission, and if the Times secures its desired victory in this lawsuit, OpenAI and Microsoft will be paying billions of dollars for that mistake.

Christopher Harper has been a successful freelance tech writer specializing in PC hardware and gaming since 2015, and ghostwrote for various B2B clients in high school before that. Outside of work, Christopher is best known to friends and rivals as an active competitive player in various eSports (particularly fighting games and arena shooters) and a purveyor of music ranging from Jimi Hendrix to Killer Mike to the Sonic Adventure 2 soundtrack.

3 Comments from the forums

vehekos

Overcoming Nightshade is trivial.
Just retrain the network with all images "Nightshaded"
Avro Arrow

This makes me laugh:
"Specifically, The New York Times is seeking "billions of dollars in statutory and actual damages" from both AI providers for using "millions" of NYT articles by either directly quoting them or closely imitating them."
Since all news outlets use similar wording to report the same events, there's no way of proving that anything came from the New York Times. If "closely imitating" articles was lawsuit-worthy, we'd have media outlets suing each other left and right.

There's not a chance in hell that NYT is going to win here.
Joseph_138

I'm wondering how ChatGPT managed to copy NYT content, when they keep everything behind a paywall. It's not like they can rip content without being logged into a paid account. You only get to see like 3 sentences of an article before the login screen pops up and blocks everything from being seen.