Nvidia’s ISP piracy defense backfires as judge refuses to dismiss copyright lawsuit over more than 197,000 pirated books — scripts in NeMo Framework allegedly ‘have no other purpose’ than to speed up infringement


U.S. District Judge Jon Tigar has denied Nvidia's request to dismiss a copyright infringement case filed against it. Nvidia had argued that it is not liable for how clients use its AI-powered NeMo Megatron Framework. According to TorrentFreak, Nvidia asked the court to dismiss the direct copyright infringement claims connected to its use of the Bibliotik eBook torrent tracker, the Books3 dataset, and 'The Pile' dataset for language modeling. Nvidia then cited the Cox vs. Sony ruling, where the U.S. Supreme Court ruled that a service provider is not liable for any piracy that its users might carry out.

Nvidia said that its NeMo Megatron Framework has significant “non-infringing uses” and that it did not promote it as a piracy tool. In Nvidia's view, this should fall under Justice Clarence Thomas’ opinion, which states, “Under our precedents, a company is not liable as a copyright infringer for merely providing a service to the general public with knowledge that it will be used by some to infringe copyrights.” Unfortunately for the company, Judge Tigar disagreed, saying that the issue is not the framework as a whole but specific scripts within it that are alleged to have violated copyright rules.

He said that these scripts were intended to make it easier for users to automatically download and preprocess The Pile dataset, which the complainants allege contains copyrighted work. “The scripts are alleged to have no other purpose than to speed up the process of infringement, unlike the digital video recorder systems at issue in Sony Corp. or the internet service provided in Cox,” Judge Tigar wrote. Bibliotik is a private eBook torrent tracker that allegedly hosts over 197,000 books. Its contents were included in the Books3 dataset, which in turn was included in the 800+ gigabyte The Pile dataset. The Pile was then used to train Nvidia’s LLMs, which led several authors to file a class action lawsuit against the company for copyright infringement.


There have been previous cases of copyright infringement related to AI companies scraping data for training their models. Aside from this case against Nvidia, Meta has also been facing a similar lawsuit since last year. It even defended itself by saying that using pirated material is legal if you don’t seed content. Google has even been pushing to have AI scraping tagged as fair use, saying that it wants “copyright systems that enable appropriate and fair use of copyrighted content to enable the training of AI models in Australia on a broad and diverse range of data while supporting workable opt-outs for entities that prefer their data not to be trained in using AI systems.”

With this decision, the authors’ class action against Nvidia is set to move forward, and we will likely hear more details as the case progresses. We don’t have a date yet for the next hearing, though. Still, we expect this to be a multi-year fight as the AI giant battles it out with the writers whose works were allegedly infringed.


Follow Tom's Hardware on Google News, or add us as a preferred source, to get our latest news, analysis, & reviews in your feeds.

Jowi Morales
Contributing Writer

Jowi Morales is a tech enthusiast with years of experience working in the industry. He’s been writing with several tech publications since 2021, where he’s been interested in tech hardware and consumer electronics.

  • GenericUser2001
    So dumb of Nvidia to do that sort of thing; should have done what Google did (when Google was creating its Google Books) and checked books out of libraries to scan them. Or just bought a copy of each book. 197k x say $20 a book is less than $4 million, peanuts for Nvidia.
  • DRagor
    Normal person copies a copyrighted data for personal use: It's a Piracy!
    Meanwhile big company copies thousands of those for commercial use: it's a fair use.

    More and more we are living in the alternate-reality world of Monty Python's 'tis but a scratch'.
  • das_stig
    Oh please, let the judge say they are pirates, would love to see the big boys get sued for billions for deliberate copyright infringement with intent of enrichment.
    Reply
  • bigdragon
    "Nvidia then cited the Cox vs. Sony ruling, where the U.S. Supreme Court ruled that a service provider is not liable for any piracy that its users might carry out."
    Is Nvidia using AI lawyers or something? What a bizarre defense! Cox vs Sony has the users as the infringing party. I interpret this situation as Books3 vs Authors where Nvidia could be seen as the infringing user. Nvidia gets their received stolen property taken away at best under this interpretation, and at worst they get held liable for each and every instance of infringing token. I know Nvidia is trying to say that they're a middleman like Cox, but I think the torrent providers already play that role. I hope Nvidia -- and the rest of the AI industry -- gets a huge reality check from copyright holders via this lawsuit.
  • Shiznizzle
    This is getting ridiculous now. If a private citizen wants to make money by illegally "scraping", as they call it, content on the net, they are thrown in jail. They do not get to ask for exemptions to the law based on the fact that they believe downloading content, that is copyright protected, is legal because they chose not to seed it.

    When Meta does this, it's OK though. Their case was thrown out as I recall. Meaning that some of their claims were upheld. I think the authorities in charge are appealing this.

    I am no lawyer but what sort of mental gymnastics do you have to perform to twist the illegal possession and downloading of copyright material into a legal activity? Simply because you did not share this material.

    Nvidia wants exemptions to the copyright law based on the fact that they are not responsible for their users' behavior on their networks?

    That's OK then. We will just look the other way when Nvidia then directs their own employees to do this on the ISP network then. All good now. ISP at fault.

    Of course Google wants the opt-out system as you are opted in by default. The largest and most dangerous data hoovering company in the world wants yet more data to feed their AI with.

    These companies are after profits mind you. Money. And to get there they are twisting minds so they can do illegal things the rest of us can't.
  • PEnns
    In reply to Shiznizzle:
    Are you really implying that there are different laws for the rich and powerful as compared to those for private citizens??

    Because if you are, you'd be 100% correct!!
  • Concerned Liberty
    On the one hand I think these AI companies should have to pay some kind of license to use copyrighted works. On the other hand I think copyrights should last at maximum 20 years and then those works should fall into the Public Domain.

    Trademarks should last for as long as a Brand is actively using it, but cannot be used to prevent copyrighted materials from falling into the Public Domain. (Looking at Disney with that one, in particular.)
  • MobileJAD
    I want to go back in time to when Nvidia was all about making really cool GPU's for gaming at reasonable prices and wasn't weirdly obsessed with AI...
  • Pierce2623
    In reply to Shiznizzle:
    Meta’s case was not thrown out. It's still ongoing.
  • Air2004
    In reply to DRagor:
    Wrong. It's only piracy if you profit from it. Otherwise it's maybe copyright infringement at best.