• DrCake@lemmy.world
    English
    arrow-up
    325
    arrow-down
    5
    ·
    14 days ago

    So when’s the ruling against OpenAI and the like for using the same copyrighted material to train their models?

    • irotsoma@lemmy.world
      English
      arrow-up
      125
      arrow-down
      3
      ·
      14 days ago

      But OpenAI not being allowed to use the content for free means they are being prevented from making a profit, whereas the Internet Archive is giving away the stuff for free and taking away the right of the authors to profit. /s

      Disclaimer: this is the argument that OpenAI is using currently, not my opinion.

    • norimee@lemmy.world
      English
      arrow-up
      80
      arrow-down
      3
      ·
      edit-2
      13 days ago

      Ah, I see you got that all wrong.

      OpenAI uses that content to generate billions in profit on the backs of The People. The Internet Archive just does it for the good of The People.

      We can’t have that. “Good for The People” is not how the economy works, pal. We need profit and exploitation for the world to work…

      • shrugs@lemmy.world
        English
        arrow-up
        19
        arrow-down
        1
        ·
        14 days ago

        So, let’s say we create an LLM that is fed with all the copyrighted data, and we design it so that it recalls the originals when asked?! Does that count as piracy, or as the kind of legal shenanigans OpenAI is doing?

    • PriorityMotif@lemmy.world
      English
      arrow-up
      20
      arrow-down
      14
      ·
      14 days ago

      It’s two different things happening. One is redistribution, which isn’t allowed, and the other is fair use, which is allowed. You can’t ban someone from writing a detailed synopsis of your book. That’s all an LLM is doing. It’s no different from a human reading the material and then using that to write something similar.

      • xthexder@l.sw0.com
        English
        arrow-up
        17
        arrow-down
        1
        ·
        edit-2
        14 days ago

        “the other is fair use”

        That’s very much up for debate still.

        (I am personally still undecided)

        • PriorityMotif@lemmy.world
          English
          arrow-up
          9
          arrow-down
          5
          ·
          14 days ago

          The difference is that the LLM has the ability to consume and remember all available information, whereas a human would have difficulty remembering everything in detail. We still see humans unintentionally remaking things they’ve heard before. Comedians have unintentionally stolen jokes they’ve heard. Every songwriter has unintentionally “discovered” a catchy tune which is actually someone else’s. We have fanfiction and parody. Most people’s personalities are just an amalgamation of everyone and everything they’ve ever seen, not unlike an LLM.

          • xthexder@l.sw0.com
            English
            arrow-up
            5
            ·
            14 days ago

            I agree with you for the most part, but when the “person” in charge of the LLM is a big corporation, it just exaggerates many of the issues we have with current copyright law. All the current lawsuits going around signal to me that society as a whole is not so happy with how it’s being used, regardless of how it fits into current law.

            AI is causing humanity to have to answer a lot of questions most people have been ignoring since the dawn of philosophy. Personally I find it rather concerning how blurry some lines are getting, and I’ve already had to reevaluate how I think about certain things, like what moral responsibilities we’ll have when AIs truly start to become sentient. Is turning them off and deleting them a form of murder? Maybe…

            • trafficnab@lemmy.ca
              English
              arrow-up
              5
              arrow-down
              2
              ·
              14 days ago

              OpenAI losing their case is how we ensure that the only people who can legally be in charge of an LLM are massive corporations with enough money to license sufficient source material for training, so I’m forced to begrudgingly take their side here

            • greenskye@lemm.ee
              English
              arrow-up
              2
              ·
              14 days ago

              Agreed. I keep waffling on my feelings about it. It definitely doesn’t feel like our laws properly handle the scale at which LLMs can take advantage of ‘fair use’. It also feels like yet another way to centralize and consolidate wealth, this time not money, but rather art and literary wealth in the hands of a few.

              I already see artists that used to get commissions now replaced by endless AI pictures generated via a LoRA specifically aping their style. If it was a human copying you, they’d still be limited by the amount they could produce. But an AI can spit out millions of images all in the style you perfected. Which feels wrong.

      • Gsus4@mander.xyz
        English
        arrow-up
        5
        arrow-down
        2
        ·
        edit-2
        13 days ago

        The matter is not LLMs reproducing what they have learned; it is that they didn’t pay for the books they read, like people are supposed to do legally.

        This is not about free use, this is about free access, which at the scale of an individual reading books is marketed as “piracy”…at the scale of reading all books known to man…it’s omnipiracy?

        We need some kind of deal where commercial LLMs have to pay rent into a fund that distributes it among creators, or else remain nonprofit, which is never gonna happen, because it’ll be a bummer for all the grifters rushing into that industry.

        • PriorityMotif@lemmy.world
          English
          arrow-up
          2
          arrow-down
          1
          ·
          14 days ago

          I think we need to re-examine what copyright should be. There’s nothing inherently immoral about “piracy” when the original creator gets almost nothing for their work after the initial release.

        • barsoap@lemm.ee
          English
          arrow-up
          2
          arrow-down
          1
          ·
          14 days ago

          “it is that they didn’t pay for the books they read, like people are supposed to do legally.”

          If I can read a book from a library, why shouldn’t OpenAI or anybody else?

          …but yes, from what I’ve heard, they (or whoever, don’t remember) actually trained on LibGen. OpenAI can be scummy without the general process of feeding AI books you only have read access to being scummy.