
If your argument is that all of these things somehow combine to make the specific case I mentioned in my original comment legal (which was "stealing the work of every single artist, living and dead, for the sole purpose of making a profit", and I'll add replacing artists in the process to that), then I'm not seeing it.

You also seem to be talking about AI training in general rather than the specific case I singled out, which matters because this isn't a case of simply training a model on content obtained with consent - the material OpenAI and Stable Diffusion gathered was very explicitly obtained without consent, and may even have been acquired through outright piracy! (This came out in a case against Meta fairly recently, but the exact origins of other companies' datasets remain a mystery.)

Now I explained in another comment why I think current copyright law should already be enough to clearly rule this specific case copyright infringement, but I'm not arrogant enough to think I know better than copyright attorneys. If they say it falls under fair use, I'm going to trust them. I'm also going to say that the law needs to be updated because of it, and that brings us full circle to why I disagree with the article in the first place.



My argument is better framed as this:

> "stealing the work of every single artist, living and dead, for the sole purpose of making a profit"

is begging the question. The word "stealing" inherently presumes a crime has been committed when that is the very thing up for debate. The note that this is "for the sole purpose of making a profit" asks the reader to infer that one cannot make a profit and also engage in fair use, and yet that is clearly not true. And that sets aside entirely the fact that you're not referencing a "specific case" here, but generalizing to all forms of AI training.

We can start by examining the "stealing" phrasing, noting that up until fairly recently it was an axiom in a lot of "hacker" places that "copying is not theft". It feels somewhat contradictory to me that, now that it's our work being copied, a lot of hackers are suddenly very keen on calling it theft.

We can note that a lot of our careers rest on the idea that copying is not prima facie a crime or theft. Certainly IBM wanted it to be so, and yet their BIOS was cloned and many of our careers were kickstarted (or at least kicked into high gear) by the consequences of that. Or consider the fact that, despite what Apple would really like to be true, you can't stop people from looking at your really cool idea and making something similar. They couldn't win that battle against Microsoft, and they couldn't win it against Google or Samsung either.

We can talk about the fact that there is a lot of disagreement over whether the dead should have any IP rights at all. Notably, many "hacker" spaces were again quite vocal about how the various "lifetime+" IP laws were actually detrimental to creativity and society. Until recently it was the Walt Disney Company, in its pursuit of profit, that was one of the largest advocates for extending copyright long after the creator was dead and gone.

We could also talk (again) about how many forms of fair use are pursued for "the sole purpose of making a profit". See the previous references to the IBM BIOS and the copying of Apple's GUIs. But there is also parody music like Weird Al's, or Campbell v. Acuff-Rose Music. We can look to the previously mentioned Connectix Virtual Game Station and other emulators, or to the Oracle v. Google lawsuit. We can look at how TiVo's entire business model was to enable millions of people to create copies of copyrighted material every day, automatically. Or Kelly v. Arriba Soft (which is basically why you can do an image search today) and Suntrust Bank v. Houghton Mifflin (also relevant for the discussion re "the dead").

And again, most relevant to AI would be Authors Guild v. Google, covering the ingestion of thousands of copyrighted works in the pursuit of profit to create the Google Books application. An application which, I again note, intentionally produces partial but 100% faithful copies of the original works for the user, something AI systems actively try to avoid.

Which is why I started by asking you to walk the tech stack. If you think the law needs to be updated, or you disagree with the current rulings on this matter, I can certainly understand that. So where do you propose drawing the line? What are the specific things that you feel distinguish training an AI from the various other fair use cases that have come before? Why is Google Books scanning in thousands of copyrighted works with the express purpose of both making those materials searchable and displaying copies of them ok, but an AI scanning in books with the express purpose of creating new material not ok? Why is Android, Windows, GNOME, KDE, and other GUIs so very clearly copying the style (if not the works) of Apple ok, but Stable Diffusion producing new artworks that copy styles not?



