- cross-posted to:
- [email protected]
The New York Times sues OpenAI and Microsoft for copyright infringement::The New York Times has sued OpenAI and Microsoft for copyright infringement, alleging that the companies’ artificial intelligence technology illegally copied millions of Times articles to train ChatGPT and other services to provide people with information – technology that now competes with the Times.
Not the original commenter, but I think the difference you’re looking for is in the copying and distribution. The OC makes the false assumption that the data set holds full copies of every object fed into it, rather than sets of common characteristics.
For example, my own mind has a concept of “tree”. That concept is not a copy of every tree I’ve ever known; it is more like a list of common characteristics that define treeness, based on the information I’ve gathered about trees (my data set).
Piracy is piracy not because of how the work is consumed, but because of how it is stored and distributed: as full copies of the object. Datasets, in other words, are not copies, and thus copyright doesn’t apply.
Reading an article to get an idea of what articleness is counts as fair use. Reading an article in order to reproduce it verbatim does not. And as of now, I don’t believe LLMs are doing the latter.