Congress Wants Tech Companies to Pay Up for AI Training Data

stopthatgirl7@kbin.social · 9 months ago

General_Effort@lemmy.world · 9 months ago

These open datasets are used to fine-tune LLMs for specific tasks. But first, LLMs have to learn the basics by being trained on vast amounts of text. At present, there is no way to do that with open datasets alone.

If fair use is cut down, you can forget about it. Cutting it down would arguably be unconstitutional, though.

That’s not even considering the dystopian wishes to expand copyright even further. Some people demand that the model owner also own the output. Well, some of these open datasets are made with LLMs like ChatGPT, so under that rule they would belong to the model owner.