As an example: I was doing a search for the best sesame substitute today. Everything that came up was things like, “11 Best Sesame Substitutes,” and I know for a fact that just about everything they suggested tastes nothing like sesame. Just another site trying to get hits. So I added reddit.com into my search parameters and immediately got some decent answers.
I really hate that I have to do that to get anything useful, but there is a ridiculous amount of useful information on Reddit. I hope the fediverse gets to this point as well one day.
Anyway, just needed to vent. Lemmy on.
Hasn’t GPT eaten Reddit already? Genuinely asking.
Yes, every LLM ate Reddit, but LLMs aren’t reliable and tend to hallucinate.
On the other hand, one could train an AI (or ask a big enough one) to extract the useful info from each post and sort it into broad categories (lifestyle, science, mechanics, etc.) and subcategories (life tips, men’s clothing tips, chemistry, animal facts, car engine repair, bike engine repair). It could then do an internet search to check whether other sources exist, use that to judge the reliability of the info, and put it all in a database that the LLM looks up before answering. This condensed Reddit could likely fit in a few gigs. Maybe there’s a better way to do it, but this is the extent of my very limited knowledge.
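To make the idea a bit more concrete, here is a very rough Python sketch of the categorize, score, store, and look-up steps. Everything in it is made up for illustration: `CATEGORIES`, `categorize`, `reliability`, `build_db`, and `lookup` are hypothetical names, a plain keyword match stands in for the AI, and the "number of corroborating sources" is passed in by hand instead of coming from a real internet search.

```python
import sqlite3

# Hypothetical keyword map standing in for the AI classifier.
CATEGORIES = {
    ("science", "chemistry"): ["acid", "molecule", "reaction"],
    ("mechanics", "car engine repair"): ["spark plug", "carburetor", "engine"],
    ("lifestyle", "life tips"): ["tip", "habit"],
}


def categorize(text):
    """Return (category, subcategory) for the first keyword match."""
    lowered = text.lower()
    for (category, subcategory), keywords in CATEGORIES.items():
        if any(keyword in lowered for keyword in keywords):
            return category, subcategory
    return "uncategorized", "misc"


def reliability(corroborating_sources):
    """Toy score (assumption): 3 or more outside sources counts as fully reliable."""
    return min(corroborating_sources, 3) / 3


def build_db(posts):
    """posts is a list of (text, corroborating_sources) pairs."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE facts "
        "(category TEXT, subcategory TEXT, text TEXT, reliability REAL)"
    )
    for text, sources in posts:
        category, subcategory = categorize(text)
        db.execute(
            "INSERT INTO facts VALUES (?, ?, ?, ?)",
            (category, subcategory, text, reliability(sources)),
        )
    db.commit()
    return db


def lookup(db, subcategory, min_reliability=0.5):
    """What the LLM would query before answering: only well-corroborated facts."""
    rows = db.execute(
        "SELECT text FROM facts "
        "WHERE subcategory = ? AND reliability >= ? "
        "ORDER BY reliability DESC",
        (subcategory, min_reliability),
    )
    return [row[0] for row in rows]
```

Feeding it a few `(text, sources)` pairs and calling `lookup(db, "car engine repair")` would return only the posts that cleared the reliability threshold. The hard parts (the actual extraction and the cross-checking) are exactly the bits stubbed out here.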
Sounds like what Bing’s GPT4 Chatbot does
Bing does extract info from sites, including Reddit, and compares it to other sources, but I doubt they’re creating a database. Imagine all the wisdom and knowledge of hundreds of thousands of people, but offline and without the useless arguing and other bullshit.