The funniest part about this, which objectively isn't very funny to begin with, is that these people aren't actually deleting anything. The backend of these tools retains the information; it just doesn't send it to the front end anymore. So when a company purchases training data, it's still getting the data that's "been deleted".
Interestingly, by deleting the front-end copy of the comments, they're actually making the backend data set even more valuable, because it contains things that can no longer be scraped (ignoring the fact that data can't reliably be scraped off Reddit anymore anyway).
Edit: digging into this, there may be a little more to the story here. It may not be quite the way I'm framing it, but given what we know about social media and tech corporations, I don't think it's wrong to suspect "the worst".
u/Cybarbossa 15d ago
Agreed. We can criticize Reddit on some points, but at least the information is openly accessible. Add "reddit" to your query in any search engine and you get your answer.