Reddit caught Perplexity stealing content
I told earlier that Reddit filed a lawsuit against AI search engine Perplexity. Reddit accuses Perplexity of “industrial” content scraping. But now there are facts and Reddit showed how they caught the defendant in a trap.
To support the accusations, Reddit conducted an experiment. The company created a test post that was accessible only to Google’s crawler and was not visible to regular users and external data collectors. According to Reddit, the content of this post appeared in Perplexity answers within just a few hours. For Reddit this is direct proof that Perplexity through third-party parsers of Google search results obtains data closed to it. In the lawsuit text this technique is compared to marked bills that investigators use.
Reddit claims they sent Perplexity a letter demanding to stop access to content back in May 2024. But since then the number of Perplexity links to Reddit materials “increased 40 times”. According to Reddit’s version, Perplexity and contractors deliberately bypassed technical barriers to pull texts from the platform.
Perplexity publicly rejects the accusations. The company says it doesn’t train its base models directly on Reddit data. But only “summarizes publicly available discussions”.