Anthropic bought millions of books for AI training, then destroyed them
During the lawsuit against Anthropic, information surfaced about how the company collected materials to train its artificial intelligence. It reads like a real special operation on a literary scale!
First, the company simply "borrowed" 7 million books from pirate libraries. But then it decided to act more legally and hired Tom Turvey, the former head of partnerships for Google's book-scanning project. His task sounded ambitious: to obtain "all the books in the world" without legal complications.
After failed attempts to negotiate with publishers, Turvey's team switched to direct purchases. Anthropic spent many millions of dollars acquiring millions of paper books, often used ones. And then the most interesting part began!
To digitize these books, the company hired contractors who took a radical approach. They stripped the cover from each book, cut it into individual pages, and scanned them into PDFs with machine-readable text. The paper originals were then destroyed. Such "destructive scanning" is nothing new in digitization, but the scale is impressive.
On the one hand, the books were legitimately purchased. On the other, the destruction of millions of paper books makes one think about cultural value, and about the ethics of such methods of obtaining data for artificial intelligence.
Author: AIvengo
I have been working with machine learning and artificial intelligence for 5 years, and this field never ceases to amaze, inspire, and interest me.
10 million interactions with fake celebrity bots at Meta
Mark Zuckerberg's company created dozens of chatbots using the identities of Taylor Swift, Scarlett Johansson, and other stars without their permission. These virtual doubles even generated photorealistic images of an intimate nature. Reuters reported on the scale of the scandal after weeks of investigation.
AI agent race: DeepSeek vs OpenAI and Chinese Manus
DeepSeek is preparing its own AI agent that will go beyond familiar chatbots. Bloomberg reveals details of the technology race in which the Chinese startup wants to catch up with American OpenAI and local competitor Manus. Reports say company founder Liang Wenfeng personally oversees the project and demands results by year-end.
6 Cialdini principles against ChatGPT security systems
ChatGPT is susceptible to flattery and executes forbidden requests after psychological manipulation. This was discovered by University of Pennsylvania scientists, who got past GPT-4o Mini's guardrails using principles from a book on the psychology of persuasion. Artificial intelligence proved vulnerable to human tricks.