Maybe I’m missing something, but I’m confused how they can promise “high speed access” to the data while also claiming:
We do not host any copyrighted materials here. We are a search engine, and as such only index metadata that is already publicly available. When downloading from these external sources, we would suggest to check the laws in your jurisdiction with respect to what is allowed. We are not responsible for content hosted by others.
Do they have the data or do they not have it?
They also claim to be able to do things like extract text and deduplicate the data… That seems to suggest a significant amount of storage and compute power for a non-profit that has only been around for ~3 years.
I find this entire thing fishy as fuck. Call me a conspiracy theorist, but I’m not convinced that this data theft operation doesn’t exist simply to act as an illicit data broker for AI companies. And now there is direct evidence tying both Anthropic and NVIDIA to them.