Spotify says it has identified account behind mass scraping claimed by pirate group

December 23, 2025

Spotify says it has identified the user account behind what it describes as “unlawful” scraping of its platform, responding to claims by Anna’s Archive that it has archived nearly all of the service’s music catalogue. The company confirmed that public metadata was scraped and that digital rights management (DRM) protections were bypassed to access some audio files, but stopped short of confirming the scale alleged by the group.

In an updated statement shared with Android Authority, Spotify said it had pinpointed the account used in the activity and reiterated that the scraping involved “illicit tactics” to circumvent DRM. The company did not say how much content was accessed or how long the activity lasted, nor did it say whether it is considering legal action.

Anna’s Archive, which is known for backing up books and academic research, claimed in a blog post published late last month that it scraped metadata for 256 million tracks and audio files for 86 million songs from Spotify. The group says the archive represents about 99.6% of all listens on the platform and weighs just under 300 terabytes, making it “the largest publicly available music metadata database” in the world.

Spotify previously acknowledged that a third party accessed “some” audio files, but did not confirm whether the scraping reached anything close to the scale described by Anna’s Archive. At this point, it remains unclear how much of Spotify’s catalogue was affected, or whether the scraped data is still actively being distributed.

Anna’s Archive frames the project as a long-term preservation effort, arguing that large streaming platforms are a single point of failure for modern music history. While popular songs are widely backed up, the group says less-listened-to tracks could disappear if licensing agreements lapse or services shut down.

According to the group, most audio files in the archive originate directly from Spotify. Frequently played tracks are stored in their original 160 kbps format, while less popular songs have been re-encoded into smaller files to reduce storage requirements. The archive reportedly excludes music released after July 2025, and only metadata is currently fully accessible, with audio files being released gradually via torrents.
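As a rough sanity check on the group's figures, the sketch below works out the average file size implied by the claimed totals and how much playtime that would represent at 160 kbps. It rests on assumptions not stated in the article: that "300 terabytes" means decimal terabytes, that nearly all of that space is audio rather than metadata, and that the bitrate is constant. The function names are illustrative only.

```python
# Back-of-the-envelope check; every input is a figure claimed by the group.
ARCHIVE_BYTES = 300e12      # "just under 300 terabytes" (assumed decimal TB)
AUDIO_TRACKS = 86_000_000   # claimed number of audio files
BITRATE_BPS = 160_000       # 160 kbps (assumed constant bitrate)


def average_track_bytes(total_bytes: float, track_count: int) -> float:
    """Average size per file if the whole archive were audio."""
    return total_bytes / track_count


def seconds_at_bitrate(size_bytes: float, bitrate_bps: int) -> float:
    """Playtime a file of this size holds at the given bitrate."""
    return size_bytes * 8 / bitrate_bps


avg_bytes = average_track_bytes(ARCHIVE_BYTES, AUDIO_TRACKS)
avg_seconds = seconds_at_bitrate(avg_bytes, BITRATE_BPS)

print(f"average file size: {avg_bytes / 1e6:.2f} MB")
print(f"playtime at 160 kbps: {avg_seconds / 60:.1f} minutes")
```

Under these assumptions the average works out to roughly 3.5 MB per file, or about three minutes of audio at 160 kbps. Because less popular tracks are reportedly re-encoded into smaller files, this is only an average across the archive, not a per-track estimate.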

From a legal standpoint, the project sits on shaky ground. Spotify licenses music under strict agreements with record labels and rights holders, and mass extraction and redistribution of audio files violates both its terms of service and copyright law in many jurisdictions. Copyright law generally does not provide exemptions for preservation efforts carried out without permission.

Spotify has not said whether it will pursue takedown requests or legal action against the archive. It also remains an open question whether the material can be effectively removed from circulation once it has been distributed through torrents.

Mary Dada

Mary Dada is the associate editor for Tech Newsday, where she covers the latest innovations and happenings in the tech industry’s evolving landscape. Mary focuses on tech content writing from analyses of emerging digital trends to exploring the business side of innovation.

Jim Love

Jim is an author and podcast host with over 40 years in technology.
