News publishers are blocking the Internet Archive’s Wayback Machine
Airfind news item
By Ana-Maria Stanciuc
Published on May 1, 2026.
The Internet Archive's Wayback Machine is being blocked by news publishers, including The New York Times, CNN, USA Today, The Guardian, and at least 241 news organisations across nine countries, over concerns that AI companies are using archived news content to train models without permission or payment. According to an analysis by AI-detection startup Originality AI, 23 major news publications are blocking ia_archiverbot, the main web crawler the Internet Archive uses for the Wayback Machine. The Guardian moved to limit the Archive's access after its logs revealed that the Archive was a frequent crawler. The Internet Archive has taken steps to prevent bulk downloading of certain sites' material and maintains controls to limit large-scale automated extraction. However, these measures do not fully resolve publishers' concerns that third parties can still access their content through the Archive.