Data hoarders race to preserve data from rapidly disappearing U.S. federal websites

White House 404 page
(Image credit: The White House)

U.S. President Donald Trump has issued an executive order that has resulted in many government agencies taking down webpages and sites to comply. As a result, data hoarders across the internet are racing to preserve that content before it is taken offline. MuckRock reports that the End of Term Archive, a partnership that includes the Internet Archive, Stanford University, the Common Crawl Foundation, the University of North Texas, and Webrecorder, has already saved more than 500 terabytes from .gov domains.

It's reported that more than 8,000 government pages have been taken down, including the Department of Justice database detailing the criminal charges and convictions of January 6 rioters, LGBTQ+ rights and HIV-related information from the Centers for Disease Control and Prevention, and the Climate and Economic Justice Screening Tool released by the Council on Environmental Quality, among others.

In response, the r/DataHoarder subreddit is rallying its more than 832,000 members to help save data in danger of being taken offline and deleted. u/didyousayboop shared on the subreddit that the Archive Team, a group of volunteer digital archivists led by Jason Scott (the Free Range Archivist and Software Curator at the Internet Archive), is asking for help with its US Government project. This effort is focused on archiving all government content, especially data that is at risk of being removed because of the current administration’s efforts.

We’ve also seen several threads on r/DataHoarder asking for help backing up specific pages and websites. These include NOAA, USAID, the National Center for Education Statistics, the National HIV Curriculum, CDC Immunization Publications, and more. One user even asked for help downloading the videos on USAID’s YouTube channels, fearing they would be next after the USAID website went down.

Aside from requests to back up data and volunteers acting on them, we’re also seeing others volunteering to host the archived site data for free on their domains.

This is one of the biggest archiving efforts we’ve seen, with a huge community of storage geeks doing their best to download and preserve historical online data. If you want to join them and help save the information hosted on government servers, you can check out the instructions u/didyousayboop left on r/DataHoarder.

Jowi Morales
Contributing Writer

Jowi Morales is a tech enthusiast with years of experience working in the industry. He has written for several tech publications since 2021, focusing on tech hardware and consumer electronics.

  • Heiro78
    It's a bad day when what was free information is being stripped. I've probably never visited any of these sites but it sucks to see them going since I'm sure others utilize them regularly.
  • Gururu
    1) Delete history
    2) Rewrite history
  • bit_user
    One can also donate to archive.org
  • _dawn_chorus_
    "Under his eye"
  • dimar
    Reminds me when Intel took down all motherboard BIOS files from their site, I took my time to save all of it.
  • 3tank
    Seemed ok when orgs like the internet archive were deleting items on their own end that were inconvenient to the last party in charge
  • bit_user
    3tank said:
    Seemed ok when orgs like the internet archive were deleting items on their own end that were inconvenient to the last party in charge
    Deleting something for political reasons goes against the ethos of a true archivist. I don't believe it was Internet Archive that you're thinking of.
  • helper800
    "Book burnings. Always the forerunners. Heralds of the stake, the ovens, the mass graves."
    -Geraldine Brooks
  • Heiro78
    3tank said:
    Seemed ok when orgs like the internet archive were deleting items on their own end that were inconvenient to the last party in charge
    I only spent around 5 minutes looking, but this is the only related article I could find to the Internet Archive deleting data.

    https://blog.gingerbeardman.com/2024/08/01/psa-internet-archive-glitch-deletes-years-of-user-data-and-accounts/
    Can you clarify what you mean or point to a news article about it?
  • thestryker
    The only good thing is that everyone saw this coming from a ways away. The extra bad part is that it's far more widespread than the last time.

    This is one of those things where you'd like to see the preservation of data codified in law. Not to say that the websites need not change as administrations come and go, but simply that any public data remain accessible in some form.