505 Billion Pages and Counting

The Internet Archive's Wayback Machine is a wonderful resource that keeps some of the Internet's long-gone content available for people who want to access it later. It's because of the Wayback Machine that I've been able to (slowly) piece together the show notes for the long-deleted Enough Podcast¹, and it's been an important tool for looking back at content that people have deleted from their websites to mask past statements. There's just one little problem with the service, though: it doesn't archive everything.

The Internet Archive

As of this writing, the Wayback Machine has roughly 505 billion pages archived. This is absolutely amazing. For generations to come, people will thank the team responsible for keeping this service alive, as there is still no other good way to access websites after they've vanished from the public Internet. Given all the good that this service does, I'd like to encourage anybody who uses it to make a donation every now and again to keep it going. Like 10Centuries, Archive.org is a non-profit organization that doesn't try to turn other people's content into gold. However, unlike 10Centuries, the service does not specifically ask permission to maintain a copy of a site, nor does it make it easy to post an update of an article or site we do want preserved.

Which brings me to the crux of this blog post²: should 10Centuries have an opt-in feature that auto-submits content to Archive.org on the creator's behalf?

People are free to add pages to the Wayback Machine by manually submitting a URL, but this is another step that people need to take on their own, which can be a bit cumbersome for authors who future-date posts for later release. There is a way for a web service such as 10Centuries to auto-submit new, public posts by sending a simple GET request like http://web.archive.org/save/matigo.ca. So why not do it for everybody who wants it?
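For the curious, here is a rough sketch in Python of what such an opt-in submission might look like. The only part taken from above is the save URL pattern; the function names, the `opted_in` flag, the User-Agent string, and the timeout are all made up for illustration, and I'm assuming a 2xx response means the request was accepted. This is not how 10Centuries actually does (or will do) it.

```python
import urllib.request

# Prefix taken from the example URL in this post.
SAVE_PREFIX = "http://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the Wayback Machine save URL for a public page."""
    return SAVE_PREFIX + page_url.strip()


def submit_if_opted_in(page_url: str, opted_in: bool, timeout: float = 30.0) -> bool:
    """Send the GET request only when the author has opted in.

    `opted_in` is a hypothetical per-author setting. Returns True when the
    request comes back with an HTTP 2xx status (assumed to mean "accepted"),
    and False when the author has not opted in or the request fails.
    """
    if not opted_in:
        return False

    request = urllib.request.Request(
        build_save_url(page_url),
        headers={"User-Agent": "archive-submit-sketch"},  # hypothetical value
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError, HTTPError, and socket timeouts.
        return False
```

A publishing job could call `submit_if_opted_in("matigo.ca", opted_in=True)` once a future-dated post goes live, and simply try again later if it returns False.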

10Centuries aims to keep people's content online and in a safe, central location for a thousand years. Crazy as it may seem, this should be perfectly plausible given the direction of the Internet and how certain systems are evolving. That said, it doesn't hurt to have a backup, and Archive.org's mission is very much aligned with that of 10Centuries. Will people want the feature, though?

  1. I still have about 90 episodes to find show notes for. I have the entire audio archive, but it's these pesky show notes that seem to have completely vanished from the Internet. Maybe Dan Benjamin has a copy from his 70Decibels acquisition. I should ask …

  2. This post should probably have been written on 10Forward (blog.10centuries.org), the "official" blog for 10Centuries, but here is good enough.