Rumors of the web's memory are greatly exaggerated.
Jeffrey Rosen has an engaging piece in the Times about privacy and the web that touches on issues of forgiveness and reputation and how the Internet has basically screwed that up for all of us; the upshot being that because your Facebook profile never really goes away, your sins are plastered on the world's largest wall for all to see forever.
Here's the thing. They're probably not. Forever, that is.
Digital archivists would love it if Rosen's assumptions about online inertia could be taken at face value: anything that goes on the web stays on the web unless acted on deliberately by a person or a built-in self-destruct. But that's not the only way, or the primary way, things disappear from the web. Anyone who's looked for a link to a really cool article only to find that it just isn't there anymore knows that digital information is all too mortal. Incompetence and neglect are just as likely to kill a piece of web ephemera as a cease and desist order.
But a blogger who refuses to use permalinks, or Yahoo News letting its old URLs die, is only the beginning of what keeps digital archivists up at night. If you want to know what else does, scour your home office and see if you can find a floppy disk. Found one? Okay, tell me what's on it.
See, the web is good at a lot of things, but legacy is not one of them. Technology in the past three or so decades has been a story of improving standards and formats, but backwards compatibility? Not so much.
This is true for more than just physical formats. File formats are what will be the death of history on the web. Take the Flash vs. HTML5 debate. Let's say HTML5 wins. How many sites that render in Flash will convert all of their data to render in HTML5 and not just the most recent two years' worth? Or six months'?
As a web manager, I've overseen the overhaul of many a content management system, and there's always a compatibility issue which forces editors and technology teams to ask the same question. How much? How much will it cost (in time and money) to convert how much information? Do we really want to bother reformatting 400 news stories that were published in 2000 to a whole new format on the off chance that someone will search for them? The answer is almost always no. And that's just 10 years.
We assume that formats like .jpg (that picture of you doing a kegstand) or .mp3 (that ill-advised phone message you left at 3am) or — I won't even pick a video format since they change every week — will be here forever because they've been around as long as we can remember consumer-friendly digital information. But the odds that your Facebook page will still be here in ten years — or will be readable in ten years — while not terrible, probably aren't as good as you think.
People have been talking about this problem for a long time now. And they still are. The problem we face may not be that the web never forgets, but that it is better at forgetting than any technology we've ever used.