Some Modest Criticisms of the Internet Archive

I'm hesitant to even write this post. The Internet Archive does society an immense service by archiving digital content that might otherwise be lost to history. It does this as a non-profit, run by a small group of passionate people and volunteers. How could you criticize that?

Nevertheless, there are issues with viewing a lot of the content in the Internet Archive. We should be able to talk about imperfections even if the people involved have good intent. The "what", the "why", and the "how" of a problem are separate things, and we should be able to discuss each on its own.

So even though the Internet Archive is wonderful, and the reasons these issues exist are understandable and probably even based on the right trade-off, I still want to talk about them.

OK, with that long disclaimer out of the way, let's dig into some of the problems I run into when viewing content on the Internet Archive.

This is not about the Wayback Machine

I just want to make that clear here. The criticisms in this article do not apply to the Wayback Machine, which is how you view historical snapshots of websites. This part of the Internet Archive surely has problems too. I'm not sure to what extent these issues are something they can control, however. Archiving websites is a very hard problem. My expectations for it are different from those I have for the other types of content they host. There will be no further mention of the Wayback Machine in this article for that reason.

Lack of caching in games

The Archive has added a large collection of games in recent years. One example is DOS-based games, which run in an embedded, in-browser build of DOSBox, a DOS emulator. This makes the Internet Archive the easiest way to play a lot of these old games. Of course you can download the files locally, install DOSBox (if it works on your computer) and play them that way. But doing it directly in the browser *could* be a far easier experience.

One example is the classic interactive fiction game Night Trap:

Night Trap on the Internet Archive

This is an interactive fiction, full-motion video game. The videos naturally make the game larger than most old games, at over a gigabyte, so it takes ~10 minutes to download. However, the page does not cache the game content, so you have to re-download it every time you load the page. This makes it impractical to actually play the game through the IA.

This should be fixable by sending the appropriate cache headers on the responses that serve these assets. Or, if that's not possible for some reason, a service worker could be used in the frontend to cache them.
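To make the first option concrete, here is a rough sketch of the kind of rule a server could apply when choosing a `Cache-Control` header. I don't know how the Archive's servers are set up, so the function name `cacheControlFor` and the list of file extensions are my own inventions for illustration; the only real pieces are the `Cache-Control` directives themselves. It also assumes that a game file at a given URL never changes, which is what would make a long `max-age` safe.

```typescript
// Hypothetical list of extensions for large emulator assets.
// This is an assumption for illustration, not the Archive's actual file layout.
const LARGE_ASSET_EXTENSIONS: readonly string[] = [".zip", ".iso", ".img", ".wasm"];

const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Picks a Cache-Control value for a request path (without query string).
// Large, unchanging game assets get a long-lived cache entry;
// everything else must be revalidated with the server before reuse.
function cacheControlFor(path: string): string {
  const lower = path.toLowerCase();
  const isLargeAsset = LARGE_ASSET_EXTENSIONS.some((ext) => lower.endsWith(ext));
  return isLargeAsset
    ? `public, max-age=${ONE_YEAR_SECONDS}, immutable`
    : "no-cache";
}
```

A service worker could apply the same kind of rule on the client, storing matching responses with the Cache API instead of relying on the server's headers.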

Lack of saves in games

Even if you're playing smaller games where the download time isn't a blocker, the lack of support for saving your progress still makes it impractical to actually play them. I guess this is not something that the built-in DOSBox plugin gives them for free, or perhaps it hasn't been configured to save to localStorage or IndexedDB. This is something else that seems like a fairly small thing to fix.
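Here is a sketch of what the storage side might look like. I'm assuming that the emulator can hand the page the bytes of a save file and accept them back on the next load, which I have not verified for the Archive's DOSBox build. The names `writeSave`, `readSave`, and `MemoryStore` are hypothetical. The `KeyValueStore` interface matches the `getItem` and `setItem` methods that `localStorage` provides, so the real thing could be passed in its place.

```typescript
// The subset of the localStorage interface this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// One storage key per game and save file, so different games never collide.
function saveKey(gameId: string, fileName: string): string {
  return `save:${gameId}:${fileName}`;
}

// Stores the raw bytes of a save file as a JSON array of numbers.
function writeSave(store: KeyValueStore, gameId: string, fileName: string, bytes: Uint8Array): void {
  store.setItem(saveKey(gameId, fileName), JSON.stringify(Array.from(bytes)));
}

// Returns the saved bytes, or null when no save exists for this game and file.
function readSave(store: KeyValueStore, gameId: string, fileName: string): Uint8Array | null {
  const raw = store.getItem(saveKey(gameId, fileName));
  return raw === null ? null : Uint8Array.from(JSON.parse(raw) as number[]);
}

// In-memory stand-in for localStorage, used only to try the functions out.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}
```

localStorage only holds strings and has a small per-origin quota, so IndexedDB would probably be the better fit for anything but small save files.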

Lack of saves in video content

The Internet Archive has a lot of video content, like old movies and TV shows. Users of streaming services have come to expect that their place will be saved, but on the Internet Archive you need to watch all the way through or remember where you left off. This is another item that should be simple to implement.
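The logic involved is small. Below is a sketch of the one decision that needs any thought: where to start playback given a previously saved position. The function name `resumeTime` and the 10-second end margin are my own choices, not anything the Archive uses.

```typescript
// Assumed margin: if the viewer stopped this close to the end,
// treat the video as finished and start over next time.
const END_MARGIN_SECONDS = 10;

// Decides where playback should start, in seconds.
// savedSeconds is the last stored position, or null if nothing was stored.
function resumeTime(savedSeconds: number | null, durationSeconds: number): number {
  // Nothing saved, or the stored value is unusable: start from the beginning.
  if (savedSeconds === null || !Number.isFinite(savedSeconds) || savedSeconds < 0) {
    return 0;
  }
  // Also covers a saved position beyond the end of the video.
  const nearEnd = durationSeconds - savedSeconds < END_MARGIN_SECONDS;
  return nearEnd ? 0 : savedSeconds;
}
```

In a page, the saved position would come from something like localStorage, be updated on the video element's `timeupdate` event, and be applied by setting `currentTime` once the metadata has loaded.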

Lack of gamepad support in games

This is another issue that must come down to DOSBox (and their other browser emulators) not providing support by default. There is a Gamepad API that is available in most modern browsers. Many of these games are a bit harder to play with a keyboard and mouse.
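One way to bridge the gap without touching the emulator would be to translate gamepad buttons into the keyboard events the emulator already understands. The sketch below covers only the translation step. The button-to-key table is a guess at a sensible default (the indices follow the Gamepad API's "standard" mapping, where 12 to 15 are the d-pad and 0 is the bottom face button), and `keyChanges` is a hypothetical name. Whether the Archive's emulators would accept synthetic key events is something I have not tested.

```typescript
// The parts of the Gamepad API's Gamepad object this sketch reads.
interface PadButton { pressed: boolean }
interface PadSnapshot { buttons: readonly PadButton[] }

// Assumed default mapping from "standard" gamepad button indices to keys.
const BUTTON_MAP: ReadonlyArray<{ button: number; key: string }> = [
  { button: 0, key: "Enter" },
  { button: 12, key: "ArrowUp" },
  { button: 13, key: "ArrowDown" },
  { button: 14, key: "ArrowLeft" },
  { button: 15, key: "ArrowRight" },
];

interface KeyChange { type: "keydown" | "keyup"; key: string }
type KeyState = Record<string, boolean>;

// Compares the pad against the previous key state and reports which keys
// changed, plus the new state to pass in on the next poll.
function keyChanges(previous: KeyState, pad: PadSnapshot): { changes: KeyChange[]; next: KeyState } {
  const changes: KeyChange[] = [];
  const next: KeyState = {};
  for (const { button, key } of BUTTON_MAP) {
    const entry = pad.buttons[button];
    // A pad with fewer buttons than expected counts as "not pressed".
    const isDown = entry !== undefined && entry.pressed;
    const wasDown = previous[key] === true;
    if (isDown !== wasDown) {
      changes.push({ type: isDown ? "keydown" : "keyup", key });
    }
    next[key] = isDown;
  }
  return { changes, next };
}
```

In the browser, this would be called once per animation frame with an entry from `navigator.getGamepads()`, dispatching a keyboard event for each reported change.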

Deprioritization of the frontend

You have probably noticed that most of my issues are, in some way, related to the frontend. The Archive seems to prioritize archiving content over making content accessible. And as I said at the start, that's probably the right trade-off. Or even if it's not, it's the trade-off that they think is right. After all, once content is gone, it's gone. You can always improve accessibility and viewability in the future. You can never recover lost data.

Nevertheless, I think a small investment in the frontend and fixing these small issues would make the IA a much better place to not just host blobs of content, but to act as a hub for experiencing historic games and video content.