Nanna Bonde Thylstrup, expert in digital memory: ‘We could lose part of our memory because a file format becomes obsolete’

The Danish researcher speaks with EL PAÍS about how to understand the fragility of digital memory within the apparent abundance of the internet age

Nanna Bonde Thylstrup — a professor at the University of Copenhagen — researches data loss in the digital age. In June 2023, she published an article in The New York Times titled “The World’s Digital Memory Is at Risk.” This year, she has received one of the most important grants from the European Union to study how, in the era of digital abundance, our societies’ past is in danger.

Going paperless has unforeseen implications regarding how discussions, private messages and business documents can be preserved. It’s a challenge of surprising complexity: while it may appear that society leaves an infinite digital trail, this really isn’t the case.

Professor Thylstrup’s interview with EL PAÍS took place in Barcelona, where the 42-year-old Copenhagen-born researcher was participating in a conference linked to the exhibition AI: Artificial Intelligence at the Center of Contemporary Culture of Barcelona (CCCB).

Question. There is more and more information, more data, than ever before. What should be preserved?

Answer. It’s a political decision that each country must make. More and more data of public interest isn’t available. The problem isn’t just whose information it is: it’s also about who has access. And I’m not just talking about government information: it can also be data owned by Amazon, and not just what it stores on its servers. It could be data that they produce themselves, or that people produce for them — such as reviews or descriptions — [that the company] owns. Another preservation issue is which organizations should be able to access and preserve this type of information for historical purposes.

Q. Why is it such a complex issue?

A. [We’re still trying to understand] the challenges and opportunities of digital societies, which are related to accumulating more and more data. On the one hand, this gives us benefits, such as advances in health; on the other, it brings challenges, such as surveillance or the extraction of data to sell us things. If we only focus on that accumulation, we run the risk of losing sight of the fact that [the digital world] is super volatile and fragile and needs constant conservation if we want it to remain accessible. File formats fall out of use and become obsolete; platforms shut down. We don’t even have a vocabulary to talk about these issues: what do we mean when we say that a platform “closes” and the data “disappears”? It depends, for example, on whether there’s a merger with another company: the data could still be there, but we cannot access it. The data could actually still be there and be used without us knowing.

Q. What’s missing to be able to discuss this topic further?

A. We haven’t debated enough, from a political point of view, about how to preserve our digital memory. Now, that doesn’t necessarily mean we have to keep everything. That’s not my position. But we need a qualified discussion about who makes the decisions about what to keep and what to let go of, and how those decisions are made. That’s why I highly value one idea in the [EU’s new] data protection regulations: that people should also have the right to be forgotten. [I don’t simply believe] that everything should be saved forever. We know that we don’t live in a time of information scarcity like before. At the same time, the information we have is incredibly volatile. We could lose part of our memory because a file format becomes obsolete.

Q. So what information should we keep?

A. That question is for those who are in charge of archives. Who’s to know what will be interesting [and] historically valuable 30 years from now? There are the great events, but then also the everyday things, which [are usually] the most interesting for historians. They help us understand day-to-day problems, [to get a sense of] how people lived in 1950, or 1830.

Q. For example?

A. Recently there was a controversy in Denmark. There’s a national app that’s used to facilitate the relationship between schools and parents. The [Ministry] of Culture has just prohibited the preservation of private messages from that app, which national archivists were [collecting]. It’s controversial. Historians have said [that this correspondence] will be useful in 100 years, when we have to understand how parenting changed with the introduction of digital technologies. [For instance], can we see gender patterns? We know that women [tend to be] in charge of [monitoring] those apps, even in an egalitarian society. We don’t see men there. Or [maybe we’d like to know] how the schools changed with Covid. That’s the challenge of archiving; that’s why archivists are experts in evaluating, they make the decisions about what goes in and what doesn’t. And that’s always a political decision, because they’re guardians of a cultural memory.

Q. In Spain, there’s a similar debate.

A. I’m not an expert in the Spanish system, but it seems that [Spain has] a similar approach to the Danish one. They keep certain websites and [domain extensions] and so on. Then, [the archivists] have a massive scan that runs through the network in a very general way, [to subsequently identify] key events. For example, if there’s a big football match or a terrorist attack, they intensify tracking. Then, they have [specific classifications] called “political” or “electoral,” where they do massive tracking, specifically in politics. There’s also one called “risk.” They have more specialized focuses and more thematic tracking.

Q. Is this tracking taking place not only on internet search engines, but also on Instagram, or on instant-messaging services?

A. Everywhere. For example, with Twitter, people reacted not only because of disagreements with Elon Musk’s strategy: there was also a great feeling of loss for the communities they had built there. An example is the so-called “Black Twitter,” which built an incredible archive and its own slang. The question is no longer just what happens to this cultural memory… [It’s also about the fact that] maybe you can’t access it. It’s basically a certain type of cultural memory that’s in the hands of a company, in this case Twitter.

We still have a feeling that these platforms will always be there. We don’t really think about mitigation strategies if they suddenly shut down, or decide to change, like Tumblr did with [the regulation of] pornographic content. They’re clearly private companies that have the right to run these communities however they want, because it’s within their purview. Then, we have organizations that set up something like counter-archives. When Twitter started shutting down or dismantling [accounts], there were communities that said, “you need to counter-archive certain cases.”

Q. The archives don’t have agreements with these companies.

A. The problem is that [these firms] can change their technical access modes, so it becomes very difficult to trace [content from their platforms]. It’s one of the greatest challenges for archival institutions. They don’t have any agreements with these companies that allow them to do that, for the sake of research or [preserving] stories.

With newspapers or books in Denmark, we have a law that says that every time you publish something, it also has to go to the National Library. In my country, a website counts as one publication. But in and of itself, it’s unstable, because websites are updated and [aren’t the same as traditional] publications. Also, if there are elections and everything that happens is taking place on X or Instagram, it should be preserved, because it’s also an important part of the nation’s cultural memory. How can we understand Brexit without looking at what happened on Twitter or Facebook?

Q. Is the main concern that we don’t know what to preserve in general, or that we’re already losing so many things that it’s difficult to know what to keep and what to discard?

A. Both. Institutions decide what to keep, but sometimes, the conditions are difficult. We know that we want to save anything relevant to historically understanding an election… but the conditions for doing so are complicated, because private companies protect the data. So, the conditions complicate things for the institutions. Then, there are the slightly technical — but I think fundamental — questions about how, if we want to preserve something, do we differentiate what’s happening [in one region online] from what’s happening globally? Those are tough questions, too. But we’ve had these challenges before. The fundamental risk now is that conditions are poor for archivists to work professionally, because of [limited] access. The political challenge is: how do we organize our societies so that private companies don’t have the power to block access to something that’s of public interest? Then, there are material challenges around all of this… It’s not like a piece of paper that will be there in a hundred years. That’s a material challenge [that’s] linked to economic challenges, because companies make money with updates.
