How to write a biography in the digital age without getting lost in a sea of data

As online personal information grows exponentially, biographers may need artificial intelligence technology to sort through it all

A library in Antalya, Turkey; October 28, 2022. frantic (Alamy Stock Photo)

There is someone out there who knows our lives inside out. They keep memories that we have long forgotten. They are aware of our conversations, our whereabouts, our observations and the things that captured our attention every day for the last few years. This someone or something, in all likelihood, understands us better than we comprehend ourselves.

Every click, every screen scroll and every keystroke adds to our real-time digital autobiography. It includes hundreds of WhatsApp messages, dozens of photos, social media interactions and even location and search history data stored every day. It’s an unparalleled biographical trail that reshapes how human lives are documented and narrated.

The very first biographies, such as Plutarch’s Parallel Lives (probably written at the beginning of the second century AD) or the medieval hagiographies about the lives of the saints, were often based on long-ago testimonies and even mythological anecdotes. St. Augustine’s Confessions (398 AD) laid the foundation for the autobiography genre by recounting his life from childhood to his conversion to Christianity. In this book, St. Augustine engages in a conversation with God using the second-person point of view, which interestingly resembles the approach of Vida de Arcadio [The Life of Arcadio], a recent memoir by journalist and author Arcadi Espada. Rather than relying solely on memory, Espada brings forth his life story using meticulously archived documentary material, from train tickets and underlined news clippings to voice recordings.

“My book also represents a method,” said Espada. “I use the same approach in my journalism. When someone tells me what they remember about a specific day, I immediately cross-reference it with other available information. No doubt, many of our own memories are imperfect and inaccurate.” Espada still holds onto a vast amount of information about his life, including almost 200,000 emails. As time goes on, it will become increasingly difficult to review all this content, so he now relies on technology for help. “Now we have artificial intelligence for that,” he said. “It’s a topic that has always been on my mind — how can technology evolve writing?”

The sheer quantity of information we generate over a lifetime is incalculable, and the amount of personal data created grows every day. In his short story Funes el memorioso [Funes the Memorious], Jorge Luis Borges introduced Ireneo Funes, a character who, after falling off a horse, suddenly possessed an extraordinary memory capable of recalling even the minutest details, like every hair of the horse’s mane. One can’t help but wonder how long it would take for someone born in the digital age, or their biographer, to thoroughly examine and absorb the immense amount of information collected throughout 80 years of life.

“Contrary to what you might think, having more digital data doesn’t actually make it easier to write biographies,” said Viktor Mayer-Schönberger, a professor of internet governance and regulation at the University of Oxford, in the United Kingdom. “The most important thing for new biographers won’t be the amount of material they have, but rather how much they can leave out because it’s not that relevant to the bigger picture,” he said.

Artificial intelligence (AI) and data analysis technology are crucial for finding important topics or events in the vast amount of information left by individuals on digital media. Franco Moretti, a historian and literary critic, coined the term “distant reading” to describe a literary analysis methodology that uses quantitative and computational techniques to analyze extensive amounts of text. “Biographers usually look for information about the typical biographical milestones — early childhood, parents, school, career beginnings, accidents and such. The new abundance of data just adds some interesting color here and there,” said Moretti. He thinks AI has the potential to create realistic stories based on data and human expectations, but realistic is not necessarily true. “What scares me is the thought that artificial intelligence could flood us with ‘plausible’ biographies that aren’t necessarily based on the truth.”

To illustrate how AI could be practically applied, imagine the hypothetical scenario of a writer working on a biography of environmental activist Greta Thunberg, who is poised to become one of the most influential figures of our time. In this hypothetical situation, the biographer gains access to all of Thunberg’s WhatsApp messages in 2023. By utilizing natural language processing tools available online, the biographer can analyze things like word and emoji usage patterns, as well as the frequency and duration of conversations. This analysis would provide valuable insights and help the biographer determine how positive or negative words were used during specific periods of her life. It would also help identify the individuals who received more attention from Thunberg.
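
The kind of analysis described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any particular tool: the line format (`date time - sender: text`), the sample messages, the names Alex and Sam, and the tiny positive and negative word lists are all hypothetical, and emoji handling is left out for brevity. A real project would likely rely on a trained sentiment model rather than word lists.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical line format: "YYYY-MM-DD HH:MM - Sender: message text"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}) - ([^:]+): (.*)$")

# Toy sentiment lexicons (assumptions for illustration only).
POSITIVE = {"great", "happy", "love", "hope"}
NEGATIVE = {"sad", "angry", "tired", "worried"}


def parse_chat(lines):
    """Yield (timestamp, sender, text) for each line matching the assumed format."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp, sender, text = match.groups()
            yield datetime.strptime(stamp, "%Y-%m-%d %H:%M"), sender, text


def summarize(lines):
    """Count messages per sender, word frequencies, and a net tone score per month."""
    per_sender = Counter()
    words = Counter()
    tone_by_month = defaultdict(int)
    for when, sender, text in parse_chat(lines):
        per_sender[sender] += 1
        tokens = re.findall(r"[a-z']+", text.lower())
        words.update(tokens)
        # Net tone: positive word hits minus negative word hits.
        score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
        tone_by_month[when.strftime("%Y-%m")] += score
    return per_sender, words, dict(tone_by_month)


# Invented sample data, not real messages.
SAMPLE = [
    "2023-01-05 09:12 - Alex: I love this plan, great work",
    "2023-01-05 09:15 - Sam: Happy to help",
    "2023-02-11 18:40 - Alex: I am tired and worried about the deadline",
    "2023-02-11 18:42 - Alex: Still, I hope it works out",
]

per_sender, words, tone_by_month = summarize(SAMPLE)
print(per_sender.most_common())  # who sent the most messages
print(words.most_common(3))      # most frequent words
print(tone_by_month)             # net tone per month
```

With this invented sample, the monthly tone score turns from positive in January to negative in February, which is the sort of shift a biographer might then check against other sources.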

New technologies will help verify the authenticity of internet data. “Artificial intelligence will actually help us figure out things like who the actual author of a specific text is,” said Mayer-Schönberger. “This technology can be very helpful and might just help us answer the question of who actually wrote the works of Shakespeare.”
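
One long-standing way to approach authorship questions is to compare how often different writers use common function words. The sketch below is a deliberately simple illustration of that idea and not the method Mayer-Schönberger refers to: the ten-word list, the helper names (`profile`, `cosine`, `closest_author`) and the sample sentences are all assumptions made for the example, and serious studies use much longer texts and larger feature sets.

```python
import math
import re
from collections import Counter

# A small, hypothetical set of function words; real studies use far larger lists.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "but", "with", "not"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_author(disputed, candidates):
    """Return the candidate whose function-word profile best matches the disputed text."""
    target = profile(disputed)
    return max(candidates, key=lambda name: cosine(target, profile(candidates[name])))


# Invented writing samples for two hypothetical authors.
candidates = {
    "Author A": "The history of the city and the story of the river",
    "Author B": "But that is not to say that a man is not in a hurry",
}
disputed = "The end of the road and the start of the bridge"

print(closest_author(disputed, candidates))
```

A match on such short samples proves nothing on its own; the point is only that writing habits can be turned into numbers and compared.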

Ricard Martínez, director of the privacy and digital transformation program at the University of Valencia (Spain), says that writers must balance an individual’s privacy with the need to gather information for the biography. As with personal diaries and correspondence, writers must obtain permission to access specific information. “If the data is stored in private environments like emails or WhatsApp, a writer needs the owner’s consent. And if the person has passed away, consent is needed from the heirs or immediate family members,” said Martínez. The dividing line between thorough research and an invasion of privacy in a digitally based biography is extremely thin, says Martínez, and intrusion into private environments can be seen as a privacy violation and may even be against the law.

Another concern is the uncertainty surrounding how long the digital information we leave behind will last. “The majority of data losses are accidental rather than systematic. Biographers understand that they are piecing together a puzzle, with some missing parts. As long as these missing pieces are mostly random, the overall picture will not be significantly affected,” said Mayer-Schönberger.

If they rely solely on the information that people intentionally share on social media, where there is pressure to present an idealized life, biographers may encounter an inauthentic and unbalanced portrayal of the subject. “Attempting to portray an ideal life is not limited to the digital age. Throughout history, individuals have tried to present themselves in a favorable light. This extends to written correspondence, diaries, speeches and similar forms of communication,” said Mayer-Schönberger. “In fact, the abundance of data gathered today makes it even more challenging to maintain a flawless facade.”

Sign up for our weekly newsletter to get more English-language news coverage from EL PAÍS USA Edition
