CHAMPAIGN, Ill. — As Russian bomb attacks destroy cities in Ukraine, graduates of the University of Illinois Urbana-Champaign School of Information Sciences are among volunteers from around the world working to preserve Ukrainian cultural heritage online.
Quinn Dombrowski – a 2009 information sciences alumna and an academic technology specialist at Stanford University – is one of three founders of the Saving Ukrainian Cultural Heritage Online project, along with Anna Kijas of Tufts University and Sebastian Majstorovic of the Austrian Center for Digital Humanities and Cultural Heritage. The group connected on Twitter and launched the project on March 1. The project now has 1,300 volunteers from all over the world, including librarians, teachers, historians and others.
“There were a lot of people feeling like we were – doomscrolling the news and feeling terrible and hopeless and wishing we could do something besides donating. SUCHO provided an outlet for much of that anxiety,” Dombrowski said.
The project focuses on websites for libraries, archives, museums and some government sites. Volunteers use the Internet Archive Wayback Machine, as well as the open-source WebRecorder web archiving software, to scour websites and capture information. They’ve added hundreds of thousands of links to the Wayback Machine. Amazon Web Services donated data storage to the project.
Dombrowski described the work as a race against time. In the first few days of the project, Majstorovic captured the material from the website of the state archives of Kharkiv. Within hours of his finishing the capture, the website went down, shortly after the physical site sustained damage.
When a site goes down, Dombrowski said she and the other team members don’t know if the cause is a power outage, a cyberattack or the physical destruction of servers. The digital records preserved by the project may end up being the only records left for some institutions whose physical records are destroyed, she said.
“They may be records they can use to rebuild websites. Even if we don’t have scans or images but we have metadata – about the holdings of a museum, for example – it may be valuable, too. If there is looting and things end up on the black market, it can be evidence of war crimes,” Dombrowski said.
Anyone can suggest a website to be preserved, and the volunteers have used Wikipedia to find Ukrainian cultural websites. Some are even virtually searching the streets of cities under siege, using Google Maps to look for any building with a sign indicating it is a museum or library.
A massive working spreadsheet contains the names and URLs of the sites to be preserved and their status – submitted, in progress, finished or site down, for example. The project volunteers have archived more than 3,500 sites so far, Dombrowski said.
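As a rough illustration of the kind of record such a spreadsheet holds, the sketch below models one row per site with the four statuses named above. It is not the project's actual tooling: the class and function names, the sample site names and the example URLs are all hypothetical, chosen only to show how status tracking of this sort might look in Python.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """The four example statuses mentioned in the article."""
    SUBMITTED = "submitted"
    IN_PROGRESS = "in progress"
    FINISHED = "finished"
    SITE_DOWN = "site down"


@dataclass
class SiteRecord:
    """One spreadsheet row: a site's name, URL and current status."""
    name: str
    url: str
    status: Status = Status.SUBMITTED  # assumption: new suggestions start here


def count_by_status(records):
    """Return a Counter mapping each status to the number of sites in it."""
    return Counter(record.status for record in records)


def sites_with_status(records, status):
    """Return the URLs of all sites currently in the given status."""
    return [record.url for record in records if record.status is status]


# Hypothetical sample rows; these are placeholder names and URLs.
records = [
    SiteRecord("Example regional archive", "https://archive.example/", Status.FINISHED),
    SiteRecord("Example city museum", "https://museum.example/", Status.SITE_DOWN),
    SiteRecord("Example library catalog", "https://library.example/", Status.IN_PROGRESS),
    SiteRecord("Example gallery", "https://gallery.example/"),
]

summary = count_by_status(records)
offline = sites_with_status(records, Status.SITE_DOWN)
```

Keeping the status as a fixed set of values, rather than free text, is one way a shared tracking sheet can stay consistent when many volunteers edit it at once.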
Information sciences professor Zoe LeBlanc teaches web scraping and uses it in her research. She’s been writing custom code to capture material that is difficult to gather with an automated tool like Browsertrix – for example, library catalogs, sites built with different technologies, or sites that contain many PDFs and other embedded files.
Browsertrix can follow the links within a website and capture all the pages that are linked. It has been crucial to the project’s work, LeBlanc said.
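The general idea of following links within a site can be sketched as a breadth-first walk. The example below is only an illustration of that idea, not Browsertrix's implementation: the `crawl` function, the `get_links` callback and the in-memory `FAKE_SITE` dictionary are hypothetical stand-ins, and a real crawler would fetch and render pages over the network rather than read them from a dictionary.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(start_url, get_links):
    """Visit every same-host page reachable from start_url, breadth first.

    get_links is a caller-supplied function that returns the hrefs found
    on a page; here it stands in for fetching and parsing real HTML.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        page = queue.popleft()
        visited.append(page)
        for href in get_links(page):
            # Resolve relative links and drop "#fragment" parts so the
            # same page is not queued twice under different anchors.
            link = urldefrag(urljoin(page, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical four-page site with one external link and one loop.
FAKE_SITE = {
    "https://museum.example/": ["/about", "/collection", "https://other.example/"],
    "https://museum.example/about": ["/"],
    "https://museum.example/collection": ["/collection/item1", "/about#staff"],
    "https://museum.example/collection/item1": [],
}

pages = crawl("https://museum.example/", lambda url: FAKE_SITE.get(url, []))
```

In this sketch the set of already-seen URLs keeps the walk from looping when pages link back to one another, and the host check keeps it from wandering off to other websites.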
“Without it, we would need 10 times the number of volunteers and people would have to stop doing their full-time jobs,” LeBlanc said. “Some of the tools were developed in the last year or two. They are very recent in terms of what is even possible for archiving websites and saving material from them.”
The work also has required custom programming to get around security measures for some websites, Dombrowski said.
“Ironically, web archiving from a server perspective looks not dissimilar to a cyberattack. Our automated software sometimes gets blocked,” she said.
There are features on some websites that automated systems can’t capture and that require human interaction, such as virtual tours. Volunteers use the WebRecorder browser plug-in to preserve those pages as they manually click through the sites.
In addition to searching for new sites and archiving the material, volunteers ensure that all of the material on the websites has been captured; add metadata; monitor the sites that have gone offline and prioritize capturing their material if they come back online; and monitor the fighting in Ukraine to prioritize work in locations under attack.
“It’s an enormous undertaking. As much as we’re using technology, it’s a human-powered initiative,” said LeBlanc, who also has helped create a template for metadata, develop privacy and security policies and set up processes for prioritizing work and deciding which material to capture and how to store it.
Dena Strong, a senior information design specialist for Technology Services and a 2014 information sciences alumna, began working on the project a few days after it launched, when volunteers were flooding in from time zones around the world. She’s now the project’s community engagement coordinator.
Volunteers use more than 16 different Slack channels to talk about various aspects of the work and how to solve specific problems. Strong monitors the conversations and works as a traffic controller for information, periodically updating dozens of documents in Google Drive and GitHub to share processes and solutions, as well as training volunteers and managing and documenting workflow.
“I’ve never seen anything move this fast. We were inventing the tools as we went,” Strong said. “If you had asked me in the middle of February about taking the entire cultural heritage of Ukraine and trying to rescue it in three weeks with an all-volunteer team, I would have said you need to give us three years.”
Strong said her experience with the COVID-19 SHIELD program gave her the confidence that this project could quickly collect and store vast amounts of information.
Dombrowski said that while she is inspired by the volunteers and what they’ve been able to accomplish, the need for an emergency effort to document Ukraine’s cultural heritage reflects a failure of infrastructure and of planning.
“Cultural institutions need to do a more proactive job of archiving, to make sure there are archives of at least the most significant items,” she said.
A backup copy of digital material on another server in the same city isn’t helpful if the city is in a war zone. The crisis has led to conversations among groups including UNESCO on how to archive cultural material in the future, she said.