The invited guests, Andreas Fickers (C2DH), Josée Kirps (National Archives of Luxembourg), Jean-Marie Ottelé (industrie.lu) and Joëlle Weis (University of Luxembourg), discussed the future of the archives, as well as the challenges and opportunities of digitization. The conference was moderated by Pia Oppel from radio 100,7. A podcast (in Luxembourgish) with a commentary and some excerpts of the discussion can be found here.
Digitization is a process that cannot be stopped, let alone reversed. Historians are increasingly confronted with digital sources. Though digitization offers the advantage of making large amounts of written and audiovisual sources available to the public, it needs to take place in the right framework, with the right tools and the necessary skills to use them. Numerous web pages and digital archives already exist, such as the well-known crowdsourcing platform Europeana, or the project eLuxemburgensia of the Bibliothèque nationale de Luxembourg (BNL). However, digitization is also slowed down by numerous problems and conflicting laws, such as the one on data protection. Moreover, documents without the necessary metadata are of no use to historians, and a digitized source loses its ‘materiality’ when it changes from analogue to digital, though this is different for born-digital documents. In both cases, however, the question remains of how to preserve and archive them and what legal framework is needed.
These questions, and many more, were at the heart of the debate that took place in the Casino Luxembourg. It is not my aim here to summarize everything that was said (which would also risk not doing justice to every contribution), but I would like to offer some thoughts based on the questions and issues that were raised.
A new paradigm
Digitization changes the way historians work. The user can comfortably stay at home and search for sources online. Yet even for this very elementary step, skills and a critical spirit are of the essence: How does one use the search engine? How are the results filtered, and what algorithm lies behind the ‘relevance’ ranking? Which people or institutions are running the digital archive? A new way of doing source critique is necessary: Are the metadata complete? Where did the sources originate? And does the concept of the ‘original’ still have a meaning? Basically, digital sources are composed of ones and zeroes (the binary system). What would Walter Benjamin say in the era of digitally reproducible sources (or works of art, as in his essay)? He might not change his opinion and would say that a document still loses its ‘aura’ when it is turned into binary code. But what about born-digital sources? Is there an original when we can copy them as often as we want without changing their appearance? This might be more difficult for Benjamin to answer. But then again, he is (unfortunately) no longer around to reflect on it.
Digitization also changes the way archives and archivists work, an issue that was raised by Karin Priem in her blog post commenting on the first ForumZ. Indeed, archival institutions are confronted with new challenges and missions. Public institutions, at least in Luxembourg, face problems adapting to the digital turn, not necessarily due to insufficient funding, but because there are not enough archivists, and even fewer who are trained specifically in digital archives and the digitization of sources. Yet addressing these issues is extremely important, as historians depend heavily on the good work of archivists.
The ‘third degree selection’
A further problem arises when historians become too comfortable and believe that what digital archives offer is all we need for our work. However overwhelmed we might be by the amount of digital sources we are confronted with, it is important to keep in mind that they are only a very small percentage of what really exists ‘out there’. Even if digital archives make sources accessible to a large public and can enhance the experience with interactive tools, historians still need to go to the archives (or the libraries), especially if they want to get an impression of the ‘materiality’ for external source critique. Furthermore, digital archives are the result of what I would call a ‘second degree selection’, or even a ‘third degree selection’. What archival institutions receive might be only a fraction of what existed, because a public administration does not keep all the sources that it produces; some are destroyed. A private donor might also refrain from handing all their personal documents over to the archives, though the law on data protection should allay these fears. After this ‘first degree’, the archive might then choose what is worthy of being kept for future generations (second degree). In Luxembourg, this procedure is not necessarily based on transparent criteria, not least because the existing legal framework proves to be insufficient (there is, in fact, no specific law on archiving). Finally, when an institution launches a digitization project for its holdings, it chooses only what it considers to be the most important sources, because of limited personnel and budget (third degree). This model certainly simplifies a usually more complex process, but the point I would like to make is that what we see in digital archives is really not much, and that it is the result of deliberate choices (though documents can also be lost unintentionally).
Furthermore, the legal framework plays an important role, as it can make these choices far more transparent. In Luxembourg, the planned law on archiving, presented in 2015, aims to address this question (the digital aspect is completely missing, though).
Preserving archives and promoting accessibility: two sides of the same coin?
Another question that arises when creating digital archives touches upon the dual issue of access and preservation. In the best-case scenario, both goals can be easily reconciled. In reality, it is not as easy as one might think. A document can be digitized without necessarily being accessible to the general public, because of data protection or copyright law. When, for instance, an institution chooses to digitize sources and focuses on accessibility, it will select its sources according to this logic. It might also focus on pure preservation, but then a digital archive will not be very useful to whoever needs to access it for their research. However, accessibility is not limited to making archives available to everyone; it extends to questions of interface and search engine. OCR (Optical Character Recognition) has the advantage of making the content of written documents searchable (which is the case for the newspapers on eLuxemburgensia), but the margin of error is extremely high: words are misread by the software, or the script cannot be recognized, especially in the case of handwritten texts. It is even possible that the search results are incomplete, precisely because the keyword was not recognized in all the documents in which it appears.
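To make the last point concrete, here is a minimal sketch in Python of how a single misread character can hide a page from an exact keyword search, and how approximate matching might recover it. The three sample pages are invented for illustration, the function names and the 0.8 threshold are arbitrary assumptions, and nothing here describes how eLuxemburgensia or any other platform actually searches its holdings.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Split a text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def exact_hits(keyword, documents):
    """Return the names of documents containing the keyword verbatim."""
    kw = keyword.lower()
    return [name for name, text in documents.items() if kw in tokens(text)]


def fuzzy_hits(keyword, documents, threshold=0.8):
    """Return the names of documents containing a word similar to the keyword.

    The threshold is an arbitrary assumption: lower values recover more
    misread words but also produce more false matches.
    """
    kw = keyword.lower()
    hits = []
    for name, text in documents.items():
        if any(SequenceMatcher(None, kw, t).ratio() >= threshold for t in tokens(text)):
            hits.append(name)
    return hits


# Invented sample pages; in page_2 the OCR has read "rn" as "m".
documents = {
    "page_1": "The government passed the law.",
    "page_2": "The govemment rejected the law.",
    "page_3": "The market was crowded.",
}

print(exact_hits("government", documents))  # misses page_2
print(fuzzy_hits("government", documents))  # also finds page_2
```

Approximate matching is no cure: it widens the net at the price of false positives, so the historian still has to check each hit against the scanned image.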
Nor does accessibility necessarily favor a systematic screening of sources. When I was looking through parliamentary debates for possible discussions about certain topics and laws, I went to the library and consulted the printed records of the Chambre des Députés, which enabled me to do a thorough search, even though most records are available online on the official site (which is a good example of a badly structured page). I preferred to trust my eyes rather than the code and mostly consulted the digital archives when I was looking for something more specific.
As for preservation, we still need to find a way of saving sources over the long term. There are several aspects to take into consideration. First, a technical-material aspect: hard disks can fail, an issue that can be partly solved by mirroring the data and creating backups. Yet ‘analogue’ media such as microfiches are much more durable. Second, a quantitative aspect: digitizing sources creates huge amounts of data, which also vary depending on the quality of the scans. Finally, an aspect of compatibility: formats change and software evolves. If digitized sources are to remain readable in the distant future, they have to be converted to newer formats, or else the efforts invested in making them available to a large public will have been in vain. There are certainly other elements that I have not mentioned (for instance the ecological one, in terms of energy consumption), but these three are, in my opinion, the most important in the present context.
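The quantitative aspect can be illustrated with some simple arithmetic. The following Python sketch estimates the uncompressed size of a single scanned page from its dimensions, resolution and colour depth. The function name and the example values (an A4 page, 300 and 600 dpi, 24-bit colour) are assumptions chosen for illustration, not figures from any actual digitization project, and real files are usually smaller because they are stored compressed.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Estimate the uncompressed size in bytes of one scanned page.

    width_in, height_in: page size in inches
    dpi: scan resolution in dots (pixels) per inch
    bits_per_pixel: colour depth (24 = full colour, 8 = greyscale)
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: an A4 page (about 8.27 x 11.69 inches) in full colour.
a4_300 = uncompressed_scan_bytes(8.27, 11.69, dpi=300)
a4_600 = uncompressed_scan_bytes(8.27, 11.69, dpi=600)

print(f"300 dpi: {a4_300 / 1e6:.1f} MB")
print(f"600 dpi: {a4_600 / 1e6:.1f} MB")
```

Doubling the resolution quadruples the number of pixels, so the choice of scan quality weighs heavily on the storage needed for a collection of several hundred thousand pages.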
The power of the crowd
With the right framework, crowd-sourcing might be a great way of creating digital archives, as is the case with Europeana 1914-1918 for the First World War or, in the Luxembourgish context, the privately run portal industrie.lu on industrial heritage, with a forum where users can exchange information. Indeed, many sources are digitized without ever being made publicly accessible, as happens when researchers take pictures of the documents they consult for later analysis. Crowd-sourcing even makes it possible to digitize sources that might not have been handed over to the archives in the traditional way (such as personal collections). However, there is a risk that those who participate in the crowd-sourcing do not possess the right skills or have not been made aware of the question of metadata. Thus, if crowd-sourcing takes place, it must do so within a specific framework, with historians and specialists lending a helping hand and checking the validity of the data as well as the quality of the digitization.
Another possibility is to create digital archives by bringing together the collections of different institutions, in a collective effort, through transnational collaboration. This is a different kind of crowd-sourcing, limited to existing bodies that are specialized in their domain and have the necessary skills, but that might shy away from doing it alone. An example is EUscreen, funded by the European Commission, which offers a collection of audiovisual material from public broadcasters all over Europe. The clips from EUscreen are also provided to Europeana, thus enriching the latter’s collection, in which pictures and texts are still quite dominant.
A long road ahead
My aim was to present thoughts and reflections on some of the questions related to digitization that were raised at the conference. Unsurprisingly, there are still many more important questions that I have not discussed. One issue, which came up at several points, is the skills training of future generations. Children and students need to learn at school about digital source critique and how to use digital archives. Not everyone knows what metadata is, or which pages can be consulted, and many do not give much thought to how search engines such as Google filter their results. Indeed, the role of schools and universities is precisely to prepare students for the future and to shed some light on the black boxes of the digital realm.