How to leverage the vast potential of historical newspapers

09-2017 to 08-2020

impresso – Media monitoring of the past

Historical newspapers are an essential source for scientific research, and new digitisation techniques can facilitate access to this material. But in practice, their use is often restricted by imperfect text recognition software, missing metadata and complicated search functions. These are the challenges being addressed by the research project impresso.


“impresso: Media monitoring of the past. Mining 200 years of historical newspapers” is a joint project run by the C²DH, the Digital Humanities Laboratory at the École polytechnique fédérale de Lausanne (EPFL) and the Institute of Computational Linguistics at the University of Zurich. This three-year project, funded by the Swiss National Science Foundation (SNSF), was launched in September 2017.

The aim of the project is to develop new research methods based on a digitised corpus of newspapers and journals from Switzerland, Luxembourg, France, Belgium and Germany, covering a period of nearly 200 years. These methods include optimising text recognition, improving the identification of people, institutions and places, and enhancing this entity recognition by drawing on external data repositories. Researchers in computational linguistics will also work on structuring digitised texts, enabling “distant reading” and providing multilingual search capabilities. One of the C²DH’s tasks will be to develop a user interface that gives access to these new tools.

“If these historical sources are to be used for academic purposes, it is vital to provide information about the origins of the data and the quality of automatically generated annotations,” explains Dr Marten Düring, coordinator of the project at the C²DH. He sees this “transparency”, as well as the principle of “generosity” – providing users with additional avenues to extend their research – as essential for the design of the interface. The project employs an interdisciplinary approach, with historians, computational linguists and designers working closely together.

Regular workshops are being held so that a panel of researchers can give feedback based on their own practical experience. A C²DH post-doctoral research project on movements that resisted the idea of European unification in the late 19th and early 20th centuries will also contribute to the development of the new tool. Finally, project findings will be incorporated into teaching programmes at the University of Lausanne.

Lectures and presentations

Bunout, Estelle. “impresso: Media monitoring of the past. Mining 200 years of historical newspapers or how to process data from media archives (and deal with digital bias)”. Workshop on the creation of an international Data for History consortium, 23-24/11/2017, Lyon.

https://www.c2dh.uni.lu/ ... new-consortium-enable-interoperability-historical-data

Düring, Marten; Bunout, Estelle. “Introducing impresso. Media Monitoring of the Past”. Forum Z: A new narrative for Europe: Quo Vadis?, 13/10/2017, Esch-sur-Alzette.

https://www.c2dh.uni.lu/forum-z/new-narrative-europe-quo-vadis

See also

BLIZAAR

Hybrid Visualisation of Dynamic Multilayer Graphs

BLIZAAR is a research project on novel visualisation techniques for data generated in the fields of humanities and biology.


Digital History and Hermeneutics DTU

Doctoral Training Unit on Digital History and Hermeneutics: Where new ideas thrive!

The field of digital humanities opens up a host of new possibilities for advancing knowledge in traditional humanities disciplines.


FAMOSO - Fabricating Modern Societies

The “Age of Steel” in Luxembourg revisited. Technologies of utopian capitalism and the making of national identity

The idea for the FAMOSO projects originated in May 2010, when Dr Karin Priem, their principal investigator, was introduced to a large holding of 2,251 glass plates archived at the Centre national de l’audiovisuel (CNA) in Dudelange.


histograph

Graph-based exploration and crowd-based indexation for multimedia collections

Multimedia collections can provide researchers and the general public with vast quantities of written and audiovisual material – but exploring these reams of data in an effective way is not always an easy task.


The Luxembourg nation and the Jews (1930s to 1950s)

A microhistorical approach

The persecution of the Jews by the Nazi regime occurred in occupied Luxembourg virtually throughout the Second World War.


RANKE 2.0

Digital source criticism in the 21st century

Source criticism is a vital part of historians’ work, and it generally features in all historiography syllabuses.


Éischte Weltkrich

A virtual exhibition on the Great War in Luxembourg

The initiatives for the Great War Centenary have offered an unprecedented chance to re-engage with an important but understudied period in Luxembourgish history. Based on research carried out by historians at the University of Luxembourg and with the support of the Ministry of State, in February 2016 the C²DH began developing a digital exhibition on the Great War in Luxembourg.
