Digital History and Historiography

Collecting, analysing and visualising documents in the political domain

1 October 2020

Online lecture by Sara Tonelli, head of the Digital Humanities group at Fondazione Bruno Kessler (Trento).



In this talk, Sara Tonelli will present the main challenges in dealing with documents in the political domain, discussing how agreement and disagreement between two opponents can be captured automatically. The task is particularly challenging when the data are not in dialogical format and the goal is to infer the political positions of two opponents from monological corpora. She will present experiments that apply machine learning and linguistically motivated features to documents from the Kennedy and Nixon presidential campaigns of 1960.

She will also introduce two projects dealing with Italian digital archives in the political domain: Alcide De Gasperi’s collection of public documents and the digital edition of his letters. She will describe the pipeline adopted, from digitisation to data exploration, the annotation of the corpus and some preliminary analyses of the content of the archives. She will also present the architecture created to transcribe the letters, perform quality control and publish them online on the project portal. Her analyses show how distant reading can complement close reading by comparing the content of the two collections and providing new insights into De Gasperi’s life.


Sara Tonelli has been the head of the Digital Humanities group at Fondazione Bruno Kessler (Trento) since 2013. She is currently involved in the H2020 PERCEPTIONS project, which studies perceptions of the EU and the narratives around migration to the EU. She has previously been involved in several digital humanities projects.

She is also an adjunct professor of Language Interfaces (jointly with Daniele Falavigna) at the Dept. of Psychology and Cognitive Science, University of Trento.

She has previously been involved in several European projects: Pescado (FP7 - keyword extraction), Terence (FP7 - event-based text simplification), NewsReader (FP7 - event extraction and semantic role recognition), SIMPATICO (H2020 - text simplification in the administrative domain), HATEMETER (REC - social media monitoring for Islamophobia detection), CREEP (EitDIGITAL - cyberbullying detection).


Thursday, 1 October 2020


Online, via Webex. To register, please email vanessa.napolitano@ext.uni.lu