Lexicometric and Informational Measures in Historical and Literary Corpora

Frequency values and their distribution are used to compute informational measures in multilingual historical and literary corpora.

How can word frequency be interpreted in an informational analysis of textual corpora? To what extent can frequency values and their distribution serve as an indicator of the “amount of information”, the degree of “certainty”, or the “informativeness” conveyed by a text? Do other factors, such as language and genre, influence these informational measures? The paper addresses these questions from a comparative perspective, using multilingual historical and literary corpora in Romanian, French, and English.