Methods for Extracting Meta-Information from Bibliographic Databases

Written by

Maria Biryukov

Published on

15 November 2010

Owing to the rapid growth in electronically available publications, bibliographic databases have become widespread. They cover a broad range of fields and provide fast access to a wide variety of data. At the same time, they contain a wealth of hidden knowledge that can only be inferred through additional processing. In this work we focus on extracting such meta-knowledge from research bibliographic databases by examining them from sociolinguistic, text mining and bibliometric perspectives. We use the DBLP digital library and bibliographic database as the testbed for our experiments.
Within the sociolinguistic analysis, we build a statistical system that identifies the language of personal names. We also show that extending the purely statistical model with the co-author network boosts the system's performance.
In the text mining scenario, we perform a number of experiments on topic identification and ranking. While our topic detection approach is generic and can be applied to any kind of textual data, the topic ranking metrics build on the information provided by bibliographic databases.
The goal of our bibliometric study is to create a researcher's profile on DBLP and to analyze some of the research communities defined by different conferences in terms of publication activity, interdisciplinarity of research, collaboration trends and population stability. We also explore to what extent these aspects correlate with conference rank.
Each of the above topics constitutes a method of extracting meta-information from bibliographic databases and other similarly structured data sources.
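
To make the first of these methods more concrete, the following is a minimal sketch of how a statistical language identifier for personal names might be combined with co-author information. It is not the system described in the thesis: the character-trigram model with add-one smoothing, the class name `NameLanguageModel`, the `weight` parameter, the way co-author votes are mixed in as a smoothed prior, and the toy training names are all assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n=3):
    """Return the character n-grams of a name, padded at both ends."""
    padded = "^" * (n - 1) + name.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NameLanguageModel:
    """Hypothetical character n-gram classifier with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.totals = Counter()             # language -> total n-grams seen
        self.vocab = set()                  # all n-grams seen in training

    def train(self, labelled_names):
        for name, lang in labelled_names:
            for gram in char_ngrams(name, self.n):
                self.counts[lang][gram] += 1
                self.totals[lang] += 1
                self.vocab.add(gram)

    def log_score(self, name, lang):
        # Smoothed log-likelihood of the name under one language's model.
        v = len(self.vocab) + 1  # one extra slot for unseen n-grams
        return sum(
            math.log((self.counts[lang][gram] + 1) / (self.totals[lang] + v))
            for gram in char_ngrams(name, self.n)
        )

    def predict(self, name):
        return max(self.counts, key=lambda lang: self.log_score(name, lang))

    def predict_with_coauthors(self, name, coauthor_langs, weight=1.0):
        # Assumption: the languages already assigned to co-authors act as a
        # smoothed prior that is added to the name's own log-likelihood.
        votes = Counter(coauthor_langs)
        k = len(self.counts)

        def score(lang):
            prior = (votes[lang] + 1) / (len(coauthor_langs) + k)
            return self.log_score(name, lang) + weight * math.log(prior)

        return max(self.counts, key=score)


# Toy training data, invented for this example only.
model = NameLanguageModel()
model.train([
    ("Rossi", "Italian"), ("Bianchi", "Italian"),
    ("Ricci", "Italian"), ("Marini", "Italian"),
    ("Schmidt", "German"), ("Schneider", "German"),
    ("Schulz", "German"), ("Schwarz", "German"),
])

print(model.predict("Schmitz"))
print(model.predict_with_coauthors("Lee", ["German", "German", "German"]))
```

The second call illustrates the intuition behind the extension: a name such as "Lee" shares no trigrams with the toy training data, so the name model alone has almost nothing to go on, and the co-authors' languages decide the outcome.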

View this publication in our institutional repository (orbi.lu).

Author

Maria Biryukov

Maria was a research associate.

