ARGUMENT MINING AND ITS APPLICATIONS IN POLITICAL DEBATES

Presidential debates are significant moments in the history of presidential campaigns.
In these debates, candidates are challenged to discuss the main contemporary and
historical issues in the country and attempt to persuade voters in their favor.
These debates offer fertile ground for argumentative analysis of the structure and
strategy of political discourse.
Recent advances in machine learning and Natural Language Processing (NLP)
algorithms, driven by the rise of deep learning, have revolutionized many natural
language applications, and argument analysis from textual resources is no exception.
This dissertation targets argument mining from political debate data, a domain
rife with arguments put forward by politicians to convince the general public to
vote for them and to dissuade it from supporting the other candidates.
The main contributions of the thesis are: i) the creation, release, and reliability
assessment of a valuable resource for argumentation research; ii) the implementation
of a complete argument mining pipeline applying cutting-edge NLP technologies; and
iii) the launch of a demo tool for argumentative analysis of political debates.
The original dataset is composed of the transcripts of 41 presidential election
debates in the U.S. from 1960 to 2016.
Besides argument extraction from political debates, this research also aims at investigating
the practical applications of argument structure extraction, such as fallacious
argument classification and argument retrieval. In order to apply supervised
machine learning and NLP methods to the data, an extensive annotation study was
conducted, leading to the creation of a unique dataset of argument
structures composed of argument components (i.e., claim and premise) and argument
relations (i.e., support and attack). The dataset also includes a second annotation
layer covering six fallacious argument categories and 14 sub-categories annotated
on the debates. The final dataset is annotated with 32,296 argument components
(i.e., 16,982 claims and 15,314 premises), 25,012 relations (i.e., 3,723 attacks and
21,289 supports), and 1,628 fallacious arguments.
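To make the annotation scheme concrete, the following is a minimal Python sketch of how components and relations of this kind could be represented and counted. The class names, field names, and the three example sentences are illustrative assumptions, not the dataset's actual schema or content; only the reported totals are taken from the figures above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """An argument component (hypothetical representation)."""
    cid: int
    text: str
    kind: str  # "claim" or "premise"


@dataclass(frozen=True)
class Relation:
    """A directed relation between two components (hypothetical representation)."""
    source: int  # id of the supporting or attacking component
    target: int  # id of the component being supported or attacked
    kind: str    # "support" or "attack"


def summarize(components: List[Component], relations: List[Relation]) -> Counter:
    """Count components and relations by their label."""
    counts = Counter(c.kind for c in components)
    counts.update(r.kind for r in relations)
    return counts


# Invented toy example, not drawn from the debate transcripts.
components = [
    Component(1, "We should cut taxes.", "claim"),
    Component(2, "Lower taxes leave families more money to spend.", "premise"),
    Component(3, "Tax cuts would increase the deficit.", "claim"),
]
relations = [
    Relation(source=2, target=1, kind="support"),
    Relation(source=3, target=1, kind="attack"),
]
toy_counts = summarize(components, relations)

# Totals reported in the abstract; the sums check their internal consistency.
REPORTED = {"claim": 16_982, "premise": 15_314, "support": 21_289, "attack": 3_723}
assert REPORTED["claim"] + REPORTED["premise"] == 32_296
assert REPORTED["support"] + REPORTED["attack"] == 25_012
```

Storing relations as directed links between component identifiers keeps the two annotation types separate, so either can be counted or filtered independently.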
Methodologically, a complete argument mining pipeline is designed
and implemented, consisting of two main stages: argument component
detection and argument relation prediction. Each stage takes advantage of various
NLP models that outperform standard baselines in the area, with an average F-score
of 0.63 for argument component classification and 0.68 for argument relation classification.
Additionally, DISPUTool, an online tool for argumentative analysis, is developed
as a proof of concept. DISPUTool offers two main functionalities. First, it
allows users to explore the arguments in the dataset. Second, it extracts
arguments from text segments entered by the user, leveraging the embedded
trained model.
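The two-stage structure of the pipeline (component detection followed by relation prediction) can be sketched as below. This is only an illustration of the control flow: the cue-word rules are toy stand-ins for the trained NLP models used in the thesis, and all function names, type aliases, and example sentences are assumptions introduced here.

```python
from typing import List, Tuple

LabeledSpan = Tuple[str, str]   # (sentence text, "claim" or "premise")
Link = Tuple[int, int, str]     # (source index, target index, relation label)


def detect_components(sentences: List[str]) -> List[LabeledSpan]:
    """Stage 1 (toy stand-in): label sentences using simple cue words.

    Sentences matching no cue are treated as non-argumentative and dropped.
    """
    labeled = []
    for sentence in sentences:
        lowered = sentence.lower()
        if lowered.startswith(("because", "since")):
            labeled.append((sentence, "premise"))
        elif any(cue in lowered for cue in ("should", "must", "i believe")):
            labeled.append((sentence, "claim"))
    return labeled


def predict_relations(components: List[LabeledSpan]) -> List[Link]:
    """Stage 2 (toy stand-in): link each premise to the nearest preceding claim."""
    links = []
    last_claim = None
    for index, (_, label) in enumerate(components):
        if label == "claim":
            last_claim = index
        elif last_claim is not None:
            links.append((index, last_claim, "support"))
    return links


def run_pipeline(sentences: List[str]) -> Tuple[List[LabeledSpan], List[Link]]:
    """Run both stages in sequence, feeding stage 1 output into stage 2."""
    components = detect_components(sentences)
    return components, predict_relations(components)


# Invented input, not taken from the debate transcripts.
example = [
    "We must invest in schools.",
    "Because educated workers earn more.",
    "Good evening.",
]
found_components, found_links = run_pipeline(example)
```

In a real system each stage would be replaced by a trained classifier; the point here is only that relation prediction operates on the output of component detection.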
