Compositional language processing for multilingual sentiment analysis

  1. Vilares Calvo, David
Supervised by:
  1. Miguel Á. Alonso Co-director
  2. Carlos Gómez Rodríguez Co-director

Defence university: Universidade da Coruña

Defence date: 21 June 2017

Committee:
  1. Yulan He Chair
  2. Javier Parapar Secretary
  3. Alexandra Balahur Dobrescu Committee member
Department:
  1. Computer Science and Information Technologies

Type: Thesis

Teseo: 485934

Abstract

This dissertation presents new approaches to sentiment analysis and polarity classification, aimed at obtaining the sentiment of a phrase, sentence or document from a natural language processing point of view. It places special emphasis on methods for handling semantic compositionality, i.e. the ability to compose the sentiment of multiword phrases, where the global sentiment might differ from, or even be opposite to, that of their individual components, and on the application of these methods to multilingual scenarios. On the one hand, we introduce knowledge-based approaches to calculate the semantic orientation at the sentence level that can handle the relevant linguistic phenomena (e.g. negation, intensification or adversative subordinate clauses). On the other hand, we describe how to build machine learning models that perform polarity classification from a different perspective, combining linguistic (lexical, syntactic and semantic) knowledge, with an emphasis on noisy texts and micro-texts. Experiments on standard corpora and in international evaluation campaigns show the competitiveness of the proposed methods in monolingual, multilingual and code-switching scenarios. The contributions presented in the thesis have potential applications in the era of Web 2.0 and social media, such as determining how society views products, celebrities or events, identifying their strengths and weaknesses, or monitoring how these opinions evolve over time. We also show how some of the proposed models can be useful for other data analysis tasks.