Relevance-based language models: new estimations and applications

Author:
  1. Parapar, Javier
Supervised by:
  1. Álvaro Barreiro García Director

Defence university: Universidade da Coruña

Defence date: 12 July 2013

Committee:
  1. Fabio Crestani Chair
  2. José Luis Freire Nistal Secretary
  3. Leif Azzopardi Committee member
  4. David Enrique Losada Carril Committee member
  5. Pablo Castells Azpilicueta Committee member
Department:
  1. Computer Science and Information Technologies

Type: Thesis

Teseo: 338145

Abstract

Relevance-Based Language Models introduced the concept of relevance into the Language Modelling framework, a concept that is explicit in other retrieval models such as the probabilistic models. Relevance Models have mainly been used for a specific Information Retrieval task called Pseudo-Relevance Feedback, a local query expansion technique in which the top-ranked documents from the initial retrieval are assumed to be relevant and are used to select expansion terms for the original query, producing a second and hopefully more effective retrieval. In this thesis we investigate new estimations of Relevance Models, both for Pseudo-Relevance Feedback and for tasks beyond retrieval, in particular constrained text clustering and item recommendation in Recommender Systems. We study the benefits of our proposals for those tasks in comparison with existing estimations. The new models improve not only on the effectiveness of existing estimations and methods but also on their robustness, a critical factor in Pseudo-Relevance Feedback methods. These objectives are pursued by different means: promoting divergent terms in the estimation of the Relevance Models, presenting new cluster-based retrieval models, introducing new methods for automatically determining the size of the pseudo-relevant set on a per-query basis, and producing original models under the Relevance-Based Language Modelling framework for the constrained text clustering and item recommendation problems.
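The Pseudo-Relevance Feedback loop described in the abstract can be illustrated with a minimal sketch of a baseline relevance-model estimate in the style usually called RM1. It weights each top-ranked document by its query likelihood and sums the smoothed term probabilities. This is not one of the new estimations proposed in the thesis. The function name `rm1_expansion_terms`, the Jelinek-Mercer smoothing, and the defaults for `k`, `n_terms` and `lam` are assumptions chosen for illustration, and the "collection" is simply the list of documents passed in.

```python
from collections import Counter


def rm1_expansion_terms(query, ranked_docs, k=2, n_terms=3, lam=0.5):
    """Return up to n_terms (term, probability) pairs from an RM1-style estimate.

    query: list of tokens. ranked_docs: list of token lists, best first.
    k, n_terms and lam are illustrative defaults, not values from the thesis.
    """
    # Pseudo-relevant set: the top-k non-empty documents of the initial ranking.
    top = [d for d in ranked_docs[:k] if d]

    # Background (collection) statistics, used here only for smoothing.
    collection = Counter(t for d in ranked_docs for t in d)
    coll_len = sum(collection.values())
    if not top or coll_len == 0:
        return []

    def p_smoothed(term, doc_tf, doc_len):
        # Jelinek-Mercer smoothing of the document language model (an assumption).
        return (1 - lam) * doc_tf[term] / doc_len + lam * collection[term] / coll_len

    # Candidate expansion terms: every term seen in the pseudo-relevant set.
    vocab = {t for d in top for t in d}

    scores = Counter()
    for doc in top:
        tf, dl = Counter(doc), len(doc)
        # Query likelihood P(Q|d) acts as the weight of this document.
        q_lik = 1.0
        for q in query:
            q_lik *= p_smoothed(q, tf, dl)
        for term in vocab:
            scores[term] += p_smoothed(term, tf, dl) * q_lik

    total = sum(scores.values())
    if total == 0:
        return []
    # Sort by descending probability, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(t, s / total) for t, s in ranked[:n_terms]]
```

In a full feedback cycle the returned terms would be added to the original query, typically with interpolation weights, before the second retrieval. How many documents to treat as pseudo-relevant (`k` here) is precisely the kind of parameter the thesis proposes to determine automatically per query.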