Relevance-based language models: new estimations and applications

  1. Parapar, Javier
Supervised by:
  1. Álvaro Barreiro García (Supervisor)

Defending university: Universidade da Coruña

Defense date: 12 July 2013

Examination committee:
  1. Fabio Crestani (Chair)
  2. José Luis Freire Nistal (Secretary)
  3. Leif Azzopardi (Member)
  4. David Enrique Losada Carril (Member)
  5. Pablo Castells Azpilicueta (Member)
Department:
  1. Ciencias de la Computación y Tecnologías de la Información (Computer Science and Information Technologies)

Type: Thesis

Teseo: 338145

Abstract

Relevance-Based Language Models introduced the concept of relevance into the Language Modelling framework; relevance is explicit in other retrieval models, such as the Probabilistic models. Relevance Models have mainly been used for a specific Information Retrieval task called Pseudo-Relevance Feedback, a local query expansion technique in which the top-ranked documents from an initial retrieval are assumed to be relevant and are used to select expansion terms for the original query, producing a second, hopefully more effective, retrieval. In this thesis we investigate new estimations of Relevance Models, both for Pseudo-Relevance Feedback and for tasks beyond retrieval, in particular constrained text clustering and item recommendation in Recommender Systems. We study the benefits of our proposals for those tasks in comparison with existing estimations. The new models not only improve the effectiveness of existing estimations and methods but also improve on their robustness, a critical factor in Pseudo-Relevance Feedback methods. These objectives are pursued by different means: promoting divergent terms in the estimation of the Relevance Models, presenting new cluster-based retrieval models, introducing new methods for automatically determining the size of the pseudo-relevant set on a per-query basis, and producing original models under the Relevance-Based Language Modelling framework for the constrained text clustering and item recommendation problems.
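
As an illustration of the Pseudo-Relevance Feedback idea described in the abstract, the following is a minimal Python sketch of the widely used RM1 estimation of a Relevance Model. It is not one of the new estimations proposed in the thesis. The function name `rm1_expansion_terms`, its parameters (`k`, `n_terms`, `mu`), the choice of Dirichlet smoothing, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def rm1_expansion_terms(query, ranked_docs, k=2, n_terms=3, mu=10.0):
    """Sketch of RM1: P(w|R) is proportional to sum over d of P(w|d) * P(Q|d).

    query       -- list of query terms
    ranked_docs -- tokenised documents, already ordered by an initial retrieval
    k           -- size of the pseudo-relevant set (assumed fixed here)
    mu          -- Dirichlet smoothing parameter (illustrative value)
    """
    feedback = ranked_docs[:k]  # top-k documents are assumed relevant

    # Collection statistics used to smooth the document language models.
    collection = Counter(t for doc in ranked_docs for t in doc)
    coll_len = sum(collection.values())

    def p_term(term, tf, doc_len):
        # Dirichlet-smoothed estimate of P(term | document).
        return (tf[term] + mu * collection[term] / coll_len) / (doc_len + mu)

    # Candidate expansion terms: the vocabulary of the feedback set.
    vocab = {t for doc in feedback for t in doc}

    scores = Counter()
    for doc in feedback:
        tf, doc_len = Counter(doc), len(doc)
        # Query likelihood P(Q|d), used as the weight of this document.
        q_lik = math.prod(p_term(q, tf, doc_len) for q in query)
        for term in vocab:
            scores[term] += p_term(term, tf, doc_len) * q_lik

    total = sum(scores.values())
    if total == 0:  # e.g. a query term never seen in the collection
        return []
    # Normalise so the estimates form a distribution over the vocabulary.
    return [(t, s / total) for t, s in scores.most_common(n_terms)]


# Toy usage: three tiny "documents" standing in for an initial ranking.
docs = [
    ["language", "model", "retrieval", "relevance"],
    ["relevance", "feedback", "query", "expansion"],
    ["clustering", "text"],
]
expansion = rm1_expansion_terms(["relevance"], docs, k=2, n_terms=3)
```

In this sketch the size of the pseudo-relevant set `k` is a fixed argument; the abstract mentions methods for choosing that size automatically per query, which are not reproduced here.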