An Unsolicited Soliloquy on Dependency Parsing

  1. Anderson, Mark
Supervised by:
  1. Carlos Gómez Rodríguez Director

Defence university: Universidade da Coruña

Defence date: 17 September 2021

Committee:
  1. Lluís Padró Cirera Chair
  2. David Vilares Calvo Secretary
  3. Barbara Plank Committee member
Department:
  1. Computer Science and Information Technologies

Type: Thesis

Teseo: 673041

Abstract

This thesis presents work on dependency parsing along two distinct lines of research. The first line aims to develop efficient parsers that are fast enough to process large amounts of data while still maintaining decent accuracy. We investigate two techniques to achieve this: a cognitively inspired method and a model distillation method. The former proved to be utterly dismal, while the latter was somewhat of a success. The second line of research evaluates parsers, also in two ways. First, we examine what causes variation in parsing performance across different algorithms and different treebanks. This evaluation is grounded in dependency displacements (the directed distance between a dependent and its head) and in the resulting displacement distributions associated with algorithms and found in treebanks. This work sheds some light on the variation in performance for both different algorithms and different treebanks. Second, we focus on the utility of part-of-speech tags when used with parsing systems and question the standard assumption that they might help but certainly won’t hurt.
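
To make the notion of a dependency displacement concrete, the following is a minimal sketch rather than the thesis's own implementation. It assumes head indices in the 1-based style of CoNLL-U, with 0 marking the root, and it assumes the sign convention "head position minus dependent position"; the abstract only states that the distance is directed, so the opposite sign is equally plausible. The function name `dependency_displacements` and the example sentence are hypothetical.

```python
from collections import Counter


def dependency_displacements(heads):
    """Return the directed head-dependent distance for each non-root token.

    heads[i] is the 1-based position of the head of token i + 1,
    and 0 marks the root (an assumed, CoNLL-U-like encoding).
    """
    displacements = []
    for dependent, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root has no head token, so no displacement
        # Assumed sign convention: positive when the head follows the dependent.
        displacements.append(head - dependent)
    return displacements


# Hypothetical example: "She reads long books"
# She -> reads, reads -> ROOT, long -> books, books -> reads
heads = [2, 0, 4, 2]
displacements = dependency_displacements(heads)

# A displacement distribution is then just a count over these values.
distribution = Counter(displacements)
```

Computing such counts over every sentence in a treebank, or over a parser's predicted trees, gives the kind of displacement distribution the abstract refers to.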