An Unsolicited Soliloquy on Dependency Parsing

Anderson, Mark

Supervised by:

Carlos Gómez Rodríguez Director

Defence university: Universidade da Coruña

Defence date: 17 September 2021

Committee:

Lluís Padró Cirera Chair
David Vilares Calvo Secretary
Barbara Plank Committee member

Department:

Computer Science and Information Technologies

Type: Thesis

Teseo: 673041

Abstract

This thesis presents work on dependency parsing covering two distinct lines of research. The first aims to develop efficient parsers that are fast enough to parse large amounts of data while still maintaining decent accuracy. We investigate two techniques to achieve this: a cognitively-inspired method and a model distillation method. The first technique proved to be utterly dismal, while the second was somewhat of a success. The second line of research evaluates parsers, again in two ways. First, we evaluate what causes variation in parsing performance across different algorithms and different treebanks. This evaluation is grounded in dependency displacements (the directed distance between a dependent and its head) and in the displacement distributions associated with algorithms and those found in treebanks. This work sheds some light on the variation in performance for both different algorithms and different treebanks. Second, we focus on the utility of part-of-speech tags when used with parsing systems and question the standard assumption that they might help but certainly won’t hurt.
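
As an illustration of the dependency displacement mentioned in the abstract, the following is a minimal sketch and not the thesis's own implementation. It assumes each sentence is given as a list of 1-based head indices with 0 marking the root, that the displacement is computed as head position minus dependent position (the sign convention is an assumption), and that root tokens are skipped. The function names `dependency_displacements` and `displacement_distribution` are hypothetical.

```python
from collections import Counter


def dependency_displacements(heads):
    """Return the directed head-dependent distances for one sentence.

    heads[i] is the 1-based position of the head of token i+1;
    0 marks the root (assumed convention, as in CoNLL-style data).
    """
    displacements = []
    for dep, head in enumerate(heads, start=1):
        if head == 0:
            continue  # assumption: the root has no in-sentence head, so skip it
        # assumption: positive means the head is to the right of the dependent
        displacements.append(head - dep)
    return displacements


def displacement_distribution(treebank):
    """Return the relative frequency of each displacement over many sentences."""
    counts = Counter()
    for heads in treebank:
        counts.update(dependency_displacements(heads))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: c / total for d, c in counts.items()}


# Toy treebank: "The dog barked" and "She saw him" (hand-annotated for illustration).
toy_treebank = [[2, 3, 0], [2, 0, 2]]
distribution = displacement_distribution(toy_treebank)
```

Under these assumptions, the same function could be applied to the trees a parser predicts and to the gold trees of a treebank, so that the two resulting distributions can be compared.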