Towards a Graded Dictionary of Spanish Collocations

  1. Marcos García Salido 1
  2. Marcos Garcia 1
  3. Margarita Alonso-Ramos 1
  1. 1 Universidade da Coruña

    Universidade da Coruña

    La Coruña, Spain

    ROR https://ror.org/01qckj285

Book:
Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference. 1-3 October 2019, Sintra, Portugal
  1. Iztok Kosem (ed.)
  2. Tanara Zingano Kuhn (ed.)
  3. Margarita Correia (ed.)
  4. José Pedro Ferreira (ed.)
  5. Maarten Jansen (ed.)
  6. Isabel Pereira (ed.)
  7. Jelena Kallas (ed.)
  8. Miloš Jakubíček (ed.)
  9. Simon Krek (ed.)
  10. Carole Tiberius (ed.)

Publisher: Lexical Computing

Year of publication: 2019

Pages: 849-864

Congress: eLEX : Electronic lexicography in the 21st century (6. 2019. Sintra)

Type: Conference paper

Abstract

Several recent studies have observed that texts of different quality and written by learners at different proficiency levels also vary in the lexical combinations they contain. Such variation can be operationalized by quantitatively measuring the association between the components of these lexical combinations. In particular, pointwise mutual information (MI) has proved to be a good predictor of proficiency development, as several studies on English learners’ writing have shown. This paper examines whether association measures are also a good predictor for the proficiency level of texts written by learners of Spanish, with a view to using such information for grading lexical combinations in order to include them in a collocation dictionary of Spanish. The study also investigates whether the association measures that correlate with learners’ proficiency level can discriminate between phraseological collocations and non-collocations. Our results show that, whereas the MI of learner texts’ lexical combinations is a better predictor of author proficiency than frequency, the latter performs better in identifying phraseological collocations among the whole set of lexical combinations.
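
For readers unfamiliar with the measure, the following minimal sketch shows how pointwise mutual information is commonly computed from corpus counts, as log2(P(x, y) / (P(x) P(y))). The record does not state the exact formula, corpus, or tooling used in the paper, so the function name `pointwise_mi`, its parameters, and the counts are hypothetical and serve only as an illustration.

```python
import math


def pointwise_mi(pair_count, count_x, count_y, total):
    """Pointwise mutual information in bits: log2(P(x, y) / (P(x) * P(y))).

    pair_count: co-occurrences of the two words
    count_x, count_y: individual frequencies of each word
    total: corpus size used to turn counts into probabilities
    (Hypothetical helper; not taken from the paper.)
    """
    if min(pair_count, count_x, count_y, total) <= 0:
        raise ValueError("all counts must be positive")
    # P(x, y) / (P(x) * P(y)) simplifies to pair_count * total / (count_x * count_y).
    return math.log2(pair_count * total / (count_x * count_y))


# Toy counts invented for illustration only (not data from the paper).
score = pointwise_mi(pair_count=8, count_x=16, count_y=32, total=1024)
print(score)  # 4.0
```

A score of 0 means the two words co-occur exactly as often as chance would predict; higher scores indicate a stronger association. Raw frequency, the other measure mentioned in the abstract, corresponds simply to `pair_count`.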