Selección de colocaciones académicas en español a través de un filtro de interdisciplinariedad [Selection of academic collocations in Spanish through an interdisciplinarity filter]
- Guzzi, Eleonora
- Alonso Ramos, Margarita
ISSN: 1135-5948
Year of publication: 2022
Issue: 69
Pages: 83-94
Type: Article
More publications in: Procesamiento del lenguaje natural
Abstract
This paper proposes a methodology for compiling a list of noun-based academic collocations to feed a lexical tool (Author, 2017). To this end, a filter is established that measures the interdisciplinarity of the academic nouns from which the collocations are extracted (García-Salido, 2021). The filter retains nouns that are frequent and evenly distributed across academic disciplines, and discards those that belong to terminology or are more typical of general language. Three criteria are used: (1) inverse document frequency (IDF; Jones, 1972); (2) an analysis of collocation distributions; and (3) a comparison with academic vocabulary lists for English. The results show that these criteria help identify nouns prototypical of academic discourse and make it possible to filter the list of academic collocations. However, the problem of semantic disambiguation across disciplines remains open.
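As a rough illustration of the first criterion, the sketch below computes IDF in the sense of Jones (1972) while treating each discipline as a single "document". The abstract does not say how the corpus is partitioned, which logarithm base is used, or where the cut-off lies, so the per-discipline noun sets, the natural logarithm, the `max_idf` threshold and both function names are assumptions made only for this example.

```python
import math


def idf(noun, disciplines):
    """IDF of a noun, counting each discipline as one document (assumption)."""
    n_containing = sum(1 for nouns in disciplines.values() if noun in nouns)
    if n_containing == 0:
        raise ValueError(f"{noun!r} does not occur in any discipline")
    # Natural logarithm chosen for illustration; any base preserves the ranking.
    return math.log(len(disciplines) / n_containing)


def interdisciplinary_nouns(disciplines, max_idf):
    """Return, sorted, the nouns whose IDF does not exceed a hypothetical cut-off."""
    vocabulary = set().union(*disciplines.values())
    return sorted(n for n in vocabulary if idf(n, disciplines) <= max_idf)


# Toy data, invented for the example: nouns attested in each discipline.
disciplines = {
    "biology": {"análisis", "resultado", "célula"},
    "law": {"análisis", "resultado", "sentencia"},
    "physics": {"análisis", "resultado", "partícula"},
    "linguistics": {"análisis", "colocación"},
}

selected = interdisciplinary_nouns(disciplines, max_idf=0.3)
print(selected)
```

A low IDF means the noun occurs in most disciplines, so it is a candidate for the academic list; a high IDF points to a noun confined to one field, which is more likely to be terminology. IDF on its own does not separate academic nouns from general-language ones, which is presumably why the abstract pairs it with two further criteria.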
Bibliographic References
- Ackermann, K. and Chen, Y.H. 2013. Developing the Academic Collocation List (ACL): A corpus-driven and expert-judged approach. Journal of English for Academic Purposes, 12(4): 235–247.
- Ahumada, I., Zamorano, J. P., García, E. D. R. and Lara, I. A. 2011. Design and development of Iberia: a corpus of scientific Spanish. Corpora, 6(2): 145-158.
- Alonso-Ramos, M., García-Salido, M. and García, M. 2017. Exploiting a Corpus to Compile a Lexical Resource for Academic Writing: Spanish Lexical Combinations. In I. Kosem, J. Kallas, C. Tiberius, S. Krek, M. Jakubíček and V. Baisa (Eds.), Proceedings of eLex 2017 conference, pages 571-586. Leiden, the Netherlands.
- Cambridge Dictionary. Accessed 27 March 2022 at: https://dictionary.cambridge.org/us/dictionary/.
- Cobb, T. and Horst, M. 2004. Is there room for an academic word list in French? In P. Bogaards and B. Laufer (Eds.), Vocabulary in a Second Language. Selection, acquisition, and testing, pages 13-38, John Benjamins (Amsterdam/Philadelphia).
- Coxhead, A. 2000. A new academic word list. TESOL Quarterly, 34(2): 213–238.
- Drouin, P. 2007. Identification automatique du lexique scientifique transdisciplinaire. Revue française de linguistique appliquée, 7(2): 45-64.
- Frankenberg-Garcia, A., Lew, R., Roberts, J. C., Rees, G. P. and Sharma, N. 2019. Developing a writing assistant to help EAP writers with collocations in real time. ReCALL, 31(1): 23-39.
- García-Salido, M. 2021. Compiling an Academic Vocabulary List of Spanish. Available at: https://doi.org/10.13140/RG.2.2.27681.33123.
- Garcia, M. and Gamallo, P. 2016. Yet another suite of multilingual NLP tools. In J. P. Leal, J. L. Sierra-Rodríguez et al. (Eds.), Languages, Applications and Technologies. Communications in Computer and Information Science, pages 65–75, Springer (Cham).
- Gardner, D. and Davies, M. 2013. A new academic vocabulary list. Applied Linguistics, 35(3): 305–327.
- Gilquin, G., Granger, S. and Paquot, M. 2007. Learner corpora: The missing link in EAP pedagogy. Journal of English for Academic Purposes, 6(4): 319-335.
- Gries, S. T. 2008. Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics, 13(4): 403–437.
- Hatier, S., Augustyn, M., Tran, T. T. H., Yan, R., Tutin, A. and Jacques, M. P. 2016. French cross-disciplinary scientific lexicon: extraction and linguistic analysis. In Proceedings of EURALEX, pages 355-366, Ivane Javakhishvili Tbilisi State University (Tbilisi).
- Hyland, K. and Tse, P. 2007. Is there an “academic vocabulary”? TESOL Quarterly, 41(2): 235-253.
- Hyland, K. 2008. As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27(1): 4–21.
- Jones, K. S. 1972. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28(1): 11-21.
- Kilgarriff, A. and Renau, I. 2013. esTenTen, a vast web corpus of Peninsular and American Spanish. Procedia-Social and Behavioral Sciences, 95: 12-19.
- Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J. and Suchomel, V. 2014. The Sketch Engine: ten years on. Lexicography, 1(1): 7–36.
- Lei, L. and Liu, D. 2018. The academic English collocation list: A corpus-driven study. International Journal of Corpus Linguistics, 23(2): 216-243.
- LINGUEE. Accessed 28 March 2022 at: http://www.linguee.es.
- Melčuk, I. 2012. Phraseology in the language, in the dictionary, and in the computer. Yearbook of phraseology, 3(1): 31-56.
- Nivre, J., Marneffe, M.-C. D., Ginter, F., Goldberg, Y., Manning, C. D., Mcdonald, R., Petrov, S., Pyysalo, S., Silveira, N., Tsarfaty, R. and Zeman, D. 2016. Universal Dependencies v1: A Multilingual Treebank Collection. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pages 1659–1666, European Language Resources Association (ELRA).
- Oxford English Dictionary. Accessed 27 March 2022 at: https://www.oed.com/.
- Padró, L. and Stanilovsky, E. 2012. Freeling 3.0: Towards wider multilinguality. In N. Calzolari et al. (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC2012), pages 2473–2479, European Language Resources Association (ELRA).
- Paquot, M. 2007. Towards a productively-oriented academic word list. In J. Walinski, K. Kredens and S. Gozdz Roszkowski (Eds.), Practical Applications in Language and Computers 2005, pages 127–140. Peter Lang (Frankfurt am Main).
- Paquot, M. and Bestgen, Y. 2009. Distinctive words in academic writing: A comparison of three statistical tests for keyword extraction. Language and Computers, 68(1): 247-269.
- Paquot, M. 2012. The LEAD dictionary-cum-writing aid: An integrated dictionary and corpus tool. In S. Granger and M. Paquot (Eds.), Electronic lexicography, pages 161-186, Oxford University Press (Oxford).
- Real Academia Española: Diccionario de la lengua española, 23rd ed. (online version 23.5). Accessed 25 March 2022 at: https://dle.rae.es.
- Sebastián-Gallés, N., Martí Antonín, M.A., Carreiras Valiña, M. F. and Cuetos Vega, F. 2000. LEXESP: Léxico informatizado del español. Barcelona: Edicions de la Universitat de Barcelona.
- Straka, M., Hajic, J. and Straková, J. 2016. Udpipe: Trainable pipeline for processing conll-u files performing tokenization, morphological analysis, pos tagging and parsing. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pages 1659–1666, European Language Resources Association (ELRA).
- Tutin, A. 2007a. Autour du lexique et de la phraséologie des écrits scientifiques. Revue française de linguistique appliquée, 12(2): 5-14.