Exploring the effectiveness of neural language models for identification and classification of lexical collocations

Authors

  • Radovan Milovic, Universidad de Santiago de Compostela

DOI:

https://doi.org/10.21814/lm.16.1.428

Keywords:

lexical collocations, lexical functions, neural language models, fine-tuning

Abstract

Most research on automated collocation processing has relied on association measures. However, attention has gradually been shifting toward the effectiveness of neural language models (NLMs). In this paper, we investigate the latter by fine-tuning BERT-family models for English, Spanish, and Portuguese on lexical resources annotated with Lexical Functions (LFs). We examine the capabilities of these models for the identification and classification of lexical collocations in both monolingual and multilingual scenarios. Overall performance varied, with F1 scores ranging from 0.30 to 0.51. We conclude that the multilingual model excels in cross-lingual learning when a combined training set of all three languages is employed. Moreover, despite some variability, the results show that Lexical Functions with a larger number of instances in the training set are identified more reliably. Lastly, we conduct a qualitative analysis to investigate possible patterns of misidentification exhibited by the model.

Published

2024-06-27

Section

New Perspectives

How to Cite

Milovic, R. (2024). Exploring the effectiveness of neural language models for identification and classification of lexical collocations. Linguamática, 16(1), 17-28. https://doi.org/10.21814/lm.16.1.428