Automatic categorization of Spanish texts into linguistic registers: a contrastive analysis

Keywords:

Natural language processing, machine learning, linguistic register

Abstract

Collaborative software such as recommender systems can benefit from the automatic classification of texts into linguistic registers. First, the linguistic register provides information about users' profiles and the context of the recommendation. Second, taking the characteristics of each type of text into account can help improve existing natural language processing methods. In this paper we contrast two approaches to register categorization for Spanish: the first focuses on morphosyntactic patterns and the second on lexical patterns. For the experimental evaluation we tested 38 machine learning algorithms, obtaining a precision higher than 89%.

Author Biographies

  • John Roberto Rodríguez, University of Barcelona

    Predoctoral fellow (FI)

    Centre de Llenguatge i Computació (CLiC)

    Department of Linguistics

    University of Barcelona

  • Maria Salamó Llorente, University of Barcelona

    Faculty member in the Department of Applied Mathematics and Analysis

    University of Barcelona

  • Maria Antònia Martí Antonín, University of Barcelona

    Head of the Department of General Linguistics

    Director of CLiC, Centre de Llenguatge i Computació

    University of Barcelona

Published

2013-07-20

Section

Research Articles

How to Cite

Automatic categorization of Spanish texts into linguistic registers: a contrastive analysis. (2013). Linguamática, 5(1), 59-67. https://www.linguamatica.com/index.php/linguamatica/article/view/153