Detecting Paraphrases for Portuguese using Word and Sentence Embeddings

DOI:

https://doi.org/10.21814/lm.10.2.286

Keywords:

Paraphrase Identification, Semantic Textual Similarity, Sentence Embeddings

Abstract

Paraphrase detection (or identification) is the task of determining whether two or more sentences of arbitrary length have the same meaning. Methods for solving this task have many potential applications in Natural Language Processing systems. This work investigates combining different methods of representing sentences in a vector space model of language with linear classifiers, applied to the problem of paraphrase identification for Portuguese. The results obtained are inferior to those reported for the related task of recognizing textual entailment in the ASSIN evaluation for Portuguese. We point out, however, that this work focuses on the application of sentence embeddings to paraphrase detection; other features usually explored in systems for this task may therefore be trivially incorporated into our method to improve performance.
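
The general idea of pairing sentence embeddings with a linear classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word vectors, the averaging scheme, the pair features (cosine similarity plus absolute differences), and the perceptron are all assumptions chosen to keep the example self-contained. A real system would load pretrained Portuguese embeddings and could use any linear classifier.

```python
import math

# Hypothetical toy word vectors, invented for illustration only.
# A real system would load pretrained Portuguese word embeddings.
TOY_VECTORS = {
    "o": [0.1, 0.0, 0.1],
    "gato": [0.9, 0.1, 0.0],
    "felino": [0.8, 0.2, 0.1],
    "dorme": [0.0, 0.9, 0.1],
    "descansa": [0.1, 0.8, 0.2],
    "carro": [0.0, 0.1, 0.9],
    "corre": [0.2, 0.0, 0.8],
}
DIM = 3


def sentence_embedding(sentence):
    """Average the vectors of the known words in the sentence."""
    vectors = [TOY_VECTORS[t] for t in sentence.lower().split() if t in TOY_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_features(s1, s2):
    """Features for a sentence pair: cosine similarity and absolute differences."""
    u, v = sentence_embedding(s1), sentence_embedding(s2)
    return [cosine(u, v)] + [abs(a - b) for a, b in zip(u, v)]


def predict(weights, bias, features):
    """Linear decision rule: 1 (paraphrase) if the score is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Train a perceptron, used here as a simple stand-in linear classifier."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            if error:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


# Toy labelled pairs (1 = paraphrase, 0 = not a paraphrase), invented for the sketch.
pairs = [
    ("o gato dorme", "o felino descansa"),
    ("o gato dorme", "o carro corre"),
    ("o carro corre", "o carro corre"),
    ("o felino descansa", "o carro corre"),
]
labels = [1, 0, 1, 0]

X = [pair_features(a, b) for a, b in pairs]
weights, bias = train_perceptron(X, labels)
predictions = [predict(weights, bias, x) for x in X]
print(predictions)
```

Because the classifier only sees a feature vector per sentence pair, additional features of the kind the abstract mentions could be appended to the list returned by `pair_features` without changing the rest of the sketch.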

Published

2019-01-24

Section

POP - By Other Words

How to Cite

Detecting Paraphrases for Portuguese using Word and Sentence Embeddings. (2019). Linguamática, 10(2), 31-44. https://doi.org/10.21814/lm.10.2.286