Boosting Dependency Parsing Performance by Incorporating Additional Features for Agglutinative Languages

Date

2022

Journal Title

Journal ISSN

Volume Title

Publisher

CEUR-WS

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Language models have been used increasingly in recent studies and have yielded notable improvements. However, choosing a proper representation and accounting for complementary components remain open issues. In this research, we demonstrate the impact of sub-word level, sentence piece based word representations on dependency parsing performance for agglutinative languages. Furthermore, we propose using a sentence representation that captures the meaning of the whole sentence as an additional feature to improve dependency parsing. We evaluate the proposed enhancements on nine agglutinative languages: Estonian, Finnish, Hungarian, Indonesian, Japanese, Kazakh, Korean, Turkish, and Uyghur. We found that sentence piece based token encoding improved parsing performance for the majority of these languages. Using the meaning of the entire sentence as a complementary feature improved parsing performance for six of the nine languages. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Description

2022 International Conference and Workshop on Agglutinative Language Technologies as a Challenge of Natural Language Processing, ALTNLP 2022 -- 7 June 2022 through 8 June 2022 -- Virtual, Online -- 185890

Keywords

agglutinative languages, dependency parsing, sentence piece, sentence representation

Source

CEUR Workshop Proceedings

WoS Q Value

Scopus Q Value

N/A

Volume

3315

Issue

Citation