Ekin Akyürek,
Erenay Dayanık and
Deniz Yuret.
2019.
Morphological Analysis Using a Sequence Decoder.
Transactions of the Association for Computational Linguistics, vol
7, pp
567--579,
Sep. [
ai.ku]
url url abstract google scholar
We introduce Morse, a recurrent encoder-decoder model that produces morphological analyses of each word in a sentence. The encoder turns the relevant information about the word and its context into a fixed size vector representation and the decoder generates the sequence of characters for the lemma followed by a sequence of individual morphological features. We show that generating morphological features individually rather than as a combined tag allows the model to handle rare or unseen tags and outperform whole-tag models. In addition, generating morphological features as a sequence rather than e.g. an unordered set allows our model to produce an arbitrary number of features that represent multiple inflectional groups in morphologically complex languages. We obtain state-of-the art results in nine languages of different morphological complexity under low-resource, high-resource and transfer learning settings. We also introduce TrMor2018, a new high accuracy Turkish morphology dataset. Our Morse implementation and the TrMor2018 dataset are available online to support future research.
Ömer Kırnap,
Erenay Dayanık and
Deniz Yuret.
2018.
Tree-Stack LSTM in Transition Based Dependency Parsing. In
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp
124--132,
Brussels, Belgium,
October.
Association for Computational Linguistics. [
ai.ku]
url abstract google scholar
We introduce tree-stack LSTM to model state of a transition based parser with recurrent neural networks. Tree-stack LSTM does not use any parse tree based or hand-crafted features, yet performs better than models with these features. We also develop new set of embeddings from raw features to enhance the performance. There are 4 main components of this model: stack’s σ-LSTM, buffer’s β-LSTM, actions’ LSTM and tree-RNN. All LSTMs use continuous dense feature vectors (embeddings) as an input. Tree-RNN updates these embeddings based on transitions. We show that our model improves performance with low resource languages compared with its predecessors. We participate in CoNLL 2018 UD Shared Task as the ”KParse” team and ranked 16th in LAS, 15th in BLAS and BLEX metrics, of 27 participants parsing 82 test sets from 57 languages.