Onur Kuru, Ozan Arkan Can and Deniz Yuret. 2016. CharNER: Character-Level Named Entity Recognition. In COLING, December. [ai.ku]
COLING 2016 review:
Title: CharNER: Character-Level Named Entity Recognition
Authors: Onur Kuru, Ozan Arkan Can and Deniz Yuret
============================================================================
REVIEWER #1
============================================================================
---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
Relevance: 5
Originality: 3
Technical correctness / soundness: 3
Readability and clarity: 4
Meaningful comparison: 4
Substance: 3
Impact of ideas: 3
Impact of resources: 3
Overall recommendation: 4
Reviewer Confidence: 4
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
The paper proposes character-based NER for languages with word segmentation.
Character-based tagging has been proposed previously; their contribution is to
apply LSTM models to the character-based tagging setting.
However, the results are only fair for the targeted languages.
In my opinion, the method may be promising for languages without word
segmentation, such as Chinese and Japanese, since word segmentation errors
affect NER scores.
============================================================================
REVIEWER #2
============================================================================
---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
Relevance: 5
Originality: 3
Technical correctness / soundness: 4
Readability and clarity: 4
Meaningful comparison: 4
Substance: 3
Impact of ideas: 3
Impact of resources: 1
Overall recommendation: 4
Reviewer Confidence: 5
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
This paper presents a character-based model for named-entity recognition based
on bidirectional LSTMs. There is very recent research that tries to do
something similar, at NAACL-16 (Lample et al.) and ACL-16 (Ma and Hovy, for
example), with the exact same motivation: remove external resources, such as
gazetteers, and use a character-based approach to achieve high results. This
should not invalidate the paper, though.
The main difference from the previous research mentioned above is that this
model treats a sentence as a sequence of characters and outputs a tag
distribution for each character. They then use transition matrices that only
allow tag sequences consistent within a word.
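The mechanism the review describes can be sketched as a constrained Viterbi decode over per-character tag scores. This is a hypothetical illustration, not the authors' code: the tag set, the toy scores, and the boundary rule (tags may only change next to a space) are all invented for the example.

```python
import numpy as np

# Hypothetical tag set for illustration.
TAGS = ["O", "PER", "LOC"]

def transition_mask(prev_char, cur_char):
    """Allowed-transition matrix for the step into cur_char: tags may
    only change at a word boundary (next to a space)."""
    n = len(TAGS)
    if prev_char == " " or cur_char == " ":
        return np.ones((n, n))   # word boundary: any transition allowed
    return np.eye(n)             # inside a word: tag must stay the same

def viterbi(chars, char_scores):
    """Constrained Viterbi over per-character tag scores (log-domain)."""
    dp = char_scores[0].copy()   # best score of a path ending in each tag
    back = []
    for t in range(1, len(chars)):
        mask = transition_mask(chars[t - 1], chars[t])
        # cand[i, j]: best path ending in tag i, stepping to tag j;
        # disallowed transitions get -inf.
        cand = dp[:, None] + np.where(mask > 0, 0.0, -np.inf)
        back.append(cand.argmax(axis=0))
        dp = cand.max(axis=0) + char_scores[t]
    # Backtrack from the best final tag.
    path = [int(dp.argmax())]
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    path.reverse()
    return [TAGS[i] for i in path]

sent = "John works"
# Toy scores: characters of "John" prefer PER, everything else prefers O.
scores = np.full((len(sent), len(TAGS)), -1.0)
for t in range(len(sent)):
    scores[t, 1 if t < 4 else 0] = 1.0
tags = viterbi(sent, scores)
print(tags)   # ['PER', 'PER', 'PER', 'PER', 'O', 'O', 'O', 'O', 'O', 'O']
```

Even though the space character's scores favor O, the constraint forces every character of "John" to share the PER tag, which is the word-consistency property described above.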
The results are nice; however, they are not the best overall. They are very
good compared to systems that do not use external resources, including word
embeddings; however, it should be a requirement to report the results of the
other systems without external resources (see Lample et al., for example).
In Table 6, you present results for Ma and Hovy and Lample et al. and include
them in the "External" row; as far as I know, they only use word embeddings (if
they do)... I think that you should incorporate "word embeddings" into the
caption, otherwise readers might think that they use gazetteers.
Some missing references: two EMNLP-15 papers that presented interesting results
for tagging, parsing and language modeling by using character-based embeddings.
- Wang Ling, Chris Dyer, Alan W Black, Isabel Trancoso, Ramon Fermandez, Silvio
Amir, Luis Marujo and Tiago Luis. Finding Function in Form: Compositional
Character Models for Open Vocabulary Word Representation.
- Miguel Ballesteros, Chris Dyer and Noah A. Smith. Improved Transition-based
Parsing by Modeling Characters instead of Words with LSTMs.
This paper is also worth mentioning, since it also produces an entire character
sequence for sentences: Bhuwan Dhingra, Zhong Zhou, Dylan Fitzpatrick, Michael
Muehl and William Cohen. Tweet2Vec: Character-Based Distributed Representations
for Social Media.
Minor comment:
Lample et al. do more than LSTM-CRF; they also presented a shift-reduce
algorithm that exploits character-based embeddings.
============================================================================
REVIEWER #3
============================================================================
---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
Relevance: 5
Originality: 3
Technical correctness / soundness: 4
Readability and clarity: 5
Meaningful comparison: 4
Substance: 4
Impact of ideas: 4
Impact of resources: 1
Overall recommendation: 4
Reviewer Confidence: 3
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
Very interesting work.
It clearly shows that a deep bidirectional LSTM architecture combined with a
Viterbi decoder effectively finds language specific features for NER.
Applying the algorithm to languages written without space characters, e.g.,
Chinese or Japanese, may be interesting.
=================================
EMNLP 2016 review:
Title: CharNER: Character-Level Named Entity Recognition
Authors: Onur Kuru, Ozan Arkan Can and Deniz Yuret
Instructions
The author response period has begun. The reviews for your submission are displayed on this page. If you want to respond to the points raised in the reviews, you may do so in the box provided below.
The response should be entered by 17 July, 2016 (11:59pm Pacific Daylight Savings Time, UTC -7h).
The response can be edited multiple times during the author response period.
Please note: you are not obligated to respond to the reviews.
Review #1
Appropriateness: 5
Clarity: 4
Originality: 3
Soundness / Correctness: 4
Meaningful Comparison: 5
Substance: 4
Impact of Ideas / Results: 4
Impact of Accompanying Software: 3
Impact of Accompanying Dataset / Resource: 1
Recommendation: 3
Reviewer Confidence: 5
Comments
This paper presents a named entity recognizer in which the entire sentence is encoded as a sequence of characters, and a bidirectional LSTM is used to make predictions. This is unlike previous (and recent) approaches, such as Lample et al. 2016, which presented character-based representations of words and then an LSTM/bidirectional LSTM/stack-LSTM on top of that. This model is similar to the tweet2vec model recently accepted at ACL 2016, even though it tries to solve a different task. They examine a sentence as a sequence of characters and output a tag distribution for each character. This model, like Lample et al., has the potential to be language independent, and they apply it cross-lingually. The motivation and goals are also similar to Lample et al.: remove external features such as gazetteers and still achieve high results.
Figure 3 does a great job summarizing the entire paper.
In order to avoid outputs like "J o h n w o r k s" tagged "P O O O G G G G O", they use a decoder as in Wang et al. 2015 that applies a transition matrix; at the end they output the entire tag sequence.
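The failure mode the decoder rules out can be made concrete with a small consistency check. This is a toy illustration with invented single-letter tags; it only tests the constraint the transition matrix enforces (a tag may change only next to a space), not the decoder itself.

```python
# Characters of the sentence, including the word-separating space.
chars = list("John works")

# Hypothetical greedy per-character argmax tags: "John" mixes P and O,
# and "works" mixes G and O, so neither word is tagged consistently.
greedy = ["P", "O", "O", "O", "O", "G", "G", "G", "G", "O"]

def word_consistent(chars, tags):
    """Check the constraint the transition matrix enforces: the tag may
    only change at a position adjacent to a space character."""
    for t in range(1, len(chars)):
        if chars[t - 1] != " " and chars[t] != " " and tags[t] != tags[t - 1]:
            return False
    return True

print(word_consistent(chars, greedy))                                      # False
print(word_consistent(chars, ["P"] * 4 + ["O"] + ["G"] * 5))               # True
```

A sequence that the decoder could actually output must make this check pass, which is why the transition matrix disallows tag changes inside a word.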
They make a good comparison with related work, but since some of the models are freely available, I'd expect the authors to run them on the languages without public results (such as Arabic or Turkish).
In Table 5 you should definitely differentiate between systems that use gazetteers and neural models that use pretrained word embeddings. They are not the same thing, and the way it is presented might confuse the reader.
This is an interesting paper, but it lacks a bit of novelty given all the previous work that already demonstrated the usefulness of characters and sequential models for NER.
Minor comments: Missing ref (?) in related work.