Xing Shi, Kevin Knight, and Deniz Yuret. 2016. Why Neural Translations are the Right Length. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 2278–2282, Austin, Texas, November. Association for Computational Linguistics. [ai.ku]
Title: Why Neural Translations are the Right Length
Authors: Xing Shi, Kevin Knight and Deniz Yuret
Instructions
The author response period has begun. The reviews for your submission are displayed on this page. If you want to respond to the points raised in the reviews, you may do so in the box provided below.
The response should be entered by 17 July 2016 (11:59pm Pacific Daylight Saving Time, UTC-7).
The response can be edited multiple times throughout the author response period.
Please note: you are not obligated to respond to the reviews.
Review #1
Appropriateness: 5
Clarity: 5
Originality: 4
Soundness / Correctness: 5
Meaningful Comparison: 5
Substance: 4
Impact of Ideas / Results: 3
Impact of Accompanying Software: 1
Impact of Accompanying Dataset / Resource: 1
Recommendation: 5
Reviewer Confidence: 4
Comments
This paper convincingly explains why NMT translations are the right length: the authors show the mechanism explicitly (specific components of the hidden vectors keep track of length during encoding and decoding) with a very clear toy example, and demonstrate the same tendency in a real-world task. The paper is admirably clear and very accessible.
Please keep in mind that some people still print papers out, and stick to a grayscale-legible coloring scheme (blue and red are indistinguishable after the grayscale dimensionality reduction has been applied).
s1p1: "covert that vector in a target sentence." -> "convert that vector into a target sentence."
Review #2
Appropriateness: 5
Clarity: 4
Originality: 4
Soundness / Correctness: 4
Meaningful Comparison: 4
Substance: 4
Impact of Ideas / Results: 3
Impact of Accompanying Software: 1
Impact of Accompanying Dataset / Resource: 1
Recommendation: 4
Reviewer Confidence: 3
Comments
This paper investigates the question of how neural MT models manage to produce output of the right length. Starting with a small toy problem and proceeding to an actual neural MT system, the paper shows that specific cells (or groups of cells) are "dedicated" to encoding sentence length.
The paper is refreshingly different from many other NMT papers I've seen lately, in that it attempts to understand what's going on within the neural model, thus addressing a point of criticism that is occasionally brought forward against neural approaches to MT: that they are black boxes and no-one seems to care about what's going on inside.
The paper is well written and generally easy to understand. What I'm missing most is a good motivation for addressing this question. What do we gain from knowing that neural models explicitly encode length? Also, is this behavior consistent across neural models? What happens if we start model training with different random initializations?
A few minor quibbles:
- The images in Figure 2 are too small. You've got plenty of room left (6 pages max.!) to provide bigger images.
- Figures 4 and 5 should be tables, not figures.
Review #3
Appropriateness: 5
Clarity: 5
Originality: 3
Soundness / Correctness: 5
Meaningful Comparison: 4
Substance: 4
Impact of Ideas / Results: 4
Impact of Accompanying Software: 1
Impact of Accompanying Dataset / Resource: 1
Recommendation: 4
Reviewer Confidence: 4
Comments
One of the issues with NMT is the apparent opacity of the model. It is hard to know what is going on inside the black box. The authors start to peel back the curtain here by investigating the question of how NMT models output target sentences of the right length. By looking at both a toy auto-encoder with 4 units and a regular NMT model, they provide interesting insight, showing that the model devotes one of a small handful of units specifically to the token-counting task.
The paper is well written & easy to follow and the insights will be of interest to the EMNLP audience.
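The reviews converge on the same finding: a small number of hidden units act as token counters. As a rough illustration of how one might locate such units, here is a minimal sketch (not the paper's released code; the function name and toy data below are made up for illustration). It scores every unit of the encoder's hidden state by the Pearson correlation between its activation and the timestep, so a dedicated counting unit stands out:

    import numpy as np

    # Hypothetical probe: given the encoder's hidden states, score each
    # unit by the Pearson correlation between its activation and the
    # timestep index. A dedicated "counting" unit should score near +/-1.

    def length_tracking_scores(hidden_states):
        """hidden_states: list of (T_i, d) arrays, one per sentence.
        Returns a (d,) array of per-unit correlations with the timestep."""
        acts = np.concatenate(hidden_states)              # (sum T_i, d)
        steps = np.concatenate([np.arange(h.shape[0]) for h in hidden_states])
        steps = (steps - steps.mean()) / steps.std()
        acts = (acts - acts.mean(axis=0)) / acts.std(axis=0)
        return acts.T @ steps / len(steps)                # Pearson r per unit

    # Toy check: ten fake "sentences" with a 4-unit state where unit 2
    # increments once per token (a perfect counter) and the rest are noise.
    rng = np.random.default_rng(0)
    fake = []
    for T in rng.integers(5, 15, size=10):
        h = rng.normal(size=(T, 4))
        h[:, 2] = np.arange(T)  # the counting unit
        fake.append(h)
    print(np.round(length_tracking_scores(fake), 2))  # unit 2 scores ~1.0

On a real NMT model one would expect noisier scores, with length possibly spread across a few units jointly rather than concentrated in exactly one, consistent with the "small handful of units" the reviewers describe.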