Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater and Mark Steedman. 2011.
Lexical generalization in CCG grammar induction for semantic parsing.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp 1512--1523.
Association for Computational Linguistics. cit 31. [semparse, d=geo, d=atis, spf]
(***)
Builds on the unification-based approach of Kwiatkowski 2010.
The key observation is that groups of words show the same syntactic/semantic tag variation.
So they learn the variation once for the whole group, which is more robust to data sparsity.
I.e., they have discovered that word classes exist :)
This helps generalize the language-independent unification approach to unedited sentences like those in ATIS.
Mentions Clarke10, Liang11, Goldwasser11 as going from sentences to answers without LF.
Mentions Branavan10, Vogel10, Liang09, Poon09, 10 as learning from interactions.
Results: (ubl: Kwiatkowski10, fubl: Kwiatkowski11)
atis-exact-f1: zc07:.852 ubl:.717 fubl:.828
geo880-f1: zc05:.870 zc07:.888 ubl:.882 fubl:.886
geo250-en: wasp:.829 ubl:.826 fubl:.837
geo250-sp: wasp:.858 ubl:.824 fubl:.857
geo250-jp: wasp:.858 ubl:.831 fubl:.835
geo250-tr: wasp:.781 ubl:.746 fubl:.731
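
The word-class observation in the notes above can be illustrated with a minimal sketch: each word contributes only its own constant (a "lexeme"), and a small set of templates shared by the whole class supplies the category and logical-form shape. The lexemes, templates, and the `expand` helper below are hypothetical illustrations, not taken from the paper or its code.

```python
from itertools import product

# Hypothetical lexemes: a word paired with the logical constant it introduces.
LEXEMES = [("flight", "flight"), ("fare", "fare")]

# Hypothetical templates shared by the whole word class. Each maps a
# constant to a (CCG category, logical form) pair.
TEMPLATES = [
    lambda c: ("N", f"lambda x. {c}(x)"),
    lambda c: ("N/N", f"lambda f. lambda x. {c}(x) & f(x)"),
]


def expand(lexemes, templates):
    """Pair every lexeme with every template to get full lexical items."""
    return [
        (word, *template(const))
        for (word, const), template in product(lexemes, templates)
    ]


lexicon = expand(LEXEMES, TEMPLATES)
for item in lexicon:
    print(item)
```

A template seen with one word of the class is then available to every other word in it, which is the sense in which the variation is learned for the group rather than per word.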
Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater and Mark Steedman. 2010.
Inducing probabilistic CCG grammars from logical form with higher-order unification.
In Proceedings of the 2010 conference on empirical methods in natural language processing, pp 1223--1233.
Association for Computational Linguistics. cit 82. [spf, semparse, d=geo]
(****)
Sentence to logical form mapping using CCG and unification (UBL) instead of GenLex.
Geo dataset, four languages, two meaning representations (funql, lambda).
Starts with a single lexical item for each sentence, mapping the whole sentence to its LF.
Introduces a vertical bar | into CCG, which can match either / or \. (ZC07 similar?)
To do: understand the SGD gradient, possibly by reading CC07.
Starting with a single lexical item and trying splits looks much less ad hoc than ZC05,07 with GenLex and an initial lexicon.
Only the proper noun NPs (e.g. Texas) are in the initial lexicon.
The splitting constraints in Sec 4.1 are interesting; can they be learned from data?
The split-merge process seems a bit ad hoc; a more principled Bayesian approach may be possible.
Good related work discussion in Sec 6.
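
A minimal sketch of how a direction-neutral slash could behave under function application, going only by the note above that | can match / or \. The `Functor` class and `apply` function are hypothetical and are not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Functor:
    """A complex category: result, slash, argument (hypothetical encoding)."""
    result: "Category"
    slash: str  # "/" (forward), "\\" (backward), or "|" (either direction)
    arg: "Category"


Category = Union[str, Functor]


def apply(left: Category, right: Category) -> Optional[Category]:
    """Combine two adjacent categories by function application, if possible."""
    # Forward application: X/Y Y => X. The "|" slash is also allowed here.
    if isinstance(left, Functor) and left.slash in ("/", "|") and left.arg == right:
        return left.result
    # Backward application: Y X\Y => X. The "|" slash is also allowed here.
    if isinstance(right, Functor) and right.slash in ("\\", "|") and right.arg == left:
        return right.result
    return None


either = Functor("S", "|", "NP")
print(apply(either, "NP"))                    # argument on the right
print(apply("NP", either))                    # argument on the left
print(apply("NP", Functor("S", "/", "NP")))   # a forward slash cannot look left
```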
UBL geo880: p=.941 r=.850 f=.893
UBL-s (2pass): p=.885 r=.879 f=.882
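
As a quick sanity check, the f values above are consistent with the harmonic mean of the listed p and r. The `f1` helper is just the standard F1 formula, not code from the paper.

```python
def f1(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


print(round(f1(0.941, 0.850), 3))  # UBL geo880
print(round(f1(0.885, 0.879), 3))  # UBL-s (2pass)
```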