Luke S. Zettlemoyer and Michael Collins. 2005. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI). cit 214. [semparse, d=geo, d=jobs, spf, afz]
(***)
Zettlemoyer and Collins 2005: map sentences to typed lambda expressions:
which rivers run through states that border the state with the capital austin
(lambda $0:e
  (and:<t*,t>
    (river:<r,t> $0)
    (exists:<<e,t>,t>
      (lambda $1:e
        (and:<t*,t>
          (state:<s,t> $1)
          (next_to:<lo,<lo,t>>
            $1
            (the:<<e,t>,e>
              (lambda $2:e
                (and:<t*,t>
                  (state:<s,t> $2)
                  (capital2:<s,<c,t>> $2 austin_tx:c)))))
          (loc:<lo,<lo,t>> $0 $1))))))
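To poke at these typed terms programmatically, here is a minimal S-expression reader (my own sketch, not code from the paper or its later SPF toolkit) that turns the notation above into nested Python lists. The shorter demo term and its texas:s constant are invented for illustration; constants keep their :type suffixes as plain strings.

# Minimal S-expression reader for the typed lambda notation above
# (my own sketch for inspecting terms, not code from the paper).

def tokenize(s):
    # Pad parentheses with spaces, then split on whitespace; type
    # annotations like <lo,<lo,t>> contain no parens, so this is safe.
    return s.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    # Recursively build nested lists from the token stream.
    tok = tokens.pop(0)
    if tok == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(read(tokens))
        tokens.pop(0)  # drop the closing ")"
        return expr
    return tok  # atom: a constant like "river:<r,t>" or a variable like "$0"

term = "(lambda $0:e (and:<t*,t> (river:<r,t> $0) (loc:<lo,<lo,t>> $0 texas:s)))"
print(read(tokenize(term)))
# ['lambda', '$0:e', ['and:<t*,t>', ['river:<r,t>', '$0'],
#  ['loc:<lo,<lo,t>>', '$0', 'texas:s']]]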
In addition to the training set of (sentence, logical form) pairs like the one above, they use two additional hand-specified sources of information:
a. GENLEX rules: to generate candidates for the lexicon, they take the cross product of (all substrings of the sentence) × (all categories that can be produced from the lambda expression). The second part is done using 10 heuristic trigger rules that convert parts of the lambda expression into possible categories, e.g. N:(lambda (x) (major x)). (Sketched in the code right after this list.)
b. Initial lexicon: this includes two types of manually added entries: domain-specific, database-derived entities such as (Utah => NP:utah), and domain-independent entries such as “what” => (S/(S\NP))/N:(lambda f (lambda g (lambda x (and (f x) (g x))))). (A toy check of the “what” semantics appears after the notes below.)
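A rough sketch of the GENLEX cross product from (a), as I understand it. This is my own simplification: the paper's 10 trigger rules are replaced by two toy ones (arity-one predicate => N, constant => NP), and categories are plain strings rather than real CCG objects.

# Toy GENLEX: pair every substring of the sentence with every category
# that trigger rules propose from the logical form.  My own simplification
# of the paper's procedure; only two illustrative trigger rules.

def substrings(words):
    # All contiguous word sequences of the sentence.
    return [" ".join(words[i:j])
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)]

OPERATORS = {"lambda", "and", "exists", "the", "argmax", "argmin", "count"}

def trigger_categories(term, cats=None):
    # Walk the parsed logical form (nested lists as produced by the reader above).
    if cats is None:
        cats = set()
    if isinstance(term, list):
        head, args = term[0], term[1:]
        name = head.split(":")[0]
        if len(args) == 1 and name not in OPERATORS:
            # Toy version of the arity-one-predicate trigger: p -> N : \x.p(x)
            cats.add("N : (lambda (x) (%s x))" % name)
        for a in args:
            trigger_categories(a, cats)
    elif ":" in term and not term.startswith("$"):
        # Toy version of the constant trigger: c -> NP : c
        cats.add("NP : " + term.split(":")[0])
    return cats

def genlex(sentence, term):
    # Cross product of substrings x triggered categories = candidate lexical entries.
    cats = trigger_categories(term)
    return {(span, cat) for span in substrings(sentence.split()) for cat in cats}

sentence = "which rivers run through texas"
# Parsed form of (lambda $0:e (and:<t*,t> (river:<r,t> $0) (loc:<lo,<lo,t>> $0 texas:s)))
term = ["lambda", "$0:e",
        ["and:<t*,t>", ["river:<r,t>", "$0"], ["loc:<lo,<lo,t>>", "$0", "texas:s"]]]
for entry in sorted(genlex(sentence, term))[:4]:
    print(entry)
# ('rivers', 'N : (lambda (x) (river x))') etc.
# 15 substrings x 2 triggered categories = 30 candidate entries.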
Both of these manual inputs look like cheating. How sensitive are the results to them? What kinds of words are actually learnt in the end? Is it possible to learn entries like “what” at all? Why do we need syntactic categories; can’t we just have lambda expressions? Should look at how future work handled these questions.
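A toy check of the “what” entry's semantics from (b), with Python closures standing in for lambda terms and an invented three-state mini database. The derivation mirrors “what states border texas” (two forward applications); this is my own illustration, not the paper's parser.

# Toy check of the semantics of the hand-added "what" entry:
#   (S/(S\NP))/N : (lambda f (lambda g (lambda x (and (f x) (g x)))))
# Python closures stand in for lambda terms; the mini database is invented.

STATES  = {"texas", "utah", "idaho"}
BORDERS = {("utah", "texas"), ("idaho", "utah")}   # hypothetical facts

what         = lambda f: lambda g: lambda x: f(x) and g(x)   # (S/(S\NP))/N
states       = lambda x: x in STATES                         # N
border_texas = lambda x: (x, "texas") in BORDERS             # S\NP

# Two forward applications mirror the derivation of "what states border texas":
#   "what" + "states" => S/(S\NP), then + "border texas" => S
question = what(states)(border_texas)        # lambda x. state(x) and borders(x, texas)
print([x for x in STATES if question(x)])    # ['utah']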
Results (TM01 = Tang and Mooney 2001):
Geo880:  p=.9625 r=.7929 f=.8695  (TM01: p=.8992 r=.7940)
Jobs640: p=.9736 r=.7929 f=.8740  (TM01: p=.9325 r=.7984)
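Quick arithmetic check (mine) that the f numbers above are the harmonic mean F1 = 2pr/(p+r) of the listed precision and recall:

# Check that f = 2pr / (p + r) for the numbers above.
def f1(p, r):
    return 2 * p * r / (p + r)

print(round(f1(0.9625, 0.7929), 4))   # 0.8695 (Geo880)
print(round(f1(0.9736, 0.7929), 4))   # 0.874  (Jobs640, i.e. .8740)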