Blitz: A Preprocessor for Detecting Context-Independent Linguistic Structures
Boris Katz, Deniz Yuret, Jimmy Lin, Sue Felshin, Rebecca Schulman,
and Adnan Ilik (1998)
Blitz: A Preprocessor for Detecting Context-Independent Linguistic
Structures. In Proceedings of the 5th Pacific Rim International
Conference on Artificial Intelligence.
Abstract:
The flow of natural language is often broken by constructions which
are difficult to analyze with conventional linguistic parsers. To
handle these constructions, which include numbers, dates, addresses,
etc., and, to a lesser extent, proper nouns, natural language systems
typically implement specialized new rules. This leads to a level of
complexity which renders development and maintenance
difficult. Analyzing and tokenizing these constructions with an
independent preprocessor can alleviate the burden on already taxed
systems. Because these constructions have highly regular forms, and
can be largely understood in the absence of context, it is possible to
shift the burden of processing away from the primary parser, and onto
a simpler, faster, non-linguistic preprocessor. This paper describes
Blitz, a hybrid database- and heuristic-based natural language
preprocessor, which has been integrated into the START Natural
Language System in order to demonstrate how non-linguistic
preprocessing can improve parsing. As a result, START's ability to
analyze real-world sentences has improved considerably.
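
Illustration (not from the paper): the abstract does not describe how Blitz is implemented, so the following is only a minimal sketch of the general idea of a hybrid database- and heuristic-based preprocessor. The lookup table `KNOWN_NAMES`, the regular expressions in `PATTERNS`, the placeholder format, and the function name `preprocess` are all assumptions made for this example. The sketch finds context-independent constructions (proper nouns by table lookup, dates and numbers by pattern) and replaces each with a single typed token, so that a downstream parser would see one unit in place of a span it would otherwise need special rules for.

```python
import re

# Hypothetical lookup table standing in for the database component.
KNOWN_NAMES = {"New York": "CITY", "Bill Clinton": "PERSON"}

# Hypothetical heuristic patterns standing in for the heuristic component.
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
PATTERNS = [
    ("DATE", re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")),
    ("NUMBER", re.compile(r"\b\d+(?:,\d{3})*(?:\.\d+)?\b")),
]


def preprocess(sentence):
    """Return (rewritten sentence, [(placeholder, type, original text), ...])."""
    spans = []
    # Database step: exact lookup of known multi-word names.
    for name, kind in KNOWN_NAMES.items():
        for m in re.finditer(re.escape(name), sentence):
            spans.append((m.start(), m.end(), kind))
    # Heuristic step: pattern-based detection of regular forms.
    for kind, pattern in PATTERNS:
        for m in pattern.finditer(sentence):
            spans.append((m.start(), m.end(), kind))

    # Resolve overlaps: leftmost span wins, longer span wins on ties,
    # so the "3" and "1997" inside a date are not also tagged as numbers.
    spans.sort(key=lambda s: (s[0], -(s[1] - s[0])))
    chosen, last_end = [], 0
    for start, end, kind in spans:
        if start >= last_end:
            chosen.append((start, end, kind))
            last_end = end

    # Replace each chosen span with a single typed placeholder token.
    pieces, tokens, pos = [], [], 0
    for i, (start, end, kind) in enumerate(chosen):
        placeholder = f"{kind}_{i}"
        pieces.append(sentence[pos:start])
        pieces.append(placeholder)
        tokens.append((placeholder, kind, sentence[start:end]))
        pos = end
    pieces.append(sentence[pos:])
    return "".join(pieces), tokens


if __name__ == "__main__":
    text, tokens = preprocess(
        "Bill Clinton visited New York on March 3, 1997 with 1,200 people."
    )
    print(text)
    for token in tokens:
        print(token)
```

The returned token list keeps the original text of every replaced span, so a later stage can restore or interpret it after parsing. A real system would need far larger tables and many more patterns than this toy example.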