The bare necessities: Increasing lexical coverage for multi-word domain terms with less lexical data
We argue that many multi-word domain terms are not (and should not be regarded as) strictly atomic, especially from a parser’s point of view. We introduce the notion of Lexical Kernel Units (LKUs) and discuss some of their essential properties. LKUs are building blocks for lexicalizations of domain concepts and, as such, can be used to derive an open-ended set of domain terms compositionally. Benefits of such an approach include a reduction in the size of the domain lexicon, improved coverage of domain terms, and improved parsing accuracy.