Publication
ACL 2004
Conference paper
Exploiting unannotated corpora for tagging and chunking
Abstract
We present a method that exploits unannotated corpora to compensate for the paucity of annotated training data on the chunking and tagging tasks. It collects and compresses feature frequencies from a large unannotated corpus for use by linear classifiers. Experiments on two tasks show that it consistently produces significant performance improvements.