Publication
CoNLL 2003
Conference paper
A Robust Risk Minimization based Named Entity Recognition System
Abstract
This paper describes a robust linear classification system for Named Entity Recognition. A similar system has been applied to the CoNLL text chunking shared task with state-of-the-art performance. By using different linguistic features, we can easily adapt this system to other token-based linguistic tagging problems. The main focus of the current paper is to investigate the impact of various local linguistic features for named entity recognition on the CoNLL-2003 (Tjong Kim Sang and De Meulder, 2003) shared task data. We show that the system performance can be enhanced significantly with some relatively simple token-based features that are available for many languages. Although more sophisticated linguistic features will also be helpful, they provide much less improvement than might be expected.
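As a rough illustration of what simple token-based local features can look like, the sketch below builds a feature dictionary for one token from its surface form and a small window of neighbours. The function name `token_features`, the feature names, the window size, and the affix length are all assumptions made for this example; the abstract does not list the paper's actual feature set, and this is not the authors' implementation.

```python
def token_features(tokens, i, window=2, affix_len=3):
    """Build simple local features for tokens[i].

    Hypothetical feature set for illustration only; not taken from the paper.
    """
    tok = tokens[i]
    feats = {
        "word.lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),   # first character is upper case
        "is_all_caps": tok.isupper(),          # e.g. acronyms
        "has_digit": any(c.isdigit() for c in tok),
        "prefix": tok[:affix_len].lower(),
        "suffix": tok[-affix_len:].lower(),
    }
    # Neighbouring tokens within the window; positions outside the
    # sentence are filled with a padding symbol.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        in_range = 0 <= j < len(tokens)
        feats[f"word[{off:+d}]"] = tokens[j].lower() if in_range else "<PAD>"
    return feats


if __name__ == "__main__":
    sentence = ["IBM", "opened", "a", "lab", "in", "Zurich"]
    for name, value in sorted(token_features(sentence, 5).items()):
        print(name, value)
```

Features of this kind rely only on tokenization and character-level properties, so they need no language-specific tools. A linear classifier could consume them after one-hot encoding.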