Feature optimization for constituent parsing via neural networks
The performance of discriminative constituent parsing relies crucially on feature engineering, and effective features usually have to be carefully selected through a painful manual process. In this paper, we propose to automatically learn a set of effective features via neural networks. Specifically, we build a feedforward neural network model, which takes as input a few primitive units (words, POS tags and certain contextual tokens) from the local context, induces the feature representation in the hidden layer and makes parsing predictions in the output layer. The network simultaneously learns the feature representation and the prediction model parameters using a back propagation algorithm. By pre-Training the model on a large amount of automatically parsed data, and then fine-Tuning on the manually annotated Treebank data, our parser achieves the highest F1 score at 86.6% on Chinese Treebank 5.1, and a competitive F1 score at 90.7% on English Treebank. More importantly, our parser generalizes well on cross-domain test sets, where we significantly outperform Berkeley arser by 3.4 points on average for Chinese and 2.5 points for English.