Publication
ICTIR 2017
Conference paper

Enhanced probabilistic classify and count methods for multi-label text quantification

Abstract

In this work we address the problem of Multi-Label Text Quantification. Given a collection of documents, each pre-classified with one or more labels by some multi-label classifier, our goal is to estimate the cardinality of each actual label set as accurately as possible. We present two enhanced Probabilistic Classify and Count (PCC) methods that improve quantification accuracy by employing an additional supervised learning phase. Using a real-world multi-label document dataset, we report on an experimental evaluation that compares the label counts estimated by our solution (and by several alternatives) to the actual label counts derived from labels assigned by human experts. Our results confirm that our solution can significantly improve quantification accuracy.
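
For orientation, the following is a minimal sketch of the two standard baselines that the name PCC refers to, applied per label: Classify and Count (CC), which counts thresholded predictions, and Probabilistic Classify and Count (PCC), which sums the classifier's posterior probabilities. It is not the enhanced method proposed in the paper, whose additional supervised learning phase is not described in this abstract. The function names, the 0.5 threshold, and the posterior matrix are illustrative assumptions.

```python
import numpy as np


def classify_and_count(probs, threshold=0.5):
    """CC baseline: per label, count documents whose posterior meets the threshold.

    probs is an (n_documents, n_labels) array of per-label posteriors.
    The 0.5 threshold is an assumption for this sketch.
    """
    return (probs >= threshold).sum(axis=0)


def probabilistic_classify_and_count(probs):
    """PCC baseline: per label, sum the posteriors to get an expected count."""
    return probs.sum(axis=0)


# Hypothetical posteriors for 4 documents and 3 labels (made-up values).
probs = np.array([
    [0.9, 0.2, 0.6],
    [0.8, 0.4, 0.1],
    [0.3, 0.7, 0.4],
    [0.6, 0.4, 0.4],
])

cc_counts = classify_and_count(probs)                  # hard counts per label
pcc_counts = probabilistic_classify_and_count(probs)   # expected counts per label
print(cc_counts, pcc_counts)
```

In this toy example the two estimates differ because PCC retains the probability mass of documents that fall below the threshold, which is the usual motivation for preferring it when the classifier's posteriors are reasonably calibrated.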

Date

01 Oct 2017

Authors
