Publication
TREC 2005
Conference paper

Thresholding strategies for text classifiers: TREC-2005 biomedical triage task experiments

Abstract

We participated in the triage task for biomedical documents in the TREC Genomics Track. This paper describes the methods we developed for the four triage subtasks. Logistic regression and support vector machine classifiers were first trained to produce ranked lists of the test documents. A subset of the test documents was then labeled positive by selecting the top-k documents from each ranked list. Choosing a good value of k requires a sound thresholding strategy. We first describe two thresholding strategies, based on i) logistic regression and ii) support vector machines. We then describe a third method that combines the outputs of logistic regression and the support vector machine under a joint thresholding strategy.
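The rank-then-threshold idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how k was chosen or how the two classifiers were combined, so everything specific here is an assumption made for illustration: the cutoff is picked by maximizing F1 on held-out labeled data, and the hypothetical helper `combine_scores` merges the two rankings by averaging rank positions. The scores and labels are made-up toy values, not results from the paper.

```python
def rank_indices(scores):
    """Return document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def top_k_positive(scores, k):
    """Label the k highest-scoring documents as positive."""
    return set(rank_indices(scores)[:k])


def f1_at_k(scores, labels, k):
    """F1 obtained when the top-k documents are predicted positive."""
    predicted = top_k_positive(scores, k)
    true_pos = sum(1 for i in predicted if labels[i] == 1)
    if true_pos == 0:
        return 0.0
    precision = true_pos / k
    recall = true_pos / sum(labels)
    return 2 * precision * recall / (precision + recall)


def choose_k(scores, labels):
    """Assumed criterion: pick the cutoff k that maximizes F1 on held-out data."""
    return max(range(1, len(scores) + 1),
               key=lambda k: f1_at_k(scores, labels, k))


def combine_scores(scores_a, scores_b):
    """Hypothetical joint score: negated mean rank position across both lists."""
    rank_a = {doc: pos for pos, doc in enumerate(rank_indices(scores_a))}
    rank_b = {doc: pos for pos, doc in enumerate(rank_indices(scores_b))}
    return [-(rank_a[i] + rank_b[i]) / 2 for i in range(len(scores_a))]


# Toy held-out data: classifier scores for five documents and their true labels.
lr_scores = [0.9, 0.2, 0.7, 0.4, 0.1]     # e.g. logistic regression probabilities
svm_scores = [1.5, -0.3, 0.2, 0.8, -1.0]  # e.g. SVM decision values
labels = [1, 0, 1, 0, 0]

k = choose_k(lr_scores, labels)
positives = top_k_positive(lr_scores, k)
joint_scores = combine_scores(lr_scores, svm_scores)
```

In practice the cutoff would be tuned on training or validation documents and then applied to the ranked test list; any other utility measure could replace F1 in `choose_k` without changing the rest of the sketch.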
