Verifying the UNIPEN devset

Louis Vuurpijl; Ralph Niels; Merijn Van Erp; Lambert Schomaker; Eugene Ratzlaff

doi:10.1109/IWFHR.2004.109

IWFHR 2004

Conference paper

01 Dec 2004


Abstract

This paper describes a semi-automated procedure for verifying a large human-labeled data set of online handwriting. Several classifiers trained on the UNIPEN "trainset" are employed to detect anomalies in the labels of the UNIPEN "devset". Multiple classifiers with different feature sets are used to increase the robustness of the automated procedure and to keep the number of false accepts to a minimum. The rejected samples are manually categorized into four classes: (i) recoverable segmentation errors, (ii) incorrect (recoverable) labels, (iii) well-segmented but ambiguous cases and (iv) unrecoverable segments that should be removed. As a result of the verification procedure, a well-labeled data set is currently being generated, which will be made available to the handwriting recognition community. © 2004 IEEE.
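The automated part of the procedure can be illustrated with a minimal sketch. The abstract does not state the exact acceptance rule, so the sketch assumes the strictest one: a sample is accepted only when every classifier agrees with the human label, and everything else is rejected for manual review. The names `verify_labels`, `by_width`, and `by_height`, the toy features, and the thresholds are all hypothetical and are not taken from the paper.

```python
from enum import Enum
from typing import Callable, Hashable, List, Sequence, Tuple

Features = Sequence[float]
Classifier = Callable[[Features], Hashable]
LabeledSample = Tuple[Features, Hashable]


class RejectCategory(Enum):
    """The four manual categories for rejected samples named in the abstract."""
    SEGMENTATION_ERROR = "recoverable segmentation error"
    INCORRECT_LABEL = "incorrect (recoverable) label"
    AMBIGUOUS = "well-segmented but ambiguous"
    UNRECOVERABLE = "unrecoverable segment, to be removed"


def verify_labels(
    samples: Sequence[LabeledSample],
    classifiers: Sequence[Classifier],
) -> Tuple[List[LabeledSample], List[LabeledSample]]:
    """Split samples into (accepted, rejected).

    Assumed rule: accept only if all classifiers reproduce the human label.
    Requiring unanimity keeps false accepts low at the cost of more rejects.
    """
    accepted: List[LabeledSample] = []
    rejected: List[LabeledSample] = []
    for features, label in samples:
        if all(clf(features) == label for clf in classifiers):
            accepted.append((features, label))
        else:
            rejected.append((features, label))
    return accepted, rejected


# Hypothetical stand-ins for classifiers trained on different feature sets.
def by_width(features: Features) -> str:
    return "l" if features[0] < 0.3 else "o"


def by_height(features: Features) -> str:
    return "l" if features[1] > 0.7 else "o"


# Toy "devset": (width, height) features paired with a human label.
devset = [
    ((0.1, 0.9), "l"),  # both classifiers agree with the label
    ((0.8, 0.4), "o"),  # both classifiers agree with the label
    ((0.2, 0.5), "l"),  # classifiers disagree with each other
    ((0.9, 0.3), "l"),  # both classifiers contradict the label
]

accepted, rejected = verify_labels(devset, [by_width, by_height])
```

In this toy run the first two samples are accepted and the last two are rejected. A human reviewer would then assign each rejected sample one of the `RejectCategory` values.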

Related

Conference paper

Use of chatroom abbreviations and shorthand symbols in pen computing

William B. Huber, Sung-Hyuk Cha, et al.

IWFHR 2004