Pre-processing methods for handwritten Arabic documents

Faisal Farooq; Venu Govindaraju; Michael Perrone

doi:10.1109/ICDAR.2005.191

ICDAR 2005

Conference paper

01 Dec 2005

Abstract

Preprocessing is essential for improving both the readability and the automatic recognition of handwritten document images. Beyond the conventional steps of noise removal and filtering, preprocessing includes text normalization such as baseline correction, slant normalization, and skew correction. These steps make the feature extraction process more reliable and effective. Arabic handwriting recognition has recently received some attention from the research community. Because of the unique nature of the script, conventional methods do not prove effective. In this work, we describe an orientation-independent technique for baseline detection of Arabic words. In the rest of the paper, we describe our techniques for slant normalization, slope correction, and line and word separation in handwritten Arabic documents. We show how the baseline can be exploited for slope and skew correction before proceeding to line and word separation. © 2005 IEEE.
