Publication
TAGA 2004
Conference paper
Document segmentation with application for the book publishing industry
Abstract
We describe a method for segmenting a scanned page into text, image, line-art, and background regions. Each segment undergoes image processing and compression routines specific to its type, and the document is then reassembled to match the layout of the original page. This procedure improves the print quality of the document, keeping it as close as possible to the paper original, and eliminates artifacts that would otherwise appear when a scanned document is printed. Moreover, because each segment type is compressed with an algorithm suited to it, the resulting file is smaller, which improves performance in printers, servers, and networks.
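The segment-then-route idea in the abstract can be illustrated with a toy sketch. The abstract does not state how regions are classified or which compression schemes are used, so everything below is an assumption made for illustration: the tile-based approach, the thresholds, the function names `classify_tile` and `segment_page`, and the generic codec labels in `CODEC_FOR` are all hypothetical and are not the paper's method.

```python
# Toy illustration only: heuristics and names are hypothetical, not from the paper.
BACKGROUND, TEXT, LINE_ART, IMAGE = "background", "text", "line-art", "image"

# Assumed routing of each segment type to a generic class of compression.
CODEC_FOR = {
    BACKGROUND: "skip",
    TEXT: "bilevel-lossless",
    LINE_ART: "bilevel-lossless",
    IMAGE: "continuous-tone-lossy",
}


def classify_tile(tile):
    """Classify one tile (rows of 0-255 gray values) with simple heuristics."""
    pixels = [p for row in tile for p in row]
    n = len(pixels)
    # Nearly white everywhere: treat as background.
    if all(p >= 230 for p in pixels):
        return BACKGROUND
    # Many mid-gray pixels suggest a continuous-tone image.
    midtones = sum(1 for p in pixels if 64 <= p <= 192)
    if midtones / n > 0.3:
        return IMAGE
    # Otherwise the tile is roughly bilevel. Count dark/light transitions
    # along each row: dense transitions are taken as text, sparse as line-art.
    transitions = 0
    for row in tile:
        bits = [p < 128 for p in row]
        transitions += sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return TEXT if transitions / n > 0.2 else LINE_ART


def segment_page(page, tile_size):
    """Return {(top, left): label} for each tile of a grayscale page."""
    height, width = len(page), len(page[0])
    labels = {}
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = [row[left:left + tile_size]
                    for row in page[top:top + tile_size]]
            labels[(top, left)] = classify_tile(tile)
    return labels


if __name__ == "__main__":
    # A synthetic 8x16 page: a blank tile next to a flat mid-gray tile.
    page = [[255] * 8 + [128] * 8 for _ in range(8)]
    for origin, label in sorted(segment_page(page, 8).items()):
        print(origin, label, "->", CODEC_FOR[label])
```

A real system would presumably work on connected regions rather than fixed tiles and would need far more robust features. The sketch only shows the overall shape: label each region, then look up a type-specific treatment before the page is reassembled.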