Publication
CIKM 2004
Conference paper

Towards smarter documents


Abstract

Document analysis research typically focuses on document image understanding or on classic problems in text classification, clustering, summarization and discovery. While these are important aspects of document management, in practice document lifecycles are often determined by the context of the business process they are relevant to. Document analysis techniques must therefore recognize and leverage the contextual information provided by a supporting schema and business process. This paper presents an intelligent document management framework with relevant document analysis, metadata extraction, and business process association algorithms and methodology. The architecture supporting this framework integrates a runtime environment with an authoring environment by combining relational data modeling tools with document classification techniques. The runtime environment accepts incoming documents, classifies them, extracts metadata and executes customized business logic. The authoring environment supports the association of a class of documents with a relational document schema, identification of attribute values that must be extracted automatically, generation of relevant business logic, and deployment of authoring artifacts into the runtime architecture. We demonstrate the use of this framework with representative real-world document transformative applications. Copyright 2004 ACM.
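
The runtime flow described in the abstract (accept a document, classify it, extract metadata, execute business logic) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's implementation: the names `DocumentClass` and `Runtime`, the keyword-count classifier standing in for the paper's classification techniques, the regex-based extraction, and the sample invoice text are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class DocumentClass:
    """Hypothetical authoring artifact: a document class tied to a schema."""
    name: str
    keywords: List[str]          # cues used by the toy classifier below
    attributes: Dict[str, str]   # schema attribute -> regex with one capture group
    logic: Callable[[Dict[str, Optional[str]]], str]  # business logic for the class


@dataclass
class Runtime:
    """Hypothetical runtime: classify, extract metadata, run business logic."""
    classes: List[DocumentClass] = field(default_factory=list)

    def deploy(self, doc_class: DocumentClass) -> None:
        # Deployment of an authoring artifact into the runtime.
        self.classes.append(doc_class)

    def classify(self, text: str) -> Optional[DocumentClass]:
        # Toy stand-in for a real classifier: count keyword occurrences.
        lowered = text.lower()
        scored = [(sum(lowered.count(k) for k in c.keywords), c) for c in self.classes]
        best_score, best = max(scored, key=lambda pair: pair[0], default=(0, None))
        return best if best_score > 0 else None

    def extract(self, doc_class: DocumentClass, text: str) -> Dict[str, Optional[str]]:
        # Fill each schema attribute from the first regex match, or None.
        record: Dict[str, Optional[str]] = {}
        for attr, pattern in doc_class.attributes.items():
            match = re.search(pattern, text, flags=re.IGNORECASE)
            record[attr] = match.group(1) if match else None
        return record

    def process(self, text: str) -> str:
        doc_class = self.classify(text)
        if doc_class is None:
            return "unclassified"
        return doc_class.logic(self.extract(doc_class, text))


# Authoring side (assumed example): an invoice class with two attributes.
invoice = DocumentClass(
    name="invoice",
    keywords=["invoice", "amount due"],
    attributes={
        "invoice_no": r"invoice\s+#(\d+)",
        "total": r"amount due:\s*\$([\d.]+)",
    },
    logic=lambda r: f"queue payment of {r['total']} for invoice {r['invoice_no']}",
)

runtime = Runtime()
runtime.deploy(invoice)
result = runtime.process("Invoice #1042\nAmount due: $250.00")
```

In this sketch the authoring step produces one self-contained artifact per document class, so the runtime needs no class-specific code; a real system would replace the keyword classifier and regexes with trained models and a relational schema.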
