Publication
ICEIS 2007
Conference paper

Enterprise information search systems for heterogeneous content repositories

Abstract

In larger enterprises, business documents are typically stored in disparate, autonomous content repositories in a variety of formats. Efficient search and retrieval mechanisms are needed to cope with the heterogeneity and complexity of this environment. This paper presents a general architecture and two industrial implementations of a service-based information system for searching Lotus Notes databases and data sources with Web service interfaces. The first implementation is based on a federated database system that maps the various schemas of the sources into a common interface and aggregates information from their native locations. This implementation offers scalability and access to real-time information. The second is based on a single-index, enterprise-scale search engine that crawls, parses and indexes document contents from the sources. This implementation can rank documents by relevance and eliminate duplicates from search results. The relative merits and limitations of both implementations are presented.
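The contrast between the two approaches can be illustrated with a minimal sketch. It is not the paper's implementation: the in-memory sample data, the adapter functions, the `Doc` schema and the `CentralIndex` class are all hypothetical, and the term-frequency scoring is a simple stand-in for whatever ranking a real search engine applies.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Doc:
    """Hypothetical common schema that every source is mapped into."""
    source: str
    title: str
    body: str

# Hypothetical sample data standing in for a Lotus Notes database and a
# Web service source; note the differing native field names.
NOTES_DB = [{"Subject": "Travel policy", "Body": "Rules for travel expenses"}]
WS_RECORDS = [
    {"name": "Travel policy", "text": "Rules for travel expenses"},
    {"name": "Expense form", "text": "Claim form for travel"},
]

def search_notes(query):
    """Adapter: query the Notes-like source and map hits to Doc."""
    return [Doc("notes", r["Subject"], r["Body"])
            for r in NOTES_DB if query.lower() in r["Body"].lower()]

def search_ws(query):
    """Adapter: query the Web-service-like source and map hits to Doc."""
    return [Doc("ws", r["name"], r["text"])
            for r in WS_RECORDS if query.lower() in r["text"].lower()]

def federated_search(query, adapters):
    """Federated style: ask every source at query time and merge the hits.
    Results are live, but unranked and possibly duplicated."""
    results = []
    for adapter in adapters:
        results.extend(adapter(query))
    return results

class CentralIndex:
    """Single-index style: documents are crawled ahead of time into one
    inverted index, which allows deduplication and relevance scoring."""

    def __init__(self):
        self.docs = {}                    # (title, body) -> Doc
        self.postings = defaultdict(set)  # term -> set of (title, body) keys

    def add(self, doc):
        key = (doc.title, doc.body)
        if key in self.docs:              # same content already indexed
            return
        self.docs[key] = doc
        for term in doc.body.lower().split():
            self.postings[term].add(key)

    def search(self, query):
        scores = Counter()
        for term in query.lower().split():
            for key in self.postings.get(term, ()):
                # Score by how often the term occurs in the document body.
                scores[key] += self.docs[key].body.lower().split().count(term)
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [self.docs[key] for key, _ in ranked]

def crawl():
    """Collect every document from every source (an empty query matches all)."""
    return search_notes("") + search_ws("")
```

In this toy setup, `federated_search("travel", [search_notes, search_ws])` returns the duplicated "Travel policy" document once per source, while an index built from `crawl()` stores it once and orders hits by score. The trade-off is that the index only reflects the sources as of the last crawl.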
