Publication
Journal of the American Society for Information Science and Technology
Paper

Visualizing document classification: A search aid for the digital library

Abstract

The recent explosion of the Internet and the World Wide Web has made digital libraries popular. Commercially available Web browsers offer easy access to a digital library through a user-friendly interface. To retrieve documents of interest, the user is given a search interface that may consist of only one input field and one push button. Most users type in a single keyword, click the button, and hope for the best. The result of a query using this kind of search interface can be a large unordered set of documents or a list of documents ranked by the frequency of the keywords. Either kind of result can contain articles unrelated to the user's inquiry unless the user performs a sophisticated search and knows exactly what to look for. More sophisticated algorithms that rank the search results according to how well they meet the user's needs, as expressed in the search input, may help. However, what is desperately needed are software tools that can analyze the search result and manipulate large hierarchies of data graphically. In this article we describe the design of a language-independent document classification system being developed to help users of the Florida Center for Library Automation analyze search query results. The system provides easy access through the Web and a graphical user interface that displays the classification results. We also describe the use of this system to retrieve and analyze sets of documents from public Web sites.