Publication
SRII 2014
Conference paper
InfoSuggest: A system for automated information gathering: With a real-world case study
Abstract
Departments in many organizations treat the World Wide Web as an important information source and need to stay up to date with current information in their domain. Such information gathering is time-consuming because of the overload of available information, and many organizations have dedicated teams for this task. In this paper, we present InfoSuggest, a system for end-to-end information gathering from the web. InfoSuggest improves the efficiency of this focused information-gathering process through machine learning. We employ a semi-supervised document classification method, Transductive Support Vector Machines (TSVMs), to learn user preferences from example articles that users provide. We also devise TSVM-Meta, a strategy for unlabeled data selection that is applicable in an information-gathering setting. We discuss the system architecture and present a case study of information gathering for food safety in an environmental health department of a government agency. Our experiments demonstrate that the system improves efficiency by as much as 35% by making it easier to find relevant content. © 2014 IEEE.
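For readers unfamiliar with TSVMs, the optimization problem commonly given for them in the literature is sketched below. The abstract does not state the objective, so the exact variant used in InfoSuggest may differ, and the sketch does not cover the TSVM-Meta selection strategy. All symbols are assumptions introduced here for illustration: $(x_i, y_i)$ are the $n$ labeled example articles, $x_j^*$ are the $k$ unlabeled articles with unknown labels $y_j^*$, and $C$, $C^*$ are the penalty weights on labeled and unlabeled slack.

```latex
\begin{aligned}
\min_{w,\, b,\, y_1^*,\dots,y_k^*,\, \xi,\, \xi^*} \quad
  & \frac{1}{2}\lVert w \rVert^2
    + C \sum_{i=1}^{n} \xi_i
    + C^* \sum_{j=1}^{k} \xi_j^* \\
\text{subject to} \quad
  & y_i \,(w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0,
    \quad i = 1,\dots,n, \\
  & y_j^* \,(w \cdot x_j^* + b) \ge 1 - \xi_j^*, \quad \xi_j^* \ge 0,
    \quad j = 1,\dots,k, \\
  & y_j^* \in \{-1, +1\}, \quad j = 1,\dots,k.
\end{aligned}
```

Unlike a standard SVM, the labels of the unlabeled articles are themselves optimization variables, so the separating hyperplane is pushed away from unlabeled documents as well as labeled ones. This is why the choice of unlabeled data matters in such a setting.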