SRII 2014
Conference paper

InfoSuggest: A system for automated information gathering: With a real-world case study



Departments in many organizations treat the World Wide Web as an important information source and need to stay up to date with current information in their domain. Such information gathering is time-consuming because of the overload of available information, and many organizations maintain dedicated teams for the task. In this paper, we present InfoSuggest, a system for end-to-end information gathering from the web. InfoSuggest uses machine learning to improve the efficiency of this focused information gathering process. We employ a semi-supervised document classification method, Transductive Support Vector Machines (TSVMs), to learn user preferences from example articles that users provide. We also devise TSVM-Meta, a strategy for unlabeled data selection that is applicable to an information gathering setting. We discuss the system architecture and present a case study of information gathering for food safety in an environmental health department of a government agency. Our experiments demonstrate that the system improves efficiency by as much as 35% by making it easier to find relevant content. © 2014 IEEE.


23 Apr 2014

