Publication
SIGMOD 2024
Short paper
Comquest: Large Scale User Comment Crawling and Integration
Abstract
User-generated content such as comments is a valuable source for various downstream applications. However, access to user comment data is often limited to specific platforms or outlets, which restricts the available data and may not yield a representative sample of opinions from a diverse population on a particular event. This paper presents a comment crawling system that leverages the Web APIs of popular third-party commenting systems to collect comments from the large number of websites that integrate them. Given a target page, the crawling system uses a deep learning model to extract API parameters and sends HTTP requests to the API to retrieve comments. The system we propose to demo, Comquest, is news-oriented and crawls comments on specific news topics and stories. Comquest can work with any website that allows commenting. It provides a useful tool for collecting comments that represent a wider range of opinions, stances, and sentiments from websites on a global scale.
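The two-step flow described in the abstract (extract API parameters from a target page, then request comments from the commenting system's API) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the endpoint `comments.example.com`, the `forum` and `thread` parameter names, the `data-*` attribute layout, and the helper names. A simple regular expression stands in for the deep learning extractor, and the request is only built, not sent.

```python
import re
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint of a third-party commenting system (not a real service).
API_BASE = "https://comments.example.com/api/v1/comments"

# Stand-in for the deep learning extractor: a regex over embed markup of the
# assumed form data-forum="..." and data-thread="...".
PARAM_PATTERN = re.compile(r'data-(forum|thread)="([^"]+)"')


def extract_api_params(html: str) -> dict[str, str]:
    """Pull the hypothetical `forum` and `thread` parameters out of page HTML."""
    return {name: value for name, value in PARAM_PATTERN.findall(html)}


def build_comment_request(params: dict[str, str], limit: int = 50) -> Request:
    """Build (but do not send) the GET request that would retrieve comments."""
    query = urlencode({**params, "limit": limit})
    return Request(f"{API_BASE}?{query}", headers={"Accept": "application/json"})


# Example target page containing the assumed embed markup.
page = '<div id="comments" data-forum="daily-news" data-thread="story-123"></div>'
params = extract_api_params(page)
request = build_comment_request(params)
print(request.full_url)
```

A real crawler would pass the request to an HTTP client, page through the results, and respect each provider's terms of use and rate limits. The learned extractor matters because real pages embed these parameters in many different ways, which a fixed pattern like the one above would not cover.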