IEEE Intelligent Systems

On data-driven analysis of user-generated content


This article discusses data-driven analysis used to derive insights from, and to characterize, user-generated content from IBM's 2007 Innovation Jam. The Jam's two phases focused on creating ideas and on transforming the big ideas into actual products, solutions, and partnerships that would benefit business or society. In evaluating the Jam, content analysis is used to detect the emergence of focused discussions and to help experts search effectively for valuable information. The analysis finds that the Jam enables focused discussions to emerge, but that such discussions rarely emerge early in the Jam. A comparison of several alternative clustering algorithms showed that clustering is effective for grouping related threads and offers a valuable tool for efficient evaluation of the content by domain experts. The effectiveness of the Jam in facilitating focused discussions is evaluated by comparing networks from adjacent time windows to measure how they change over time.
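
The idea of comparing networks from adjacent time windows can be illustrated with a minimal sketch. The abstract does not say how the networks are built or which change measure is used, so everything below is an assumption made for illustration: posts are hypothetical `(author, thread, time)` tuples, a window's network links two authors who posted in the same thread during that window, and change is measured with the Jaccard similarity of the two edge sets.

```python
from collections import defaultdict
from itertools import combinations


def window_edges(posts, start, end):
    """Co-participation edges for posts with start <= time < end.

    Assumption: two authors are linked if they posted in the same thread
    within the window. The source does not specify the network definition.
    """
    authors_by_thread = defaultdict(set)
    for author, thread, time in posts:
        if start <= time < end:
            authors_by_thread[thread].add(author)
    edges = set()
    for authors in authors_by_thread.values():
        # Sort so each undirected edge has one canonical representation.
        edges.update(combinations(sorted(authors), 2))
    return edges


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def adjacent_window_similarity(posts, t0, t1, width):
    """Similarity between each pair of consecutive windows in [t0, t1)."""
    windows = [window_edges(posts, s, s + width) for s in range(t0, t1, width)]
    return [jaccard(x, y) for x, y in zip(windows, windows[1:])]


# Hypothetical toy data: (author, thread, time)
posts = [
    ("a", "t1", 0), ("b", "t1", 1), ("c", "t1", 2),
    ("a", "t1", 10), ("b", "t1", 11),
    ("a", "t2", 20), ("c", "t2", 21),
]
similarities = adjacent_window_similarity(posts, 0, 30, 10)
```

In this sketch, a similarity near 1 means the network changed little between adjacent windows and a value near 0 means it was largely rewired. Other network definitions or distance measures could be substituted without changing the overall structure.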