Publication
ICWSM 2021
Workshop paper
The Role of Data-Driven Discovery in Detecting Vulnerable Sub-populations
Abstract
Disciplined, data-driven discovery plays an important role in identifying vulnerable populations. We summarise three recent projects that applied anomalous pattern detection techniques to automatically identify sub-populations with higher (or lower) rates of outcomes such as child mortality. This type of exploratory analysis complements human-driven confirmatory analysis. Scanning for vulnerable sub-populations with anomalously high (or low) outcomes can be done directly on the data, as a form of stratification. Alternatively, black-box prediction models can be scanned for predictive bias, where the observed outcomes of a sub-population are much higher than the model predicted. In either form, subset scanning is a tool for understanding data at the sub-population level rather than at the aggregate or individual level.
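
To make the two scanning modes concrete, here is a minimal sketch in Python. It is not the method used in the three projects, which the abstract does not specify. It assumes a brute-force search over sub-populations defined by fixing one or two attributes, scored with an expectation-based Poisson statistic for higher-than-expected counts. The function names, field names (`region`, `education`, `outcome`) and toy records are all hypothetical. With `expected_key=None`, the expected count comes from the overall outcome rate (scanning the data directly). Passing the name of a field that holds model-predicted probabilities would instead score sub-populations whose observed outcomes exceed the predictions (scanning for predictive bias).

```python
from itertools import combinations
from math import log


def poisson_score(observed, expected):
    """Expectation-based Poisson score; zero unless observed exceeds expected."""
    if expected <= 0 or observed <= expected:
        return 0.0
    return observed * log(observed / expected) + expected - observed


def scan_subpopulations(records, attributes, outcome_key,
                        expected_key=None, max_attrs=2):
    """Score every sub-population defined by fixing up to max_attrs attributes.

    Returns (score, definition, observed, expected, size) tuples, best first.
    """
    # Fallback expectation per record: the overall outcome rate in the data.
    base_rate = sum(r[outcome_key] for r in records) / len(records)
    results = []
    for k in range(1, max_attrs + 1):
        for attrs in combinations(attributes, k):
            groups = {}
            for r in records:
                key = tuple(r[a] for a in attrs)
                obs, exp, n = groups.get(key, (0, 0.0, 0))
                e = r[expected_key] if expected_key else base_rate
                groups[key] = (obs + r[outcome_key], exp + e, n + 1)
            for key, (obs, exp, n) in groups.items():
                definition = dict(zip(attrs, key))
                results.append((poisson_score(obs, exp), definition, obs, exp, n))
    results.sort(key=lambda t: t[0], reverse=True)
    return results


# Hypothetical toy data: 20 records per cell; one cell has an elevated rate.
records = []
for region in ("north", "south"):
    for education in ("none", "primary"):
        positives = 10 if (region, education) == ("north", "none") else 2
        for i in range(20):
            records.append({"region": region, "education": education,
                            "outcome": 1 if i < positives else 0})

ranked = scan_subpopulations(records, ["region", "education"], "outcome")
top_score, top_definition, top_observed, top_expected, top_size = ranked[0]
print(top_definition, top_observed, round(top_expected, 2), top_size)
```

Exhaustive enumeration like this only scales to a handful of attributes. It is meant to illustrate the scoring idea, not to stand in for an efficient subset-scanning algorithm.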