Publication
ICDM 2019
Conference paper

Automated feature enhancement for predictive modeling using external knowledge

Abstract

Supervised machine learning is the task of learning a function that maps features to a target. The quality of that function, that is, the model, depends directly on the features provided to the learning algorithm. In particular, adding new predictive features is a crucial means of improving model quality. This is usually done by domain specialists or data scientists, and it is a hard and time-consuming task: the expert must identify data sources for new features, join them, and then select the features that are actually relevant to the prediction. We present KAFE (Knowledge Aided Feature Engineering), an interactive predictive modeling system that automatically uses structured knowledge available on the web to add features and thereby improve the accuracy of predictive models. In this proposal, we describe its key techniques, such as feature inference and selection and relevant data indexing, and demonstrate its use through an interactive Jupyter notebook.

Date

01 Nov 2019
