Publication
CODS-COMAD 2024
Conference paper
Tabular Data Synthesis with GANs for Adaptive AI Models
Abstract
In situations such as demographic change, ML models often perform poorly because the training data does not appropriately represent the environment. Privacy concerns worsen the issue by severely limiting the available training data. In this paper, we present a framework that uses a GAN-based synthesizer to generate synthetic data that not only satisfies user-defined constraints, expressed as marginal distributions of selected columns, but also strives to preserve the distributions observed in the original data. The framework takes as input an original dataset and a set of user-defined constraints, and synthesizes data that adheres to these constraints while capturing the underlying distributions present in the given data. The result is a customizable and realistic data generation solution that balances constraint satisfaction and preservation of data distributions. We validate and demonstrate the effectiveness of our technique through experimentation.
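To make the notion of a marginal constraint concrete, the following is a minimal sketch, not the GAN-based synthesizer described in the paper. It swaps in a much simpler technique, stratified resampling of the original rows, to show what it means for output data to match a user-defined marginal on one column while keeping the distribution of the remaining columns within each value of that column. The function names (`empirical_marginal`, `resample_to_marginal`), the column names, and the toy data are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def empirical_marginal(rows, column):
    """Return the observed marginal distribution of one column as {value: share}."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def resample_to_marginal(rows, column, target, n, seed=0):
    """Draw n rows so that `column` follows `target` (a {value: probability} dict).

    Assumption: this is a resampling baseline, not a generative model. It can
    only reuse rows that already exist, so it cannot produce unseen records.
    """
    rng = random.Random(seed)

    # Group the original rows by the value of the constrained column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)

    # A target value with no supporting rows cannot be satisfied by resampling.
    missing = [v for v, p in target.items() if p > 0 and v not in groups]
    if missing:
        raise ValueError(f"no original rows for target values: {missing}")

    # First draw the constrained value from the target marginal, then draw a
    # row uniformly from the matching group. The second step keeps the
    # conditional distribution of the other columns given `column`.
    values = list(target)
    weights = [target[v] for v in values]
    drawn = rng.choices(values, weights=weights, k=n)
    return [dict(rng.choice(groups[v])) for v in drawn]


# Toy dataset (hypothetical): 75% "young" in the original data.
original = [
    {"age_group": "young", "income": "low"},
    {"age_group": "young", "income": "low"},
    {"age_group": "young", "income": "high"},
    {"age_group": "senior", "income": "high"},
]

# User-defined constraint: the synthetic data should be 80% "senior".
constraint = {"young": 0.2, "senior": 0.8}
synthetic = resample_to_marginal(original, "age_group", constraint, n=10000)

print("original :", empirical_marginal(original, "age_group"))
print("synthetic:", empirical_marginal(synthetic, "age_group"))
```

A baseline like this satisfies the constraint only by duplicating existing records, which is one reason a learned synthesizer is attractive when the original data is scarce or privacy-sensitive.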