Mining Documentation to Extract Hyperparameter Schemas
Abstract
AI automation tools need machine-readable hyperparameter schemas to define their search spaces. At the same time, AI libraries often come with good human-readable documentation. While such documentation contains most of the necessary information, it is unfortunately not in a form that tools can consume directly. This paper describes how to automatically mine Python docstrings in AI libraries to extract JSON Schemas for their hyperparameters. We evaluate our approach on 119 transformers and estimators from three different libraries and find that it is effective at extracting machine-readable schemas. Our vision is to reduce the burden of manually creating and maintaining such schemas for AI automation tools and to broaden the reach of automation to larger libraries and richer schemas.
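To make the idea of mining a docstring concrete, the following is a minimal sketch, not the extraction approach evaluated in this paper. It assumes a numpydoc-style "Parameters" section with entries of the form `name : type, default=value`, and it handles only a few scalar types and brace-delimited enumerations. The docstring `EXAMPLE_DOC` and the names `TYPE_MAP`, `parse_default`, and `docstring_to_schema` are hypothetical and exist only for this illustration.

```python
import json
import re

# Hypothetical numpydoc-style docstring, written for this illustration only.
EXAMPLE_DOC = """Toy estimator.

Parameters
----------
n_components : int, default=2
    Number of components to keep.
whiten : bool, default=False
    Whether to whiten the output.
solver : {'auto', 'full', 'randomized'}, default='auto'
    Solver to use.
tol : float, default=0.0
    Tolerance.

Attributes
----------
components_ : ndarray
    Fitted components.
"""

# Assumed mapping from docstring type names to JSON Schema types.
TYPE_MAP = {"int": "integer", "float": "number", "bool": "boolean", "str": "string"}

# An unindented "name : spec" line starts a new parameter entry.
PARAM_RE = re.compile(r"^(\w+) : (.+)$")


def parse_default(text):
    """Interpret a default written in a docstring; fall back to a plain string."""
    literals = {"True": True, "False": False, "None": None}
    if text in literals:
        return literals[text]
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text.strip("'\"")


def docstring_to_schema(doc):
    """Build a JSON Schema object from the Parameters section of a docstring."""
    lines = doc.splitlines()

    def is_underline(idx):
        # A section header is followed by a line made only of dashes.
        return idx < len(lines) and lines[idx].strip() != "" and set(lines[idx].strip()) == {"-"}

    properties = {}
    in_params = False
    current = None
    for i, line in enumerate(lines):
        if is_underline(i):
            continue
        if line.strip() and is_underline(i + 1):
            # Entering a new section; only "Parameters" is mined.
            in_params = line.strip() == "Parameters"
            current = None
            continue
        if not in_params:
            continue
        match = PARAM_RE.match(line)
        if match:
            name, rest = match.groups()
            spec, _, default = rest.partition(", default=")
            spec = spec.strip()
            if spec.startswith("{") and spec.endswith("}"):
                prop = {"enum": [v.strip().strip("'\"") for v in spec[1:-1].split(",")]}
            elif spec in TYPE_MAP:
                prop = {"type": TYPE_MAP[spec]}
            else:
                prop = {}  # Unknown type: leave the property unconstrained.
            if default:
                prop["default"] = parse_default(default.strip())
            properties[name] = prop
            current = prop
        elif current is not None and line.strip():
            # Indented lines extend the description of the current parameter.
            current["description"] = (current.get("description", "") + " " + line.strip()).strip()
    return {"type": "object", "properties": properties}


if __name__ == "__main__":
    print(json.dumps(docstring_to_schema(EXAMPLE_DOC), indent=2))
```

Real docstrings are far less regular than this example (union types, ranges stated in prose, constraints that span several hyperparameters), which is why a sketch of this kind covers only the simplest cases.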