Estimating Compilation Time of a Query Optimizer
Abstract
A query optimizer compares alternative plans in its search space to find the best plan for a given query. Depending on the search space and the enumeration algorithm, optimizers vary in their compilation time and the quality of the execution plan they can generate. This paper describes a compilation time estimator that provides a quantified estimate of the optimizer compilation time for a given query. Such an estimator is useful for automatically choosing the right level of optimization in commercial database systems. In addition, compilation time estimates can be quite helpful for mid-query reoptimization, for monitoring the progress of workload analysis tools where a large number queries need to be compiled (but not executed), and for judicious design and tuning of an optimizer. Previous attempts to estimate optimizer compilation complexity used the number of possible binary joins as the metric and overlooked the fact that each join often translates into a different number of join plans because of the presence of "physical" properties. We use the number of plans (instead of joins) to estimate query compilation time, and employ two novel ideas: (1) reusing an optimizer's join enumerator to obtain actual number of joins, but bypassing plan generation to save estimation overhead; (2) maintaining a small number of "interesting" properties to facilitate plan counting. We prototyped our approach in a commercial database system and our experimental results show that we can achieve good compilation time estimates (less than 30% error, on average) for complex real queries, using a small fraction (within 3%) of the actual compilation time.