On the complexity of random satisfiability problems with planted solutions
The problem of identifying a planted assignment given a random k-SAT formula consistent with the assignment exhibits a large algorithmic gap: while the planted solution can always be identified given a formula with O(nlogn) clauses, there are distributions over clauses for which the best known efficient algorithms require nk/2 clauses. We propose and study a unified model for planted k-SAT, which captures well-known special cases. An instance is described by a planted assignment σ and a distribution on clauses with k literals. We define its distribution complexity as the largest r for which the distribution is not r-wise independent (1 ≤ r ≤ k for any distribution with a planted assignment). Our main result is an unconditional lower bound, tight up to logarithmic factors, of Ω(nr-2 ) clauses for statistical algorithms [42, 30], matching the known upper bound (which, as we show, can be implemented using a statistical algorithm) . Since known approaches for problems over distributions have statistical analogues (spectral, MCMC, gradient-based, convex optimization etc.), this lower bound provides a rigorous explanation of the observed algorithmic gap. The proof introduces a new general technique for the analysis of statistical algorithms. It also points to a geometric paring phenomenon in the space of all planted assignments. We describe consequences of our lower bounds to Feige's refutation hypothesis  and to lower bounds on general convex programs that solve planted k-SAT. Our bounds also extend to the planted k-CSP model, defined by Goldreich as a candidate for one-way function , and therefore provide concrete evidence for the security of Goldreich's one-way function and the associated pseudorandom generator when used with a sufficiently hard predicate.