# On the statistical properties of Diffie-Hellman distributions

## Abstract

Let p be a large prime such that p - 1 has some large prime factors, and let θ ∈ ℤ*p be an r-th power residue for all small factors of p - 1. The corresponding Diffie-Hellman (DH) distribution is (θx,θy,θxy) where x, y are randomly chosen from ℤ*p. A recently formulated assumption is that given p, θ of the above form it is infeasible to distinguish in reasonable time between DH distribution and triples of numbers chosen randomly from ℤ*p. This assumption, called the DH Indistinguishability (DHI) assumption, turns out to be quite useful and central in cryptography. In an effort to investigate the validity of this assumption, we study some statistical properties of DH distributions. Let θ be an element in ℤ*p with sufficiently high multiplicative order. We show that if one takes a positive (but sufficiently small) proportion of the most significant bits of each of θx,θy,θxy then one obtains a distribution whose statistical distance from uniform is exponentially small. A similar result holds with respect to the least significant bits of (θx,θy,θxy). We also show somewhat weaker bounds with respect to arbitrary subsets of bit-positions. This remarkable property may help gaining assurance in the DHI assumption. Our techniques are mainly number-theoretic. We obtain an upper bound for double exponential sums with the function aθx + bθy + cθxywhich sharpens and generalizes the previous estimates. In particular, our bound implies the following result (for p, θ of the above form). Ranging over all x, y ∈ ℤ*p, the vectors (θx/p, θy/p, θxy/p) are very evenly distributed in the unit cube. In order to make this work accessible to two groups of researchers, cryptographers and number theorists, we have decided to make it as selfcontained as possible. As a result, some parts of it, mainly targetted to one of these groups, may appear obvious to the other. In particular we present some basic notions of the modern cryptography and on the other hand we give a short explanation how exponential sums show up in various questions related to uniform distribution of sequences.