Notes on foundations of modern cryptography: The Goldreich-Levin theorem

We give a way to construct a PRG from a one-way function (OWF). Consider a function (family) $f = \{f_n : \{0,1\}^n \to \{0,1\}^{m(n)}\}_n$.

Note: $m(n)$ need not be larger than $n$.

Note: this does not mean that if $\mathsf{P} \neq \mathsf{NP}$ then a OWF exists. In fact, whether this implication holds is currently unknown.

We can think of the hardcore bit as the place where the hardness of inverting a OWF is concentrated. A hardcore predicate is an efficient function that computes the hardcore bit.

Proof. For any $n$, define $G(s) = f(s) \,\|\, h(s)$, where $s \in \{0,1\}^n$ is a truly random string and $h$ is a hardcore predicate for the OWP $f$. That is, $G$ maps a random $n$-bit string to an $(n+1)$-bit string. We need to show that the output of $G$ is pseudorandom.

We show this using the equivalence of pseudorandomness and computational next-bit unpredictability. Specifically, let $y = y_1 y_2 \cdots y_{n+1} = G(s)$. Then it suffices to show that for any $i \in [n+1]$ and any PPT predictor $\mathcal{A}$, $$\Pr\big[\mathcal{A}(y_1 \cdots y_{i-1}) = y_i\big] \le \frac{1}{2} + \mathrm{negl}(n).$$

Note that $y_1 \cdots y_n = f(s)$. Since $f$ is a OWP and $s$ is truly random, $f(s)$ is also a truly random string, and thus computational next-bit unpredictability holds for all $i \le n$. Finally, for $i = n+1$, since $h$ is a hardcore predicate for $f$ and $y_{n+1} = h(s)$, it is also true that $$\Pr\big[\mathcal{A}(f(s)) = h(s)\big] \le \frac{1}{2} + \mathrm{negl}(n).$$ This concludes the proof. ◻
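As a toy illustration of the shape of $G(s) = f(s) \,\|\, h(s)$, here is a minimal Python sketch. The names `toy_owp` and `hardcore_bit` are hypothetical stand-ins: a bit rotation is of course not one-way and XOR-of-bits is not a hardcore predicate for it; they only show how the pieces fit together.

```python
# Sketch of the construction G(s) = f(s) || h(s).
# NOTE: toy_owp is NOT one-way and hardcore_bit is NOT hardcore for it;
# they are placeholders illustrating the structure only.

def toy_owp(bits):
    """Placeholder 'OWP': rotate the bit list by one position."""
    return bits[1:] + bits[:1]

def hardcore_bit(bits):
    """Placeholder 'hardcore predicate': XOR of all input bits."""
    b = 0
    for x in bits:
        b ^= x
    return b

def G(seed):
    """One-bit-stretch PRG candidate: G(s) = f(s) || h(s)."""
    return toy_owp(seed) + [hardcore_bit(seed)]

out = G([1, 0, 1, 1])
assert len(out) == 5  # n-bit seed -> (n+1)-bit output
```

The only point of the sketch is the one-bit stretch: an $n$-bit seed yields an $(n+1)$-bit output whose last bit is the hardcore bit of the seed.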

There is no single HCP for all OWFs. However, the following Goldreich-Levin theorem shows that given any OWF $f$, we can construct a OWF $g$ and a HCP $h$ for $g$.

Proof. Assume for contradiction that there is a predictor $\mathcal{A}$ such that $$\Pr_{x,r}\big[\mathcal{A}(f(x), r) = \langle x, r\rangle\big] \ge \frac{1}{2} + \epsilon(n),$$ where $\epsilon$ is a noticeable function. We show that $f$ is not a OWF.

For ease of exposition, let us write $g$ as $g(x, r) = (f(x), r)$ and $h$ as $h(x, r) = \langle x, r\rangle = \sum_{i=1}^n x_i r_i \bmod 2$, where $x, r \in \{0,1\}^n$.

Attempt 1: suppose $\mathcal{A}$ is a “perfect” predictor, i.e., $\Pr_{x,r}[\mathcal{A}(f(x), r) = \langle x, r\rangle] = 1$. This implies that $\mathcal{A}(f(x), r) = \langle x, r\rangle$ for all $x$ and $r$. Observe that for every $i \in [n]$, $\langle x, e_i\rangle = x_i$, where $e_i$ is the $i$-th unit vector. This means that given any $y = f(x)$, we can recover $x$ in $n$ queries to $\mathcal{A}$ by computing $x_i = \mathcal{A}(y, e_i)$, which contradicts the fact that $f$ is a OWF.
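Attempt 1 can be sketched in a few lines of Python. The predictor below “cheats” by holding $x$ directly; it only models a perfect $\mathcal{A}$ so that the $n$-query recovery loop can be seen in action:

```python
# Attempt 1: with a perfect predictor A(y, r) = <x, r>,
# n queries on the unit vectors e_1, ..., e_n recover x exactly.

def inner(x, r):
    """Inner product over GF(2)."""
    return sum(a & b for a, b in zip(x, r)) % 2

def recover(n, A, y):
    x_rec = []
    for i in range(n):
        e_i = [0] * n
        e_i[i] = 1
        x_rec.append(A(y, e_i))  # A(y, e_i) = <x, e_i> = x_i
    return x_rec

x = [1, 0, 1, 1, 0]
y = "f(x)"                       # stands for the image f(x); the cheating
A = lambda y, r: inner(x, r)     # predictor ignores it and answers perfectly
assert recover(len(x), A, y) == x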

Attempt 2: now suppose $\Pr_{x,r}[\mathcal{A}(f(x), r) = \langle x, r\rangle] \ge 0.8$. Now we cannot claim that $\mathcal{A}(y, e_i) = x_i$ as we did before: $\mathcal{A}$ may always fail on the unit vectors. The solution for this is randomization. The first observation is that if we sample $r$ uniformly at random, then $r \oplus e_i$ is also a truly random string. This means that for every $i$, we have both $\Pr_r[\mathcal{A}(y, r) = \langle x, r\rangle] \ge 0.8$ and $\Pr_r[\mathcal{A}(y, r \oplus e_i) = \langle x, r \oplus e_i\rangle] \ge 0.8$. The second observation is that by the linearity of the inner product, we have $$\langle x, r\rangle \oplus \langle x, r \oplus e_i\rangle = \langle x, e_i\rangle = x_i.$$ It follows that if $\mathcal{A}$ is right in both queries with $r$ and $r \oplus e_i$, then we can recover $x_i$ by $$x_i = \mathcal{A}(y, r) \oplus \mathcal{A}(y, r \oplus e_i).$$ The probability of this event is, by a union bound over the two failure events, at least $1 - 2 \cdot 0.2 = 0.6$. That is, given $y = f(x)$, we can recover $x_i$ with probability 0.6 on each dimension $i$. This probability can be amplified by repeating the process $m$ times and outputting the more frequent output. Specifically, for each dimension $i \in [n]$:

  • Sample $m$ random vectors $r^{(1)}, \ldots, r^{(m)} \in \{0,1\}^n$.

  • For each vector $r^{(j)}$, compute $x_i^{(j)} = \mathcal{A}(y, r^{(j)}) \oplus \mathcal{A}(y, r^{(j)} \oplus e_i)$.

  • Output the more frequent result.

Suppose the success probability on each dimension is $\frac{1}{2} + \delta$ (which is $0.6$, i.e., $\delta = 0.1$, in the above setting). Let $Z_j$ be the indicator for the event that the output of the $j$-th test is correct and $Z = \sum_{j=1}^m Z_j$. We want to bound the probability that the majority of the outputs is incorrect, i.e., $\Pr[Z \le m/2]$. We have $$\Pr[Z \le m/2] \le \Pr\big[\,|Z - \mathbb{E}[Z]| \ge \delta m\,\big] \le \frac{\operatorname{Var}(Z)}{\delta^2 m^2},$$ where the last inequality is due to Chebyshev’s inequality. Note that the variables $Z_j$ are mutually independent. Therefore, $$\operatorname{Var}(Z) = \sum_{j=1}^m \operatorname{Var}(Z_j) \le \frac{m}{4} \quad\Longrightarrow\quad \Pr[Z \le m/2] \le \frac{1}{4\delta^2 m}.$$ By setting $m = \frac{n}{2\delta^2}$, the failure probability on each dimension is reduced to at most $\frac{1}{2n}$. Then, by a union bound over the $n$ dimensions, the probability of successfully recovering $x$ is at least $1/2$, which is noticeable and contradicts the fact that $f$ is a OWF.
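The pair-query-plus-majority-vote recovery above can be simulated. Here `noisy_predictor` is a hypothetical stand-in for $\mathcal{A}(f(x), \cdot)$ that cheats by holding $x$ and answers correctly with probability $0.8$, matching the assumption of this attempt:

```python
import random

def inner(x, r):
    """Inner product over GF(2)."""
    return sum(a & b for a, b in zip(x, r)) % 2

def noisy_predictor(x, r, p=0.8):
    """Stand-in for A(f(x), r): returns <x, r> with probability p."""
    b = inner(x, r)
    return b if random.random() < p else 1 - b

def recover_bit(x, i, m):
    """m pair-queries (r, r XOR e_i), then a majority vote for x_i."""
    n = len(x)
    ones = 0
    for _ in range(m):
        r = [random.randrange(2) for _ in range(n)]
        r_flip = r[:]
        r_flip[i] ^= 1  # r XOR e_i is also uniformly distributed
        ones += noisy_predictor(x, r) ^ noisy_predictor(x, r_flip)
    return int(ones > m / 2)

random.seed(0)
x = [1, 0, 1, 1, 0, 0, 1, 0]
x_rec = [recover_bit(x, i, m=501) for i in range(len(x))]
assert x_rec == x  # each vote is right w.p. >= 0.6, so this holds w.h.p.
```

With $m = 501$ trials per bit, each majority vote fails only with exponentially small probability, so the full string is recovered with overwhelming probability.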

Last attempt: the proof in the previous attempt is valid only when the per-dimension success probability is strictly larger than $1/2$, which requires the success probability of $\mathcal{A}$ to be strictly larger than $3/4$. But our goal is to handle any advantage $\epsilon(n)$ that is noticeable. How can we remove the assumption on $\mathcal{A}$’s success probability in the last attempt? This is where Goldreich-Levin gets its fame!

If we look back at the proof, the only issue is this use of the union bound: $$\Pr[\text{both queries correct}] \ge 1 - 2\Pr[\mathcal{A} \text{ fails}].$$ If $\Pr[\mathcal{A} \text{ fails}] \ge 1/4$, then this bound drops below $1/2$, which breaks the majority vote.

The brilliant idea: as we argued in the last attempt, for truly random $r$ and $r \oplus e_i$, we have $\Pr_r[\mathcal{A}(y, r) = \langle x, r\rangle] \ge \frac{1}{2} + \epsilon$ and $\Pr_r[\mathcal{A}(y, r \oplus e_i) = \langle x, r \oplus e_i\rangle] \ge \frac{1}{2} + \epsilon$. Now, what if we only need to make one of the two queries for each $r$? Say we already knew the true value of $\langle x, r\rangle$ and only need to query $\mathcal{A}$ for the value of $\langle x, r \oplus e_i\rangle$; then a single query succeeds with probability $\frac{1}{2} + \epsilon$, and we are done by the proof in the previous attempt. But how do we know the value of $\langle x, r\rangle$?

The final observation is that in order to use Chebyshev’s inequality, the variables $Z_j$ do not need to be mutually independent (for mutually independent variables, Chernoff’s inequality gives a stronger bound!). Instead, the variables only need to be pairwise independent so that the covariances are zero. The magic is as follows:

  • Generate $\ell = \log(m+1)$ truly random strings $s^{(1)}, \ldots, s^{(\ell)} \in \{0,1\}^n$.

  • Generate the set of strings $\big\{\, r^{(T)} = \bigoplus_{k \in T} s^{(k)} \;:\; \emptyset \ne T \subseteq [\ell] \,\big\}$.

Then, it can be seen that all these $2^\ell - 1 = m$ strings are pairwise independent. Moreover, for each string $r^{(T)}$, we have that $$\langle x, r^{(T)}\rangle = \bigoplus_{k \in T} \langle x, s^{(k)}\rangle.$$ That is, we only need the $\ell$ values $\langle x, s^{(1)}\rangle, \ldots, \langle x, s^{(\ell)}\rangle$ to compute the values of $\langle x, r^{(T)}\rangle$ for all $T$. Since $2^\ell = m + 1 = \mathrm{poly}(n)$, we can try all $2^\ell$ possible assignments of these $\ell$ bits and still keep the running time polynomial in $n$. This concludes the proof. ◻
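A small sketch of the subset-XOR construction, checking the linearity property that makes the brute-force guessing of the $\ell$ seed inner products determine all $m$ values $\langle x, r^{(T)}\rangle$ (the parameters $n = 6$, $\ell = 3$ are arbitrary illustration choices):

```python
from itertools import combinations
import random

def xor_vecs(vecs, n):
    """Bitwise XOR of a list of 0/1 vectors of length n."""
    out = [0] * n
    for v in vecs:
        out = [a ^ b for a, b in zip(out, v)]
    return out

def inner(x, r):
    """Inner product over GF(2)."""
    return sum(a & b for a, b in zip(x, r)) % 2

n, ell = 6, 3              # m = 2**ell - 1 = 7 derived strings
random.seed(1)
s = [[random.randrange(2) for _ in range(n)] for _ in range(ell)]

# r_T = XOR of s_k over k in T, for every nonempty subset T of [ell]
subsets = [T for k in range(1, ell + 1)
             for T in combinations(range(ell), k)]
rs = {T: xor_vecs([s[k] for k in T], n) for T in subsets}

# Linearity: <x, r_T> = XOR over k in T of <x, s_k>, so the ell
# "seed" inner products (guessed by brute force in the reduction)
# determine all m values <x, r_T> at once.
x = [random.randrange(2) for _ in range(n)]
seed_bits = [inner(x, sk) for sk in s]
for T, r in rs.items():
    b = 0
    for k in T:
        b ^= seed_bits[k]
    assert b == inner(x, r)
```

In the actual reduction the `seed_bits` are of course unknown; the point is that only $2^\ell = m+1$ candidate assignments exist, so trying them all stays polynomial.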
