The key interested question motivated by the Shannon’s theorem: can we do encryption using much shorter key than the length of plaintext?
Pseudorandom generator
They idea is to generate a short (truly) random seed and then use a deterministic algorithm to stretch this seed to a longer sequence that “looks like” truly random. This is impossible under the requirements of correctness and perfect indistinguishability. However, if we relax the second requirement by assuming a reasonable upper bound on the computational power of the adversary (e.g., probabilistic-polynomial time computation), then we have what’s called Pseudorandom Generator (PRG).
Recall: an encryption scheme is perfect indistinguishable if for any plaintexts and and any algorithm , where is a random variable over the space of ciphertexts.
We now give a relaxed version of this definition:
An algorithm that generates a pseudorandom variable is called a Pseudorandom Generator (PRG).
The existence of PRGs has not been proved. Nevertheless, it is believed that PRGs exist.
Proof. Assume for contradiction that is a PRG and . Let construct a distinguisher with non-negligible advantage in distinguishing and . The idea is that given , since , we can find, in polynomial time (by simply guessing), the preimage such that if there is such an . The distinguisher simply returns 1 if exists and 0 otherwise. Then, it follows that which is noticeable and contradict the assumption that is a PRG. ◻
Proof. Let construct as follows: given a uniformly random input , iterating for iterations, where the last bits of the output of each iteration is used as the input of the next iteration:
for : where for all
Output .
Since runs in polynomial time, also runs in polynomial time. What remains is to show that the final output of the above procedure is pseudorandom.
We show the pseudorandom of using the hybrid argument as follows. Let construct hybrids strings where is defined as follows:
The first bits are uniformly random
The last bits are generated by simulating the chain of for iterations, starting with a uniformly random seed.
Thus, follows the same distribution as of the output and is a truly random string. Assume for contradiction that is not pseudorandom. That is, there is a distinguisher and a polynomial such that It follows by the triangle inequality that and therefore, by the pigeonhole principle, there exists such that That is, the distinguisher distinguishes and with non-negligible advantage .
Given the above , we construct a distinguisher that distinguishes and as follows: for an input ,
Construct a string :
The first bits of are uniformly random
The last remaining bits are generated by simulating the chain of for iterations, starting with a uniformly random seed.
Output .
Observe that if is truly random, then has the same distribution as of . On the other hand, if , then has the same distribution as . Since can distinguish and with noticeable advantage , inherits this advantage in distinguishing and , and thus contradicts the fact that is a PRG. ◻
Encryption using PRG
We can now have an encryption scheme with keys shorter than input messages. Let be a PRG, , , and . Define
:
:
output
:
output
Proof. The proof is similar to the previous proof of one-time pad scheme:
Correctness: we have
Computational indistinguishability: we prove the second property of our encryption scheme by a security reduction. Specifically, assume for contradiction that is a PRG and there exist two messages and a distinguisher with non-negligible advantage in distinguishing and , where and are the corresponding random variables under our encryption scheme. That is, there is a polynomial such that
The idea is that given this distinguisher , we construct a distinguisher that has a noticeable advantage in distinguishing and , and thus contradicts the fact that is a PRG. To construct , we first observe that if we use the truly random keys instead of keys generated by , then the scheme is identical to the one-time pad scheme and therefore, any distinguisher will have zero advantage: It follows that which implies that either or Assume w.l.g. that the former is the case. We construct as follows:
(here is an advice string that is independent of )
Output .
Then, we have which contradicts the assumption that is a PRG, and therefore, the computational distinguishability holds. ◻
Pseudorandomness and Next-bit unpredictability
We give an alternative definition of the PRG:
The following theorem shows that the two definitions are indeed equivalent:
Proof. We need to show the following two statements are equivalent:
Pseudorandomness: for any PPT-distinguisher ,
Computationally next-bit unpredictability: for any PPT-predictor and all ,
Pseudorandomness
Computationally next-bit
unpredictable:
Assume for contradiction that
is pseudorandomness but computationally next-bit
predictable. I.e., there is a predictor
, an
, and a polynomial
such that
We use
to construct a distinguisher
that has non-negligible advantage in distinguishing
and
, and thus give a constradiction. Let
be as follows:
If then output 1
Otherwise, output 0.
Clearly, we have On the other hand, Therefore, which contradicts the assumption that is pseudorandom.
(Note: since we only need to show the existence of the distinguisher , it doesn’t matter what is the value of in the above proof (and also in the previous proofs). Just the existence of an is enough.)
Pseudorandomness
Computationally next-bit
unpredictable:
We prove the backward direction using the hybrid
argument.
Assume for contradiction that is computationally next-bit unpredictable but is not pseudorandomness. That is, for any and any PPT-predictor , and there exist a PPT-distinguisher and polynomial such that Let construct hybrids where is defined as:
The first bits are the first bits of
The last bits are uniformly random bits.
We then have is identical to the variable and is uniformly random strings. By the triangle inequality, we have that By the pigeonhole principle, this implies that the must be an such that
Let define a string be identical as excepting that the -th bit is flipped. The following result will be useful:
Proof. Note that for any string , we have by the law of total probability that Moreover, since and are identical excepting their -th bits are flipped, we also have Combining the two equalities concludes the proof. ◻
Next, we use this to construct a predictor that predicts the -th bit of as follows: given the first bits of ,
Construct a string with length :
The first bits of is the the first bits of
The last bits is uniformly random
If then output else output .
We have that where the third identity is due to the fact that is uniformly random, the fourth identity is due to the definition of , the sixth identity is due to claim , and the inequality is from . This inequality contradicts the assumption that is computationally next-bit unpredictable, and thus must be pseudorandom. ◻