CO 487
Fundamental Goals of Cryptography
- Confidentiality: Keeping data secret from all not authorized.
- Data Integrity: Ensure data has not been altered by unauthorized means.
- Data Origin Authentication: Corroborating the source of data.
- Non-Repudiation: Prevents entity from denying previous commitments or actions.
Symmetric-key cryptography: The client and server share some secret information k, called a key.
Public-key cryptography: Share an authenticated (but not secret) information.
SSL Protocol
- Server transmits certificate to client.
- Contains server’s identifying information, with RSA public key and signature of certifying authority.
- Certifying authority is trusted.
- Client verifies signature to verify authenticity.
- Client selects random key k and encrypts with server’s RSA public key, transmits ciphertext to server.
- Server decrpyts ciphertext.
Is SSL really secure?
There are many potential security vulnerabilities.
- Crypto is weak (e.g. AES, HMAC, RSA).
- Quantum attacks on underlying cryptography.
- Weak random number generation.
- Fraudulent certificates.
- Mistakes due to human error.
- Software bugs (both inadvertent and malicious).
- Phishing attacks.
- SSL only protects data during transit, does not protect data when it is collected at the server.
Crypto might be strong, but information security is more difficult (cybersecurity).
- Cryptography providers some mathematical tools to assist, but it is a small part of the entire solution.
Symmetric-key Encryption Scheme
- M, plaintext space.
- C, ciphertext space.
- K, key space.
- A family of encryption functions E_k: M \to C, \forall k \in K.
- A family of decryption functions D_k: C \to M, \forall k \in K.
Such that D_k(E_k(m)) = m.
- Alice and Bob agree on a secret key over a secure channel.
- Alice computes c = E_k(m) and sends over the unsecure channel.
- Bob retrieves plaintex by computing m = D_k(c).
Why not always use the secure channel? It is commonly slow, inconvenient.
What does security mean for a SKES?
- What is the adversary’s goal?
- What are the computational powers?
- How does the adversary interact with the two communicating parties?
Security model: Defines the computational abilities of the adversary, and how she interacts.
Basic assumption: Adversary knows everything about the SKES, only missing the key k.
Adversary’s Interaction
- Passive Attacks.
- Ciphertext-only attack. The attacker knows some ciphertext.
- Known-plaintext attack. Knows some plaintext and corresponding ciphertext.
- Active Attacks.
- Chosen-plaintext attack. Adversary chooses plaintext and obtains corresponding ciphertext.
Attacks which are not considered in this course.
- Clandestine attacks. Bribery, blackmail.
- Side-channel attacks. Monitor encryption and decryption equipment. Timing attacks, power analysis, electromagnetic radiation analysis.
Computational Power of Adversary
- Information-theoretic security: Eve has infinite computational resources.
- Complexity-theoretic security: Eve is a polynomial-time Turing machine.
- Computational security: Eve has a specific computational bound.
Adversary’s Goal
- Recover the secret key.
- Systematically recover plaintext from secret text. May not necessarily need the secret key.
- Learn some partial information about the plaintext from the ciphertext.
We want to guard against the strongest attacker going for the weakest goal.
- If the attacker can achieve 1 or 2, the SKES is said to be totally insecure.
- If attacker cannot learn anything, it is said to be semantically secure.
A SKES is said to be secure if it is semantically secure against a chosen-plaintext attack by a computational bounded adversary.
To break a SKES, the attacker needs to do the following.
- Given a challenge ciphertext c.
- Select plaintexts and obtain corresponding ciphertexts.
- After a feasible amount of computation, obtain information about plaintext m corresponding to the challenge ciphertext c.
Work Factor
- 2^{40} operations very easy.
- 2^{56} operations easy.
- 2^{64} operations feasible.
- 2^{80} operations barely feasible.
- 2^{128} operations infeasible.
The Bitcoin network is presently performing hash operations at the rate of 2^{65} per second, 2^{90} per year.
The Laundeur limit from thermodynamics proposes that exhaustively trying 2^{128} symmetric keys would require more power for a year than the world produces.
Security Level
A cryptographic scheme is said to have a security level of l bits if the fastest known attack on the scheme takes approximately 2^l operations. As of 2019, a security level of 128 bits is desirable in practice.
Polyalphabtic Ciphers
Basic Idea. Use several permutations, plaintext encrypted to one of several possible cipher letters.
Example: Vigenere Cipher.
- Security key is an English word having no repeated letters.
- Totally insecure against a chosen-plaintext attack, you can send single letters at a time.
The One-Time Pad
- Key is a random string of letters as long as the plaintext.
- Add the key character to the plaintext character modulo 26.
If the key is shorter than the ciphertext, the key should not be re-used.
- If c_1 = m_1 + k and c_2 = m_2 + k then c_1 - c_2 = m_1 - m_2.
So c_1 - c_2 depends only on the plaintext and hence can leak information about the plaintext.
- Perfect secrecy. One-time pad is semantically secure against ciphertext-only attacks by an adversary with infinite computational resources.
- Formally proven using concepts from information theory.
It is not useful unless the key is at least as long as the plaintext, so we should use a stream cipher.
Stream Ciphers
Use a pseudorandom generator PRBG (Pseudo-Random Bit Generator).
- The seed is the secret key shared between communicating parties.
- Security depends on the quality of the PRBG.
- The keystream should be indistinguishable from a random sequence (indistinguishability requirement).
- If an adversary knows a portion of the ciphertext and corresponding plaintext, they can easily find the corresponding portion of the keystream. Thus, given portions of the keystream, it should be infeasible to learn any information about the rest of the keystream (unpredictability requirement).
RC4 Stream Cipher
- Very simple, fast, variable key lengths. No catastrophic weaknesses have been found.
- Two components, key scheduling algorithm, keystream generator.
RC4 Key Scheduling Algorithm
K[i], \overline{K}[i], S[i] are bytes.
Input: Secret key K[0], K[1], ..., K[d-1].
Output: 256-long array: S[0], S[1], ..., S[255].
for i in range(256):
S[i] = i
K2[i] = K[i % d]
j = 0
for i in range(256):
j = (K2[i] + S[i] + j) % 256
S[i], S[j] = S[j], S[i]
Idea: S is a “random-looking” permutation of \{0, 1, ..., 255\} which is generated from the secret key.
RC4 Keystream Generator
Input: 256-long byte array S[0], S[1], ..., S[255].
Output: Keystream.
i = 0, j = 0
while keystream_bytes_are_required:
i = (i + 1) % 256
j = (S[i] + j) % 256
S[i], S[j] = S[j], S[i]
t = (S[i] + S[j]) % 256
output(S[t])
Encryption: Keystream bytes are xored with the plaintext bytes.
Fluhrer-Mantin-Shamir attack exploits known biases in the first few bytes of the keystream. Defeated by discarding the first few bytes.
Because of several new weaknesses, RC4 is no longer used in applications such as SSL. Instead, use AES-CTR.
Wireless Security
More attack opportunities, no physical access. Attackers can be at a distance and leave no physical evidence of attack.
- IEEE 802.11 standard includes protocol called Wired Equivalent Privacy.
- WEP’s goal is (only) to protect link-level data during wireless transmission between mobile stations and access points.
- Confidentiality: Prevent against casual eavesdropping.
- Data Integrity: Prevent tampering with transmitted messages.
- Access Control: Protect access to a wireless network infrastructure.
- Discard all packets that are not properly encrypted.
Mobile stations share a key k with access points. Messages are divided into packets of some fixed length. WEP uses a per-packet 24-bit initialization vector (IV) v to process each packet. v is prepended to the key to generate the keystream.
Question: Are confidentiality, data integrity, and access control achieved?
Since there are only 2^{24} choices for IV, collisions are guaranteed after enough time. If they are selected randomly, then one can expect a collision after 2^{12} packets. This allows attackers to obtain multiple messages with the same v, so they can obtain m_1 \oplus m_2.
It is also easy to make controlled changes to encrypted packets without invalidating the checksum.
- Suppose (v, c) is an encrypted packet.
- Let c = RC4(v || k) \oplus (m||S), where k, m, S are unknown.
- Let m^\prime = m \oplus \Delta where \Delta is a bit string where 1s correspond to bits of m an attacker wishes to change.
- c^\prime = c \oplus (\Delta || CRC(\Delta)). Then (v, c^\prime) is a valid encrypted packet.
Knowledge of one plaintext m corresponding to (v, c) allows an attacker to compute a valid encrypted packet for any plaintext with the same keystream.
Lessons Learned
- The devil is in the details. Do not assume that obvious ways of using cryptographic functions are secure.
- Attacks only get better over time.
- Moore’s law.
- Known attacks are constantly being tweaked and improved.
- New attacks are constantly being invented.
- Designing security is hard.
- There is a big demand in industry for security engineers.
Block Ciphers
- SKES that breaks up plaintext into blocks of fixed length and encrypts one block at a time.
- Stream cipher encrypts one character at a time.
Desirable Properties
- Security.
- Diffusion. Each ciphertext bit should depend on all plaintext bits.
- Confusion. Relationship between key bits and ciphertext bits should be complicated.
- Key length. Should be small but large enough to preclude exhaustive key search.
- Efficiency.
- Simple.
- High encryption and decryption rate.
- Suitability for hardware or software.
Feistel Cipher
Take something fairly simple and repeat.
- Parameters. Half block length n, number of rounds h, key length l.
- Encryption takes place in h rounds
- Plaintext is m = (m_0, m_1) where m_i \in \{0, 1\}^n.
- Round h: (m_{h-1}, m_h) \to (m_h, m_{h+1}) where m_{h+1} = m_{h-1} \oplus f_h(m_h).
- Decryption.
- Compute m_{h-1} = m_{h+1} \oplus f_h(m_h).
No restrictions are needed to be placed on f for the encryption to work.
- Implementation.
- Encryption only needs to be implemented for one round.
- Decrpytion can reuse encryption code.
New Data Seal (NDS)
- n = 64, h = 16.
- k: \{0, 1\}^8 \to \{0, 1\}^8.
- Total number of keys is 256^{256} = 2^{2048} so exhaustive key search is infeasible.
- Every subkey is k itself. Each round of encryption uses the same key.
Component function f: \{0, 1\}^{64} \to \{0, 1\}^{64} is complicated.
- Divide input into bytes, then into nibbles.
- Left bytes are passed into a fixed function.
- Swaps bits based on k.
- Each bit tells you whether to swap bits between nibbles.
- Permute bits in a known way.
- Performed 16 times consecutively.
Complicated, but is it secure?
- An adversary can easily determine the secret key after obtaining the ciphertexts for about 32000 carefully-chosen plaintexts.
- The attack shows that NDS is (totally) insecure.
- Demonstrates the importance of using different subkeys in the rounds of a Feistel cipher.
Chosen-Plaintext Attack on NDS
Given an encryption oracle with respect to Alice’s secret key, we want to find k.
Observation: Let T denote one round of encryption. Let F represent the full encryption function.
T((m_{i-1}, m_i)) = (m_i, m_{i-1}\oplus f(m_i)) F((m_0, m_1)) = (m_16, m_17)
So F = T^{16}. So T(F(m)) = T(T^{16}(m)) = T^{16}(m)T(m) = F(T(m)). \mathbf{F, T} commute.
- We select r \in \{0, 1\}^8. We determine k(r) as follows. Select u = (m_0, m_1) such that.
- The first bit of every byte in m_1 is r.
- Ensures that the input to k is r for the first round of encryption.
- S_0(n_1^\prime) \neq S_1(n_2^\prime) for each 1 \le j \le 8.
- Allows us to detect whether the nibbles were instructed to swap.
- Obtain F(u) = (a, b) from Alice.
Then T(F(u)) = T((a, b)) = (b, ?). Since F, T commute, we have F(T(u)) = (b, ?).
Let’s make a guess for k(r) = t \in \{0, 1\}^8. We apply one round of encryption to u assuming that k(r) = t. Call the result T_t(u). Obtain the encryption of T_t(u) from Alice, F(T_t(u)) = (c, d).
Now, if indeed k(r) = t, then T_t(u) = T(u), so F(T_t(u)) = F(T(u)). So b = c. On the other hand, if k(r) \neq t, then we claim T_t(u) \neq T(u) by the second constraint on the input.
Assuming now that F behaves like a random permutation (heuristic assumption). Then we can assume that F(T_t(u)) is a random bit string of length 128. In particular, the probability that c = b is roughly \frac{1}{2^{64}} which is negligible. So if b \neq c then k(r) \neq t.
Expected Chosen-Plaintexts: 256(1 + 128) \approx 32000.
DES
Component function f_i: \{0, 1\}^{32} \to \{0, 1\}^{32}.
- Weaknesses are small key length and small block length.
Question: How can one construct a more secure block cipher from DES?
Multiple Encryption. Re-encrypt the ciphertext one or more times using independent keys. Hope that this increases effective key length.
Double Encryption
k = (k_1, k_2) \in_R\{0, 1\}^{56}.
Encryption. c = E_{k_2}(E_{k_1}(m)).
Double-DES key length is l=112, so exhaustive key search is infeasible.
E^{-1}_{k_2}(c) = E_{k_1}(m) which leads to a meet-in-the-middle attack.
Block Cipher Modes of Operation
How should we encrypt blocks?
- ECB. The drawback of encrypting one block at a time is that identical plaintext results in identical ciphertexts.
- Not semantically secure against chosen-plaintext attacks.
- CBC. We compute c_i = E_k(m_i \oplus c_{i-1}). Identical plaintext with different IVs result in different ciphertexts.
AES
Widely used today.
Requirements.
- Key lengths 128, 192, 256.
- Block length 128 bits.
- Efficient on both hardware and software platform.
- Available on worldwide, non-exclusive, royalty-free.
Hash Functions
Takes in a message of arbitrary length and outputs fixed length. H is fixed and public.
- Mapping H such that H: \{0, 1\}^{\le L} \to \{0, 1\}^n.
- H is efficiently computed.
- Preimage Resistance.
- Given a hash value, it is computationally infeasible to find any x \in \{0, 1\}^{\le L} such that H(x) = y.
- x is called a preimage of y.
- 2nd Preimage Resistance.
- Collision Resistance.
- Computationally infeasible to find two distinct inputs x, x^\prime \in \{0, 1\}^{\le L} such that H(x) = H(x^\prime).
To show that a hash property is violated, you need to design efficient algorithms for.
- PR.
- Given y \in_R \{0, 1\}^n, find x \in \{0, 1\}^*, H(x) = y.
- 2PR.
- Given x \in \{0, 1\}^*, find x^\prime \in \{0, 1\}^*, x \neq x^\prime, H(x^\prime) = H(x).
- CR.
- Find x, x^\prime \in \{0, 1\}^*, x \neq x^\prime, H(x) = H(x^\prime).
- CR \Rightarrow 2PR.
Relationships Between Properties
- CR \Rightarrow 2PR.
- 2PR \not\Rightarrow CR.
- CR \not\Rightarrow PR. In general. CR \Rightarrow PR for reasonable hash functions.
Proof-of-Work in Blockchains
Decentralized storage system.
- Readable by everyone.
- Writeable by everyone.
- Unchangeable by anyone.
Mining
Given D_1, D_2, ..., D_t, find a 256-bit string s such that \text{SHA256}(D_1, D_2, ..., D_t, s) begins with 74 zeros. So selecting arbitrary s is expected 2^{74}. Then s is the proof of work.