Tiny Bytes: RSA
tldr RSA works by exploiting the fact we can’t easily factor 2 large prime numbers and group theory to make a trapdoor permutation, aka a function that turns x into y but y can’t easily be turned into x without a secret. However, implementing RSA gets tricky because there’s lots of subtle attacks.MathRSA takes advantage of the group Z^*_{n} (multiplicative group of integers modulo n). This is the non-negative integers less than n that have an inverse modulo n. 1 x 1 mod n = 1. 0 x int = 0 so ...
Tiny Bytes: Chilling
Hi, Just chilling tonight. Aiming to finish up chapter tomorrow. Night, Lucas
Tiny Bytes: Quickie
Hi, Did much more writing on RSA. Will finish soon. Bye, Lucas
Tiny Bytes: RSA
tldr RSA works by exploiting the fact we can’t easily factor 2 large prime numbers and group theory to make a trapdoor permutation, aka a function that turns x into y but y can’t easily be turned into x without a secret. However, implementing RSA gets tricky because there’s lots of subtle attacks.MathRSA takes advantage of the group Z^*_{n} (multiplicative group of integers modulo n). This is the non-negative integers less than n that have an inverse modulo n. 1 x 1 mod n = 1. 0 x int = 0 so ...
Tiny Bytes: Chilling
Hi, Just chilling tonight. Aiming to finish up chapter tomorrow. Night, Lucas
Tiny Bytes: Quickie
Hi, Did much more writing on RSA. Will finish soon. Bye, Lucas
Subscribe to ldnovak
Subscribe to ldnovak
Share Dialog
Share Dialog
<100 subscribers
<100 subscribers
Hi,
Making a summary of the next chapter. This time it’s about Authenticated Encryption.
Basically, we want to send messages that are encrypted and authenticated. That way people can’t listen in to what the message is AND we know that it was sent from the correct person without modification.
There are several ways to do this.
You can use normal encryption algorithms and validate those messages using MACs. There are a few ways to go about this with subtle differences to their security:
Encrypt and MAC (send C + MAC(P) )
MAC then encrypt (send C = Enc(P||MAC(P) ) )
Encrypt then MAC (send C + MAC(C) )
Encrypt and MAC runs both algorithms on the plaintext and sends the encrypted message with a MAC tag generated from the plaintext. To receive and authenticate the message, the recipient decrypts the ciphertext and checks if the plaintext matches the MAC.This approach has 2 flaws. First, the MAC is computed using the plaintext and is sent in the open. If the MAC used isn’t a PRF it will reveal information about the plaintext (remember MACs technically only need to confirm the sender, aka they can’t be forged. There’s no requirement to perfectly hide the message). Of course, you can use a MAC that is a PRF like HMAC-SHA-256. The second problem is that to authenticate the MAC one needs to decrypt the cipher text. This has the potential to expose fraudulent plaintexts to the receiver, as an attack could modify the encrypted message without anyone knowing.
MAC then encrypt makes a MAC using the plaintext and then sends only an encrypted message the contains to plaintext and the MAC. To authenticate, the receiver decrypts the message and then checks the MAC.This approach doesn’t have the first flaw of Encrypt and MAC as MAC(plaintext) is not exposed. However, it shares the same second flaw.
Encrypt then MAC is the last combination. The sender encrypt the message, makes a MAC of the ciphertext, and then sends both. The receiver then authenticates the ciphertext using the MAC and, if valid, decrypts the message.This approach doesn’t have the same weaknesses of the other methods.
Despite having flaws, all three of these methods are in use. The flaws are more theoretical than weaknesses that have been abused in the wild and require patching.
Another way to send authenticated encryption is to use an Authenticated Cipher. Instead of using 2 functions together, just use 1. AE(K,P) = (C, Tag) and AD(K, C, Tag) = P. TLS 1.3 switched to authenticated ciphers. In general, they’re recommended to use as they make implementing authenticated encryption simpler. However, they are really tricky to build.
You can also modify AE to include associated data, data you want to send in plaintext with authentication (e.g., header information about who the message should be sent to). AEAD(K, P, A) = (C, A, T) and ADAD(K, C, A, T) = (P, A). One wrinkle here, we need to add a nonce. If we repeatedly encrypt a message it should return a different ciphertext (aka unpredictable). Else, we can learn if someone encrypted the same message twice. To prevent this you can add a nonce. AEAD(K, P, A, N) = (C, A, T) and ADAD(K, C, A, T, N) = (P, A).
Security
Authenticated ciphers clearly need to protect the confidentiality and authenticity of the message. This means that if you ignore the auth tag, the cipher is as good as any other encryption scheme. Same if you ignore the cipher, the auth tag has to be as good as any authentication scheme.
Because authenticated ciphers use a nonce, we also can measure their misuse resistance (aka how much info of the plaintext leaks from the ciphertext when a nonce is reused -- another way that uses more specific jargon: how much a repeated nonce will impact the cipher’s authenticity and confidentiality).
Performance
We want the authenticated cipher to be performant. Because these ciphers are complex and require many operations, authenticated ciphers tend to be slow.
There tend to be two types of authenticated ciphers: 1-layer and 2-layer. 1-layer has 1 algorithm that deals with making the ciphertext and the tag. In 2-layer, 2 different algorithms deal with making the ciphertext and tag separately. 2-layers are more complicated and tend to be slower.
Of course, how quickly a single block isn’t the whole story. Different ciphers have different degrees they can be parallelized. If blocks can all be process independently, they can easily be parallelized (CTR cipher). This is not true if block 1 has to happen before block 2 and so on (CBC mode).
There are other important attributes to consider. A streamable or online cipher can process a block of data without storing information about other blocks. Non-streamable usually have to store the entire message so they can make 2 passes over the data. This criteria is especially important when running on low memory machines, such as on a router.
Functional (other)
There’s other functionality besides security and performance to consider. For example, whether the associated data is processed before or after the encrypted data or either. (Ideally it’s better to have either so the use case can dictate). Of course, simplicity of design is another as more features mean more complexity and more chances for bugs.
Another example is whether encryption and decryption use the same underlying algorithm (example of encryption algs: CBC block cipher mode requires 2, AES uses 2 similar algorithms, CTR only requires encryption algorithm). This is important on low-cost hardware (cost in logic gates), as more algorithms mean more gates to implement.
The most widely used authenticated cipher is AES-GCM. The name comes from the fact that it uses AES and GCM (Galois counter mode -- tweak on CTR mode to make authentication tag). It’s a NIST standard (and I’m not sure if any others are).
An interesting fact is that while GCM will work with any block cipher, it’s nearly always used with AES. Both are American. If you don’t trust AES because it’s America, you won’t trust GCM for the same reason.
The tag is generated off of encrypted data.
High Level How it works
CTR the plain text to get cipher texts:
AES_k(nonce || 1) xor P1 = C1, AES_k(nonce || 2) xor P2 = C2, …
XOR the data blocks with GHASH and the cipher texts, length, and Hash key to generate auth tag:
T = GHASH(H, C^) xor AES(K, N|| 0) ^ includes authenticated data
What’s important to point out
Hash key is AES(K, o). This key is used for GHASH(H, C) so if GHASH’s key is compromised, the master key K is still safe.
GHASH uses polynomial notation to multiply ciphertexts. This helps with performance via hardware and software optimizations for polynomial optimization. However, GHASH’s speed is still slow, partially because it isn’t parallelizable. Unlike CTR mode to make the cipher, which is parallelizable and is much faster to make than GHASH’s tag.
GHASH is also very complex and tricky to implement correctly.
GHASH is fragile to nonce repetition. If the same nonce is used twice, H can be forged.
GHASH is streamable.
OCB stands for offset codebook and predates GCM. However, until 2013 you needed a license from its creator Phil Rogaway to use it. Now it can be used for free for nonmilitary software implementations.
The tag is generated off of plaintext data (note there is nothing wrong with this for this approach. It is secure).
How it works
XOR plaintext with an offset, encrypt that value, XOR it with the offset again. Boom, ciphertext.
E(k, P1 xor O1) xor O1 = C1
The tag is similar
E(k, P1 xor P2 … xor O*) = T
with associated data A1, A2
E(K, P1 xor P2 … xor O*) xor E(k, A1 xor O1) xor E(k, A2 xor O2)…
The offsets O are calculated via key and nonce.
OCB is less fragile than GCM for repeated nonces, but it still breaks the authenticity of OCB. An attacker would notice repeated ciphertexts between two messages and create another encrypted message with the same checksum and tags as one of those messages. The would not get the key, unlike GCM.
OCB is marginally faster than GCM. OCB is parallelizable and streamable. OCB and GCM make about the same number of calls the block ciphers, but OCB calls it on plaintext rather than GHASH. This used to be a bigger problem when CPU resources were more limited.
OCM needs both the block ciphers’ encryption and decryption functions to encrypt and decrypt, increasing the hardware footprint. GCM only needs the encryption function.
Synthetic IV or SIV is an auth cipher mode, usually used with AES, that is secure even if you use the same nonce twice. Repeated use will only let others know if the same plaintext was encrypted twice.
SIV’s instructions are more general and tell you how to make it out of a cipher (E) and PRF.
The biggest issue is SIV is not streamable.
You can also build permutation-based ciphers. Instead of building on AES, you can permute the inputs; transforming them into a random looking output that’s a similar size as the original. It can be transformed again back into the original. It is fast, secure, and more resistant to nonce reuse than GCM and OCB (an attacker would only learn whether the 2 messages began with the same value and the length of that prefix).
Basically, you take an initial value, xor it with K || Nonce, permute that, and xor with P1 to get C1. Now you repeatedly permute and xor Pi to get Ci and the tag.
There’s lots of trickiness in implementing the permutation. Don’t implement it yourself.
They are streamable and use a single algorithm for encryption and decryption; however, they aren’t parallelizable.
Because authenticated ciphers need to maintain confidentiality and authenticity with a variety of inputs, key lengths, and associated data, there are lots of ways things can go wrong.
GCM can have a weak Hash key.
H = AES(K, o). For certain values this key is bad. H=1 and H=0 make the xor trivial to understand. If H belongs to a short cycle (H^i = H where i is small), this reduces the strength of the key.
GCM Small Tags
If you use a small tag with a large message, the tags becomes much weaker and easier to forge. (instead of 1/2^n it is 2^blocks of the longest messsage /2^n).
Ayy finally done. Lots of info. This helped me learn.
Bye,
Lucas
Hi,
Making a summary of the next chapter. This time it’s about Authenticated Encryption.
Basically, we want to send messages that are encrypted and authenticated. That way people can’t listen in to what the message is AND we know that it was sent from the correct person without modification.
There are several ways to do this.
You can use normal encryption algorithms and validate those messages using MACs. There are a few ways to go about this with subtle differences to their security:
Encrypt and MAC (send C + MAC(P) )
MAC then encrypt (send C = Enc(P||MAC(P) ) )
Encrypt then MAC (send C + MAC(C) )
Encrypt and MAC runs both algorithms on the plaintext and sends the encrypted message with a MAC tag generated from the plaintext. To receive and authenticate the message, the recipient decrypts the ciphertext and checks if the plaintext matches the MAC.This approach has 2 flaws. First, the MAC is computed using the plaintext and is sent in the open. If the MAC used isn’t a PRF it will reveal information about the plaintext (remember MACs technically only need to confirm the sender, aka they can’t be forged. There’s no requirement to perfectly hide the message). Of course, you can use a MAC that is a PRF like HMAC-SHA-256. The second problem is that to authenticate the MAC one needs to decrypt the cipher text. This has the potential to expose fraudulent plaintexts to the receiver, as an attack could modify the encrypted message without anyone knowing.
MAC then encrypt makes a MAC using the plaintext and then sends only an encrypted message the contains to plaintext and the MAC. To authenticate, the receiver decrypts the message and then checks the MAC.This approach doesn’t have the first flaw of Encrypt and MAC as MAC(plaintext) is not exposed. However, it shares the same second flaw.
Encrypt then MAC is the last combination. The sender encrypt the message, makes a MAC of the ciphertext, and then sends both. The receiver then authenticates the ciphertext using the MAC and, if valid, decrypts the message.This approach doesn’t have the same weaknesses of the other methods.
Despite having flaws, all three of these methods are in use. The flaws are more theoretical than weaknesses that have been abused in the wild and require patching.
Another way to send authenticated encryption is to use an Authenticated Cipher. Instead of using 2 functions together, just use 1. AE(K,P) = (C, Tag) and AD(K, C, Tag) = P. TLS 1.3 switched to authenticated ciphers. In general, they’re recommended to use as they make implementing authenticated encryption simpler. However, they are really tricky to build.
You can also modify AE to include associated data, data you want to send in plaintext with authentication (e.g., header information about who the message should be sent to). AEAD(K, P, A) = (C, A, T) and ADAD(K, C, A, T) = (P, A). One wrinkle here, we need to add a nonce. If we repeatedly encrypt a message it should return a different ciphertext (aka unpredictable). Else, we can learn if someone encrypted the same message twice. To prevent this you can add a nonce. AEAD(K, P, A, N) = (C, A, T) and ADAD(K, C, A, T, N) = (P, A).
Security
Authenticated ciphers clearly need to protect the confidentiality and authenticity of the message. This means that if you ignore the auth tag, the cipher is as good as any other encryption scheme. Same if you ignore the cipher, the auth tag has to be as good as any authentication scheme.
Because authenticated ciphers use a nonce, we also can measure their misuse resistance (aka how much info of the plaintext leaks from the ciphertext when a nonce is reused -- another way that uses more specific jargon: how much a repeated nonce will impact the cipher’s authenticity and confidentiality).
Performance
We want the authenticated cipher to be performant. Because these ciphers are complex and require many operations, authenticated ciphers tend to be slow.
There tend to be two types of authenticated ciphers: 1-layer and 2-layer. 1-layer has 1 algorithm that deals with making the ciphertext and the tag. In 2-layer, 2 different algorithms deal with making the ciphertext and tag separately. 2-layers are more complicated and tend to be slower.
Of course, how quickly a single block isn’t the whole story. Different ciphers have different degrees they can be parallelized. If blocks can all be process independently, they can easily be parallelized (CTR cipher). This is not true if block 1 has to happen before block 2 and so on (CBC mode).
There are other important attributes to consider. A streamable or online cipher can process a block of data without storing information about other blocks. Non-streamable usually have to store the entire message so they can make 2 passes over the data. This criteria is especially important when running on low memory machines, such as on a router.
Functional (other)
There’s other functionality besides security and performance to consider. For example, whether the associated data is processed before or after the encrypted data or either. (Ideally it’s better to have either so the use case can dictate). Of course, simplicity of design is another as more features mean more complexity and more chances for bugs.
Another example is whether encryption and decryption use the same underlying algorithm (example of encryption algs: CBC block cipher mode requires 2, AES uses 2 similar algorithms, CTR only requires encryption algorithm). This is important on low-cost hardware (cost in logic gates), as more algorithms mean more gates to implement.
The most widely used authenticated cipher is AES-GCM. The name comes from the fact that it uses AES and GCM (Galois counter mode -- tweak on CTR mode to make authentication tag). It’s a NIST standard (and I’m not sure if any others are).
An interesting fact is that while GCM will work with any block cipher, it’s nearly always used with AES. Both are American. If you don’t trust AES because it’s America, you won’t trust GCM for the same reason.
The tag is generated off of encrypted data.
High Level How it works
CTR the plain text to get cipher texts:
AES_k(nonce || 1) xor P1 = C1, AES_k(nonce || 2) xor P2 = C2, …
XOR the data blocks with GHASH and the cipher texts, length, and Hash key to generate auth tag:
T = GHASH(H, C^) xor AES(K, N|| 0) ^ includes authenticated data
What’s important to point out
Hash key is AES(K, o). This key is used for GHASH(H, C) so if GHASH’s key is compromised, the master key K is still safe.
GHASH uses polynomial notation to multiply ciphertexts. This helps with performance via hardware and software optimizations for polynomial optimization. However, GHASH’s speed is still slow, partially because it isn’t parallelizable. Unlike CTR mode to make the cipher, which is parallelizable and is much faster to make than GHASH’s tag.
GHASH is also very complex and tricky to implement correctly.
GHASH is fragile to nonce repetition. If the same nonce is used twice, H can be forged.
GHASH is streamable.
OCB stands for offset codebook and predates GCM. However, until 2013 you needed a license from its creator Phil Rogaway to use it. Now it can be used for free for nonmilitary software implementations.
The tag is generated off of plaintext data (note there is nothing wrong with this for this approach. It is secure).
How it works
XOR plaintext with an offset, encrypt that value, XOR it with the offset again. Boom, ciphertext.
E(k, P1 xor O1) xor O1 = C1
The tag is similar
E(k, P1 xor P2 … xor O*) = T
with associated data A1, A2
E(K, P1 xor P2 … xor O*) xor E(k, A1 xor O1) xor E(k, A2 xor O2)…
The offsets O are calculated via key and nonce.
OCB is less fragile than GCM for repeated nonces, but it still breaks the authenticity of OCB. An attacker would notice repeated ciphertexts between two messages and create another encrypted message with the same checksum and tags as one of those messages. The would not get the key, unlike GCM.
OCB is marginally faster than GCM. OCB is parallelizable and streamable. OCB and GCM make about the same number of calls the block ciphers, but OCB calls it on plaintext rather than GHASH. This used to be a bigger problem when CPU resources were more limited.
OCM needs both the block ciphers’ encryption and decryption functions to encrypt and decrypt, increasing the hardware footprint. GCM only needs the encryption function.
Synthetic IV or SIV is an auth cipher mode, usually used with AES, that is secure even if you use the same nonce twice. Repeated use will only let others know if the same plaintext was encrypted twice.
SIV’s instructions are more general and tell you how to make it out of a cipher (E) and PRF.
The biggest issue is SIV is not streamable.
You can also build permutation-based ciphers. Instead of building on AES, you can permute the inputs; transforming them into a random looking output that’s a similar size as the original. It can be transformed again back into the original. It is fast, secure, and more resistant to nonce reuse than GCM and OCB (an attacker would only learn whether the 2 messages began with the same value and the length of that prefix).
Basically, you take an initial value, xor it with K || Nonce, permute that, and xor with P1 to get C1. Now you repeatedly permute and xor Pi to get Ci and the tag.
There’s lots of trickiness in implementing the permutation. Don’t implement it yourself.
They are streamable and use a single algorithm for encryption and decryption; however, they aren’t parallelizable.
Because authenticated ciphers need to maintain confidentiality and authenticity with a variety of inputs, key lengths, and associated data, there are lots of ways things can go wrong.
GCM can have a weak Hash key.
H = AES(K, o). For certain values this key is bad. H=1 and H=0 make the xor trivial to understand. If H belongs to a short cycle (H^i = H where i is small), this reduces the strength of the key.
GCM Small Tags
If you use a small tag with a large message, the tags becomes much weaker and easier to forge. (instead of 1/2^n it is 2^blocks of the longest messsage /2^n).
Ayy finally done. Lots of info. This helped me learn.
Bye,
Lucas
No activity yet