Encryption and Leaking Information
20 Nov 2022 | 2 minutes to read

Things can be encrypted, but encryption can be done in many different ways, with varying effectiveness. Some methods preserve things like filenames and hashes while only the raw data of the file is encrypted. The files are still “encrypted”, but metadata about them can often still be recovered. Encryption can secure a message being communicated, but it is only one piece of the puzzle. Even when the content of the message is secure, a variety of other information is generated around it that can reveal additional details.
The Enigma machine was cracked in part because the weather was communicated each day with the same header, opening it to a partial known-plaintext attack. Enigma enciphered messages letter by letter, meaning repeated strings in the plaintext could be exploited to learn clues about the encryption settings. Modern algorithms usually encrypt data in blocks, which, used naively, just moves the goalposts from analyzing individual letters to analyzing single blocks of data.
A perfect encryption algorithm or method (hopefully) ensures the plaintext is secure, but other information can be derived from how the message is communicated. Two things in particular create metadata that is not usually considered.
Size
The size of the message is important, even if the encryption method uses padding, since cipher padding typically only rounds the length up to the next block. A long ciphertext most likely carries more text than a short one, and that alone can leak information about the content of the message.
Because of this, all messages sent between two parties should be padded to the same size, so there is no discernible difference between the ciphertexts of “A OK” and “Danger, run away”. Ideally that size also matches other messages being sent by the system, so the traffic blends in.
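As a rough illustration of fixed-size padding, here is a minimal Python sketch. The 256-byte frame size, the 2-byte length prefix, and the names `pad` and `unpad` are assumptions made for this example, not part of any particular protocol. The padding is applied to the plaintext before encryption, and the encryption step itself is left out.

```python
import struct

FRAME_SIZE = 256              # hypothetical fixed frame size, in bytes
HEADER = struct.Struct(">H")  # hypothetical 2-byte big-endian length prefix


def pad(message: bytes, frame_size: int = FRAME_SIZE) -> bytes:
    """Return the message wrapped in a frame of exactly frame_size bytes."""
    max_len = frame_size - HEADER.size
    if len(message) > max_len:
        raise ValueError(f"message longer than {max_len} bytes")
    # Length prefix, then the message, then zero bytes up to the frame size.
    # The zeros are hidden once the whole frame is encrypted.
    return HEADER.pack(len(message)) + message + bytes(max_len - len(message))


def unpad(frame: bytes) -> bytes:
    """Recover the original message from a decrypted frame."""
    if len(frame) < HEADER.size:
        raise ValueError("frame too short")
    (length,) = HEADER.unpack_from(frame)
    if length > len(frame) - HEADER.size:
        raise ValueError("corrupt length prefix")
    return frame[HEADER.size:HEADER.size + length]


short = pad(b"A OK")
long_ = pad(b"Danger, run away")
print(len(short), len(long_))  # both frames have the same length
```

Because both frames are the same length before encryption, a cipher that maps equal-length inputs to equal-length outputs will produce ciphertexts of the same size. Messages longer than one frame would need to be split across several frames, which this sketch does not attempt.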
Time
The time a message is sent can be linked to other events and leak information. If a message is always sent before a separate event takes place, it’s not too much to expect an attacker to link the two together. Since it’s not possible to control when a message needs to be sent, the trick is to send messages constantly.
When a real message is sent, the ciphertext decrypts to something legible (padded to a standard size). If the user has no message to send, random garbage is sent instead. This stops an attacker from telling whether a real or fake message is being sent.
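The constant-sending idea can be sketched as a loop that emits exactly one fixed-size frame per time slot. Everything named here is hypothetical: `FRAME_SIZE`, `next_frame`, `run`, and the `encrypt` and `send` callables are stand-ins supplied by the caller. The sketch assumes that `encrypt` turns an already padded message into exactly `FRAME_SIZE` bytes that look uniformly random, and that the receiver can discard garbage frames (for example, because they fail authentication).

```python
import os
import time
from collections import deque
from typing import Callable, Deque

FRAME_SIZE = 256  # hypothetical size of every frame on the wire, in bytes


def next_frame(outbox: Deque[bytes], encrypt: Callable[[bytes], bytes]) -> bytes:
    """Return one frame for the current slot: a real message or random cover."""
    if outbox:
        frame = encrypt(outbox.popleft())
        if len(frame) != FRAME_SIZE:
            raise ValueError("encrypt must produce exactly FRAME_SIZE bytes")
        return frame
    # Nothing to say: send random bytes of the same size instead.
    return os.urandom(FRAME_SIZE)


def run(send: Callable[[bytes], None], outbox: Deque[bytes],
        encrypt: Callable[[bytes], bytes],
        interval_s: float = 1.0, slots: int = 10) -> None:
    """Send one frame per slot, whether or not a real message is waiting."""
    for _ in range(slots):
        send(next_frame(outbox, encrypt))
        time.sleep(interval_s)
```

A fixed interval is the simplest schedule. The delay between slots could instead be drawn from a random distribution, as long as it does not depend on whether a real message is waiting.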
Further Reading
These cover similar ideas on timing, but use an exponential distribution of communication events:
https://web.archive.org/web/20150917091955/https://pond.imperialviolet.org/tech.html
https://www.scriptjunkie.us/2021/09/covert-credit-calculation-communications/