August 7th, 2008

Deniable file systems

Bruce Schneier discusses the concept of ‘deniable file systems’ here. Perhaps all too relevant in today’s era. Read on for my own explanation on how to make one. (Warning: It’s real tricky).

Incidentally, that’s a good example of what I mean with: Security Breaks occur almost always due to a brainfart in protocol design, almost never in algorithms like Rijndael or RSA.


The goal of a deniable file system is simple: Store information that, without the relevant password(s)/keyfiles is completely and utterly invisible, and it is impossible to know that there is even hidden data.

In practice this is a bit convoluted to implement but it IS possible provided you have TONS of disk space.

For this file system, we’ll work with basic blocks (We split up the HDD into a number of blocks and maintain index tables that keep track of what block(s) contain which file(s) - virtually all filesystems work like this one way or another, including FAT and EXT2/3). We’ll also add a simple ubiquitous feature to our file system: Anytime you delete a file, we securely erase it by overwriting the involved sectors with random bits a couple of times instead of just indexing the relevant blocks as being ‘free’ without deleting the actual data.

It’s also very important to realize that encrypted data is indistinguishable from random data provided the key is not known.

Now, let me first explain ‘layer 0′:

Layer 0 is the entire harddrive and is unencrypted. The index table is at the front of the HDD. When the secure drive is ‘initialized’ - this is the only layer there is. It cannot be encrypted either, at least not on the file-system level. Let’s say that we have a 1000 GB harddrive formatted with this layer 0 stuff.

Layer 1: We now write a tool which creates a new virtual harddrive. Let’s say we make it 100GB in size. We allocate the blocks needed to store the 100GB of data for this new ‘virtual harddrive’ out of the blocks of layer0, but strewn randomly across layer 0’s space, instead of all in contiguous blocks. Of course, this new harddrive will need its own index table, so we write it to some random block as well. Anytime a block of data is written to this layer1 drive, we first generate a random key, then encrypt the block with this key, then write the encrypted data to the block, then store the relevant filename and such, plus the random key, in the index. Then we encrypt the entire index, and store the key for this in a keyfile, including the offset where the index can be found. The user will have to keep this key safe.

PROBLEM #1: Layer0 at least knows where the blocks of layer1 are, to avoid overwriting them. Clearly, this knowledge alone means you can’t plausibly deny that there’s nothing there.

SOLUTION #1: We simply don’t store which blocks are occupied by layer1 on layer0. To layer0, we make layer1 blocks indistinguishable from empty blocks, and in that way it’s impossible to figure it out without having the offset to layer1’s index block + the key to decrypt it.

PROBLEM #2: But if layer0 doesn’t know, then it can overwrite data!

SOLUTION #2: There is no true solution, but we can of course make redundant copies of each block on layer1. Let’s say we duplicate each block twice for a total of 3 blocks each on layer0 per block of layer1. Anytime layer1 is ‘unlocked’, we check for blocks which have gotten some of its copies overwritten, and re-establish this 3-block redundancy for all of them. Because losing the index table is equivalent to losing all our data, we duplicate it at least 10 times to reduce the risk that it gets trashed.

PROBLEM #3: Well, that takes loads of disk space. True, and there is absolutely no solution to this whatsoever.

PROBLEM #4: But, we are using the deniable file system software, so even if you can’t identify any ’secret blocks’, the mere fact that I’m using the file system is practically giving away that I have a hidden key with hidden data!

SOLUTION #4: This is true, so you throw some scantily clad females on the layer1 drive, and create a layer2 on the layer1 drive. When pressured you can ‘give up’ the first key, and hope that they are satisified that you were simply trying to hide some smut. If you don’t think this will be plausible, you create a bunch more layers, each layer more incriminating and personal than the previous. At some point you can simply claim that you’ve provided all that there is. This recursion is where the true deniability comes from: Yes, there are obviously hidden virtual drives, but no one knows how many. You can plausibly name any reasonable number and there is no way to doubt you short of torturing or mindreading it out of you or some such.

PROBLEM #5: The indices are duped 10 times, so they all look the same. The presence of 10 equal blocks in space marked as ‘empty’ is a dead give-away of the existence of another layer!

SOLUTION #5: True, so, each index block starts with a 128-bit random number. This number serves as Initial Vector for the encryption process ensuring that each block looks completely different.

PROBLEM #6: You have a secret key AND at least 10 indices pointing out where the index blocks are hidden for EACH layer. You’d have to store this information somewhere, and the existence of this information itself ruins any plausible deniability.

SOLUTION #6: There are some solutions. First of all, the encryption key to the index can be ‘generated’ by the user, by way of some sort of secret passphrase. As long as the user remembers this passphrase, the data is just in his head. Of course, each layer needs its own passphrase or your security is pointless. That leaves the issue of the 10 indices. The computer can always just try to ‘decrypt’ every single block on the entire 1000GB harddrive (I suggest a cup of coffee. A really big one. That’s going to take a while) and find those which look like a valid index block post decryption. This means we can feasibly lose our information from time to time without losing any data - it just costs a little time to ‘refind’ the indices. Now we have some options:

#6A: Store the indices on a USB stick with an auto-destruct feature.
#6B: Let the computer generate some sort of sentence which you can remember, and which can be reversed into the 10 indices (or at least 5 indices or some such. If those 5 get trashed, you can always go for the ’scan-whole-drive’ option).
#6C: You can design the system so that it always puts index keys on, say, blocks whose index is divisible by 25 on the host layer. That reduces the ’search time’ from looking through 1000GB of data to just 40GB which should still take an hour, but that’s much more manageable. Going too far with this might pose a security risk because it also makes it a lot easier for any potential hackers to look for breaks. Still, if this is acceptable you can increase the division factor all the way to, say, a 1000. Reading in and analysing 1 GB of data usually doesn’t take much more than 5 minutes on a fast HDD. Then you can leave out remembering the indices altogether.

PROBLEM #7: Long use of the HDD without unlocking all levels from time to time might lead to data destruction, but unlocking a level is risky - if someone is looking at that moment or if someone has bugged/hacked your computer, you’re toast.

SOLUTION #7: You can always keep a backup around. Of course this has to be a copy of the first drive (so that means buying yet another huge HDD). Another solution is to make a heck of a lot of dupes for the infrequently used layers, and to make sure you never use more than about 20% of any layer’s full capacity. That way your risk of wasting a layer should be low. Then again, wasting one layer automatically means you lose the data on all layers contained inside it - so the most secret layer is also the most easily accidentally corrupted. You’ll just have to live with this.

PROBLEM #8: Just because I logged out of a layer doesn’t mean the memory is fresh - perhaps the key lingers there.

SOLUTION #8: A common security pitfall. One solution is to have the kernel itself be aware of the file system (or allow it to hook into its memory management) allowing it to make certain that no copy of any information remains in memory regarding the layer you just logged out of. While in linux, the file system code is already running inside the kernel itself with all privileges, on microkernel architectures like MACH (used by Apple’s os x amongst other things), this requires some breaking of the structure of the kernel to implement properly. Can’t be helped. It does mean that this kind of thing is very difficult to do as a separate ‘user-space’ program. It really needs to be baked right into the core operating system.

This system works best of your secret data you’d like to plausibly deny is small. Trucking away your whole stash of MP3s is not very viable.

Leave a Response

(Note: if you use a new name from an unknown ip address, your comment won't appear until I approve it. Anti-spam measure only, I don't censor).

Imhotep theme designed by Chris Lin. Proudly powered by Wordpress.
XHTML | CSS | RSS | Comments RSS