Using network bound disk encryption with Stratis

Photo by iMattSmart on Unsplash

In an environment with many encrypted disks, unlocking them all is a difficult task. Network bound disk encryption (NBDE) helps automate the process of unlocking Stratis volumes, which is a critical requirement in large environments. Stratis version 2.1 added support for encryption, which was introduced in the article “Getting started with Stratis encryption.” Stratis version 2.3 added NBDE support for encrypted Stratis pools, which is the topic of this article.

The Stratis website describes Stratis as an “easy to use local storage management for Linux.” The short video “Managing Storage With Stratis” gives a quick demonstration of the basics. The video was recorded on a Red Hat Enterprise Linux 8 system; however, the concepts shown in the video also apply to Stratis in Fedora Linux.

Prerequisites

This article assumes you are familiar with Stratis and with Stratis pool encryption. If you aren’t familiar with these topics, refer to the article “Getting started with Stratis encryption” and the Stratis overview video mentioned previously.

NBDE requires Stratis 2.3 or later. The examples in this article use a pre-release version of Fedora Linux 34. The Fedora Linux 34 final release will include Stratis 2.3.

Overview of network bound disk encryption (NBDE)

One of the main challenges of encrypting storage is having a secure method to unlock the storage again after a system reboot. In large environments, typing in the encryption passphrase manually doesn’t scale well. NBDE addresses this and allows for encrypted storage to be unlocked in an automated manner.

At a high level, NBDE requires a Tang server in the environment. Client systems (using Clevis Pin) can automatically decrypt storage as long as they can establish a network connection to the Tang server. If there is no network connectivity to the Tang server, the storage would have to be decrypted manually.

The idea behind this is that the Tang server is only available on an internal network. If the encrypted device is lost or stolen, it can no longer reach the internal network to connect to the Tang server, and therefore is not automatically decrypted.

For more information on Tang and Clevis, see the man pages (man tang, man clevis), the Tang GitHub page, and the Clevis GitHub page.
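
As a rough illustration of the connectivity requirement, the following sketch checks whether a client can reach a Tang server. It assumes curl is installed and that the server publishes its key advertisement at the /adv path. The function name tang_reachable is hypothetical, and tang-server is the hostname used later in this article.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the Tang advertisement can be fetched.
# Assumes curl is installed; the argument is the Tang base URL,
# for example http://tang-server
tang_reachable() {
    curl --silent --fail --max-time 5 --output /dev/null "$1/adv"
}

# Example usage:
# tang_reachable http://tang-server && echo "Tang server is reachable"
```

If the check fails, automatic unlocking is unlikely to work either, and the passphrase would be needed instead.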

Setting up the Tang server

This example uses another Fedora Linux system as the Tang server with a hostname of tang-server. Start by installing the tang package:

dnf install tang

Then enable and start the tangd.socket with systemctl:

systemctl enable tangd.socket --now

Tang uses TCP port 80, so you also need to open that in the firewall:

firewall-cmd --add-port=80/tcp --permanent
firewall-cmd --add-port=80/tcp

Finally, run tang-show-keys to display the output signing key thumbprint. You’ll need this later.

# tang-show-keys
l3fZGUCmnvKQF_OA6VZF9jf8z2s
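
Before passing the thumbprint to a client, a quick format check can catch copy-and-paste mistakes. The sketch below is not part of Tang or Stratis: is_tang_thumbprint is a hypothetical helper, and it assumes thumbprints are unpadded base64url strings of 27 characters (SHA-1, as shown in this article) or 43 characters (SHA-256, which newer Tang versions may print).

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the argument looks like a key thumbprint.
# Assumption: unpadded base64url, 27 characters (SHA-1) or 43 (SHA-256).
is_tang_thumbprint() {
    case "$1" in
        *[!A-Za-z0-9_-]*) return 1 ;;   # reject characters outside base64url
    esac
    [ "${#1}" -eq 27 ] || [ "${#1}" -eq 43 ]
}

# Example usage:
# is_tang_thumbprint l3fZGUCmnvKQF_OA6VZF9jf8z2s && echo "looks valid"
```

This only checks the shape of the string; it does not confirm that the thumbprint belongs to your Tang server.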

Creating the encrypted Stratis Pool

The previous article on Stratis encryption goes over how to set up an encrypted Stratis pool in detail, so this article won’t cover that in depth.

The first step is capturing a key that will be used to decrypt the Stratis pool. Even when using NBDE, you need to set this, as it can be used to manually unlock the pool in the event that the NBDE server is unreachable. Capture the pool1 key with the following command:

# stratis key set --capture-key pool1key
Enter key data followed by the return key:

Then create an encrypted Stratis pool named pool1 (using the pool1key just captured) on the /dev/vdb device:

# stratis pool create --key-desc pool1key pool1 /dev/vdb

Next, create a filesystem in this Stratis pool named filesystem1, create a mount point, mount the filesystem, and create a testfile in it:

# stratis filesystem create pool1 filesystem1
# mkdir /filesystem1
# mount /dev/stratis/pool1/filesystem1 /filesystem1
# cd /filesystem1
# echo "this is a test file" > testfile

Binding the Stratis pool to the Tang server

At this point, you have created the encrypted Stratis pool and a filesystem in the pool. The next step is to bind your Stratis pool to the Tang server that you just set up. Do this with the stratis pool bind nbde command.

When you make the Tang binding, you need to pass several parameters to the command:

  • the pool name (in this example, pool1)
  • the key descriptor name (in this example, pool1key)
  • the Tang server name (in this example, http://tang-server)

Recall that on the Tang server, you previously ran tang-show-keys which showed the Tang output signing key thumbprint is l3fZGUCmnvKQF_OA6VZF9jf8z2s. In addition to the previous parameters, you either need to pass this thumbprint with the parameter --thumbprint l3fZGUCmnvKQF_OA6VZF9jf8z2s, or skip the verification of the thumbprint with the --trust-url parameter.

It is more secure to use the --thumbprint parameter. For example:

# stratis pool bind nbde pool1 pool1key http://tang-server --thumbprint l3fZGUCmnvKQF_OA6VZF9jf8z2s

Unlocking the Stratis Pool with NBDE

Next, reboot the host and validate that you can unlock the Stratis pool with NBDE without using the key passphrase. After rebooting the host, the pool is no longer available:

# stratis pool list
Name Total Physical Properties

To unlock the pool using NBDE, run the following command:

# stratis pool unlock clevis

Note that you did not need to use the key passphrase. This command could be automated to run during the system boot up.

At this point, the pool is now available:

# stratis pool list
Name Total Physical Properties
pool1 4.98 GiB / 583.65 MiB / 4.41 GiB ~Ca, Cr

You can mount the filesystem and access the file that was previously created:

# mount /dev/stratis/pool1/filesystem1 /filesystem1/
# cat /filesystem1/testfile
this is a test file
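
The unlock command above could be automated at boot, but the network may not be up the first time it runs. One possible approach is a small retry wrapper. The function name retry and the attempt count and delay in the example are assumptions for illustration, not Stratis features.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...
# Run COMMAND until it succeeds or ATTEMPTS tries have been made,
# sleeping DELAY seconds between tries.
retry() {
    attempts=$1 delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example (assumed values): keep trying for about a minute.
# retry 12 5 stratis pool unlock clevis
```

A script like this could be called from whatever boot-time mechanism you prefer; if every attempt fails, it returns a non-zero status so the failure is visible.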

Rotating Tang server keys

Best practices recommend that you periodically rotate the Tang server keys and update the Stratis client servers to use the new Tang keys.

To generate new Tang keys, start by logging in to your Tang server and look at the current status of the /var/db/tang directory. Then, run the tang-show-keys command:

# ls -al /var/db/tang
total 8
drwx------. 1 tang tang 124 Mar 15 15:51 .
drwxr-xr-x. 1 root root 16 Mar 15 15:48 ..
-rw-r--r--. 1 tang tang 361 Mar 15 15:51 hbjJEDXy8G8wynMPqiq8F47nJwo.jwk
-rw-r--r--. 1 tang tang 367 Mar 15 15:51 l3fZGUCmnvKQF_OA6VZF9jf8z2s.jwk
# tang-show-keys
l3fZGUCmnvKQF_OA6VZF9jf8z2s

To generate new keys, run tangd-keygen and point it to the /var/db/tang directory:

# /usr/libexec/tangd-keygen /var/db/tang

If you look at the /var/db/tang directory again, you will see two new files:

# ls -al /var/db/tang
total 16
drwx------. 1 tang tang 248 Mar 22 10:41 .
drwxr-xr-x. 1 root root 16 Mar 15 15:48 ..
-rw-r--r--. 1 tang tang 361 Mar 15 15:51 hbjJEDXy8G8wynMPqiq8F47nJwo.jwk
-rw-r--r--. 1 root root 354 Mar 22 10:41 iyG5HcF01zaPjaGY6L_3WaslJ_E.jwk
-rw-r--r--. 1 root root 349 Mar 22 10:41 jHxerkqARY1Ww_H_8YjQVZ5OHao.jwk
-rw-r--r--. 1 tang tang 367 Mar 15 15:51 l3fZGUCmnvKQF_OA6VZF9jf8z2s.jwk

And if you run tang-show-keys, it will show the keys being advertised by Tang:

# tang-show-keys
l3fZGUCmnvKQF_OA6VZF9jf8z2s
iyG5HcF01zaPjaGY6L_3WaslJ_E

You can prevent the old key (starting with l3fZ) from being advertised by renaming the two original files to be hidden files, starting with a period. With this method, the old key is no longer advertised; however, it is still usable by any existing clients that haven’t been updated to use the new key. Once all clients have been updated to use the new key, these old key files can be deleted.

# cd /var/db/tang
# mv hbjJEDXy8G8wynMPqiq8F47nJwo.jwk   .hbjJEDXy8G8wynMPqiq8F47nJwo.jwk
# mv l3fZGUCmnvKQF_OA6VZF9jf8z2s.jwk   .l3fZGUCmnvKQF_OA6VZF9jf8z2s.jwk

At this point, if you run tang-show-keys again, only the new key is being advertised by Tang:

# tang-show-keys
iyG5HcF01zaPjaGY6L_3WaslJ_E
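
When several keys need to be hidden, the rename step can be wrapped in a small function. This is a sketch only: hide_tang_keys is a hypothetical name, and it takes the key directory followed by the key file names without the .jwk suffix.

```shell
#!/bin/sh
# hide_tang_keys DIR NAME...
# Rename DIR/NAME.jwk to DIR/.NAME.jwk so Tang stops advertising the key
# while still honoring it for existing clients.
hide_tang_keys() {
    dir=$1
    shift
    for name in "$@"; do
        if [ -f "$dir/$name.jwk" ]; then
            mv -- "$dir/$name.jwk" "$dir/.$name.jwk"
        else
            echo "no such key file: $dir/$name.jwk" >&2
            return 1
        fi
    done
}

# Example usage with the file names from this article:
# hide_tang_keys /var/db/tang hbjJEDXy8G8wynMPqiq8F47nJwo l3fZGUCmnvKQF_OA6VZF9jf8z2s
```

The function stops at the first missing file, so a typo in a name does not go unnoticed.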

Next, switch over to your Stratis system and update it to use the new Tang key. Stratis supports doing this while the filesystem(s) are online.

First, unbind the pool:

# stratis pool unbind pool1

Next, set the key with the original passphrase used when the encrypted pool was created:

# stratis key set --capture-key pool1key
Enter key data followed by the return key:

Finally, bind the pool to the Tang server with the updated key thumbprint:

# stratis pool bind nbde pool1 pool1key http://tang-server --thumbprint iyG5HcF01zaPjaGY6L_3WaslJ_E

The Stratis system is now configured to use the updated Tang key. Once any other client systems using the old Tang key have been updated, the two original key files that were renamed to hidden files in the /var/db/tang directory on the Tang server can be backed up and deleted.
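
The three rebinding steps can be collected into one function so that they run in order and stop at the first failure. The sketch below uses only the stratis commands shown above. rebind_pool is a hypothetical name, and the key capture step still prompts for the passphrase interactively.

```shell
#!/bin/sh
# rebind_pool POOL KEYDESC URL THUMBPRINT
# Unbind the pool, capture the passphrase again, then bind to the Tang
# server using the new thumbprint. Each step runs only if the previous
# one succeeded.
rebind_pool() {
    pool=$1 keydesc=$2 url=$3 thumbprint=$4
    stratis pool unbind "$pool" &&
    stratis key set --capture-key "$keydesc" &&
    stratis pool bind nbde "$pool" "$keydesc" "$url" --thumbprint "$thumbprint"
}

# Example usage with the values from this article:
# rebind_pool pool1 pool1key http://tang-server iyG5HcF01zaPjaGY6L_3WaslJ_E
```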

What if the Tang server is unavailable?

Next, shut down the Tang server to simulate it being unavailable, then reboot the Stratis system.

Again, after the reboot, the Stratis pool is not available:

# stratis pool list
Name Total Physical Properties

If you try to unlock it with NBDE, this fails because the Tang server is unavailable:

# stratis pool unlock clevis
Execution failed:
An iterative command generated one or more errors: The operation 'unlock' on a resource of type pool failed. The following errors occurred:
Partial action "unlock" failed for pool with UUID 4d62f840f2bb4ec9ab53a44b49da3f48: Cryptsetup error: Failed with error: Error: Command failed: cmd: "clevis" "luks" "unlock" "-d" "/dev/vdb" "-n" "stratis-1-private-42142fedcb4c47cea2e2b873c08fcf63-crypt", exit reason: 1 stdout: stderr: /dev/vdb could not be opened.

At this point, without the Tang server being reachable, the only option to unlock the pool is to use the original key passphrase:

# stratis key set --capture-key pool1key
Enter key data followed by the return key:

You can then unlock the pool using the key:

# stratis pool unlock keyring

Next, verify the pool was successfully unlocked:

# stratis pool list
Name Total Physical Properties
pool1 4.98 GiB / 583.65 MiB / 4.41 GiB ~Ca, Cr
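
The recovery procedure in this section can be sketched as a function that tries NBDE first and falls back to the passphrase only when that fails. unlock_pool is a hypothetical name; the stratis commands are the ones used above.

```shell
#!/bin/sh
# unlock_pool KEYDESC
# Try to unlock with NBDE (Clevis). If that fails, capture the passphrase
# under KEYDESC and unlock from the kernel keyring instead.
unlock_pool() {
    keydesc=$1
    if stratis pool unlock clevis; then
        echo "unlocked via NBDE"
    else
        echo "NBDE unlock failed; falling back to the key passphrase" >&2
        stratis key set --capture-key "$keydesc" &&
        stratis pool unlock keyring &&
        echo "unlocked via keyring"
    fi
}

# Example usage:
# unlock_pool pool1key
```

The fallback path still prompts for the passphrase, so it suits an interactive recovery session rather than an unattended boot.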

7 Comments

  1. Mark

    Thanks. Haven’t seen the youtube videos you linked to before.
    The changing the tang server keys seems like a lot of manual effort on all the client servers, so somebody has probably written an ansible playbook to update multiple servers at once :-).

    Is there any advantage to using stratis over btrfs as the latter is the default in fedora now and handles pools/snapshots/raid quite well anyway? Apart from the obvious that RedHat have “deprecated” btrfs, and removed it entirely from rhel8. As it’s fedoramagazine and all new fedora users will probably use the default install of btrfs the question is what would be the justification for users to also install stratis? I think this post is more aimed at enterprise users of redhat systems than fedora users.

    Encryption on a stratis filesystem and the use of a tang server seems a needless overhead. The manual unlocking needed if the tang server is unavailable is probably the last thing anybody needs in the middle of the night when things have broken.
    Having said that I have not investigated using a tang server so assume something so critical can be configured in a replicated way and the clients have a pool of tang servers they can check in with which would be needed in a large environment.

    Personally I just use LUKS encryption on all removable disks and have /etc/crypttab unlock them using keyfiles at boot time. Before anyone mentions it yes of course a major disadvantage of that method is a keyfile is not encrypted and has to be on an unencrypted filesystem (like /boot) in order to be read at boot time if you want the disks automatically online at boot time. But the system will come up even if the network around it is dead.

    And stratis can use luks encrypted devices anyway (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_file_systems/managing-layered-local-storage-with-stratis_managing-file-systems) so it’s adding an extra layer not really needed?
    And that doc also mentions stratis supports md-raid devices; so I guess stratis does not support its own software raid like btrfs and zfs (yet).

    The comment in the post on disks lost or stolen not being able to be unencrypted if they cannot contact the tang server also applies to the simple keyfile method as lost or stolen disks cannot be unencrypted unless the system boot disk containing the keyfiles is also taken :-).

    But the advantages of something like a tang server are obvious in that no keys are in clear text. Except the ‘original key passphrase’ for unlocking when the tang server is unavailable would have to be written down somewhere for every client server pool.

    However I thank you for the post as while I don’t see any point in moving to stratis it did encourage me to start googling the tang server, which looks like it isn’t bound to stratis but will manage native luks volumes as well (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/configuring-automated-unlocking-of-encrypted-volumes-using-policy-based-decryption_security-hardening) so NBDE is something worth a look for all those like me who were not aware of it.

    • Brian Smith

      Hi Mark, in my opinion one of the main advantages of Stratis is its ease of use, and that it is built on top of very stable technologies like XFS (Stratis is a control plane, not a filesystem). Btrfs has some very nice features that Stratis lacks, and I’m glad that Fedora offers different options in this area.

      I think the main concern with having the keyfile unencrypted in something like the /boot filesystem is that if the encrypted disks are lost/stolen, it is likely that the disk with the /boot filesystem containing the unencrypted key was lost/stolen as well. Especially if you consider something like a laptop.

      Yes, you are correct that NBDE can be used with LUKS as well. NBDE is easy to get started with and to set up, so hopefully you can give it a try.

      • Mark

        Thanks, you have definitely inspired me, even if my focus was on luks encrypted disks.

        Using it on external USB luks encrypted devices is easy after installing clevis-udisk2, but less secure. In a gnome desktop environment the visible change is instead of prompting for the luks key and mounting it tang decrypts it ok then prompts for the gnome logged on user password to bump up to authority to use the mount command for it. So anyone can take a disk from one desktop to their own and use it if they know their own password; so I guess removable media while supported is a bad idea.

        Most of the tutorials on youtube show examples of clevis binding to a named server, such as your example of http://tang-server.
        I assume that in most use cases clevis/tang is only used well after boot or if the server(s) is configured to get its own ip-addresses and dns server lists from a dhcp server when it boots.
        At boot time for servers that have static ip-addresses configured it’s a bit of a mess, found the answer in redhat documentation, you have to use dracut not just to install the clevis-dracut code but to also configure your server network settings using dracut (and that overrides anything customised in /etc/sysconfig/network-scripts or networkmanager; and selinux does not like a network built by dracut at all)… but fiddling with dracut does allow luks boot disks to be unlocked by a tang server.
        The doc for later viewers is https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/sec-policy-based_decryption

        So this reply is not directly related to your post on stratis but specifically to the reply that NBDE is easy to get started with. As your post was very informative I hope it ranks high in google searches, but playing with clevis/tang/luks it is not easy so just wanted to correct that for the millions of viewers who may hit the page because comments have luks in them.

        Once again thanks for the post Brian. Last week I didn’t know tang/clevis existed; now I am determined to make it work for me, although it may take a while. Nice to read a post that triggers the I must learn this response.

        • Brian Smith

          Hi Mark,
          If it helps, here is a short video I made that covers using NBDE at boot with LUKS:
          https://www.youtube.com/watch?v=y_9_iWNUBug

        • Natxo

          I hope you don’t mind if I chip in.

          There is to my knowledge a more recent piece of official documentation for rhel8: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/configuring-automated-unlocking-of-encrypted-volumes-using-policy-based-decryption_security-hardening

          The probably incorrect assumption for your use case is that I guess (I am not the developer, so take it with a grain of salt) clevis-udisks2 is meant for laptops. Those systems have usually just one user, so this is what one normally wants, it automatically decrypts the disk. If this is not your situation, you are very right to point this and the documentation should warn about this, and one should take measures to not let other users but the ones allowed to decrypt the disk log on to the workstation.

          Dracut is … special, yes. It’s super flexible, but it comes at a price. It’s not something most people usually need to modify. For fixed ips you can add the setting omit_dracutmodules+="ifcfg" to the dracut configuration, and dracut will not try to be helpful and will not overwrite your network configuration. Believe me, it can get messier if the host is multihomed and you need to boot from a specific network interface, or if you have ipv6 in the picture, or if you have several disks in the equation. If you are new to the technology I understand completely it can be overwhelming at first. You do need to be ready to fix stuff if it breaks, that is always fun (so what do you do if you tried to set up a fixed ip address, and someone messes up the dracut configuration with a typo, and the system halts during its boot process – yes, this happens)

          All in all, for the simplest use case (automatically boot in single network with dhcp enabled in a wired network) it works out of the box. In other use cases, it does require some more leg work, but it is worth it if you require the luks encryption and you manage a fleet of linux hosts.

          There are new things for automatically decrypting disks coming up like this one: http://0pointer.net/blog/unlocking-luks2-volumes-with-tpm2-fido2-pkcs11-security-hardware-on-systemd-248.html which could be better suited to your use case. They are not available on rhel 8 I think, maybe in fedora 34 they are. To me being able to decrypt with a yubikey is very helpful and I look forward to using it.

    • Natxo

      In our case we required that systems only can boot if they are in an accept list in firewalls to contact the tang servers. So if systems go to customers they cannot boot (if they are intercepted by a third party, they will not reach our tang servers, so they do not boot). This is what we required, and it’s quite easy to do. Decryption using the key in a tpm chip for instance will have a booting system everytime, even if the system has been hijacked by a third actor, we did not want this.

      You can use several tang servers at the same time, and using the sss pin you can even set policies (like, you need to contact 3 of the 5 tang servers in order to decrypt the luks container).

      using tang + nbde in combination with kickstarts from the foreman/satellite is really simple, you can replace the first luks key (the get out of jail card, so to speak) using automation. In practice, you only need to contact the tang server(s) during the boot process, so it really is stable (having it in production for the last three years, no problems at all).

      The question, for me, is not why to encrypt, but rather why not to encrypt? 😉

    • Natxo

      the main advantage of nbde is that it makes luks encryption easy.

      Using automation we save luks keys in hashicorp vault, and rotate those (autogenerated) every now and then. Every system one different key, and a copy of the old key is archived in vault in case something goes wrong. So in fact, you just need to know how to get to hashicorp vault and set the right acls for the key/values (this is something you can easily set up and delegate).

      Very happy with this solution.

