===== What's this about? =====

Tired of disks silently delivering incorrect data to the layers above? As of today, XFS only has metadata checksums, no data checksums. ZFS and btrfs have data checksums, but other issues: licensing (ZFS) and stability (btrfs). dm-integrity can be used together with LUKS for encryption. The tests below were run on Fedora 29.

I learned:
  * If a filesystem sits on top and a file is damaged (so the dm-integrity checksum verification fails), "cp" by default only reads the file up to the corrupted part. If you also want the later parts, use ddrescue.
  * dm-integrity does detect modifications.
  * Keep multiple copies of important files, or use RAID to protect against bit rot.

===== Basic LUKS2/dm-integrity usage =====

Setting up and using dm-integrity on a partition, here /dev/sda1:

  ### setup, initial usage
  # 'mykey' is a key file; the test further down shows how to generate one
  cryptsetup luksFormat --type luks2 /dev/sda1 -d mykey \
    --integrity hmac-sha256 --cipher aes-xts-plain64
  cryptsetup open /dev/sda1 -d mykey luksint
  cryptsetup status luksint
  mkfs.ext4 -m0 /dev/mapper/luksint
  mount /dev/mapper/luksint /mnt/tmp
  ### teardown after usage
  umount /mnt/tmp
  cryptsetup close luksint
  ### assembling an already set up device
  cryptsetup open -d mykey /dev/sda1 luksint
  mount /dev/mapper/luksint /mnt/tmp

===== What's the behaviour when sectors are corrupt? =====

Ideally, I would like to be informed about a corruption, but still be able to copy the full file nonetheless. As the test below shows, only the data before the corruption is read instead: as soon as the defective data is hit, copying/reading of the file stops.

  # generate a key and store it in a file
  [root@電脳 ~]# dd if=/dev/urandom bs=48 count=1|openssl base64 >mykey
  # create a 128MB file
  [root@電脳 ~]# dd if=/dev/urandom of=rawfile bs=1M count=128
  # run luksFormat with integrity options
  [root@電脳 ~]# cryptsetup luksFormat --type luks2 rawfile -d mykey -q \
                    --integrity hmac-sha256 --cipher aes-xts-plain64
  # open the file for usage, verify status
  [root@電脳 ~]# cryptsetup open rawfile -d mykey test
  [root@電脳 ~]# cryptsetup status test
  # create filesystem, mount
  [root@電脳 ~]# mkfs.ext4 -m0 /dev/mapper/test
  [root@電脳 ~]# mount /dev/mapper/test /mnt/tmp
  # create a file, make copies, check size and compute checksum
  [root@電脳 ~]# dd if=/dev/urandom of=/mnt/tmp/infile bs=1M count=20
  [root@電脳 ~]# for i in {1..4}; do cp -v /mnt/tmp/infile /mnt/tmp/infile$i; done
  [root@電脳 ~]# md5sum /mnt/tmp/infile*
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile1
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile2
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile3
  91fab724edb4e14b74ef70c4007a8254  /mnt/tmp/infile4
  # tear down the setup
  [root@電脳 ~]# umount /mnt/tmp
  [root@電脳 ~]# cryptsetup close test
  # now let's look at the part of the file which we will
  # overwrite/damage with the next command
  [root@電脳 ~]# dd if=rawfile bs=1 count=20 skip=$((64*1024*1024)) 2>/dev/null|hexdump -vC
  00000000  fd 7d 0a 69 f5 67 7d 9b  bf 76 34 5b 50 4d 56 c9  |.}.i.g}..v4[PMV.|
  00000010  93 9c 99 e6                                       |....|
  00000014
  # we overwrite 1 byte at offset 64MB with random data
  [root@電脳 ~]# dd if=/dev/urandom of=rawfile \
                    bs=1 count=1 seek=$((64*1024*1024)) conv=notrunc
  # ..and check that part again.
  [root@電脳 ~]# dd if=rawfile bs=1 count=20 skip=$((64*1024*1024)) 2>/dev/null|hexdump -vC
  00000000  98 7d 0a 69 f5 67 7d 9b  bf 76 34 5b 50 4d 56 c9  |.}.i.g}..v4[PMV.|
  00000010  93 9c 99 e6                                       |....|
  00000014
  # now set up the luks volume again, and try to read.
  # the checksum verification will fail, and dm-integrity
  # will report this to the upper layers as an I/O error
  [root@電脳 ~]# cryptsetup open rawfile -d mykey test
  [root@電脳 ~]# mount /dev/mapper/test /mnt/tmp
  [root@電脳 ~]# md5sum /mnt/tmp/infile*
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile1
  169195cea436db488391a8dca553d87a  /mnt/tmp/infile2
  md5sum: /mnt/tmp/infile3: Input/output error
  91fab724edb4e14b74ef70c4007a8254  /mnt/tmp/infile4
  [root@電脳 ~]#
  # Syslog reports this:
  [31056.418532] device-mapper: crypt: dm-4: INTEGRITY AEAD ERROR, sector 90128
  [31056.421383] device-mapper: crypt: dm-4: INTEGRITY AEAD ERROR, sector 90128
  [31056.421451] device-mapper: crypt: dm-4: INTEGRITY AEAD ERROR, sector 90128
  [31056.421517] device-mapper: crypt: dm-4: INTEGRITY AEAD ERROR, sector 90128
  # Trying to copy the file, we see that we get the
  # "good" data until the damaged sector:
  [root@電脳 ~]# cp /mnt/tmp/infile* /tmp
  cp: error reading '/mnt/tmp/infile3': Input/output error
  [root@電脳 ~]# ls -al /tmp/infile*
  -rw-r--r--. 1 root root 20971520 Dec 27 10:04 /tmp/infile
  -rw-r--r--. 1 root root 20971520 Dec 27 10:04 /tmp/infile1
  -rw-r--r--. 1 root root 20971520 Dec 27 10:04 /tmp/infile2
  -rw-r--r--. 1 root root 14684160 Dec 27 10:04 /tmp/infile3
  -rw-r--r--. 1 root root 13742080 Dec 27 10:04 /tmp/infile4
  [root@電脳 ~]#
  # Now, using ddrescue we get a copy of all readable
  # data:
  [root@電脳 ~]# ddrescue /mnt/tmp/infile3 /tmp/infile3b
  GNU ddrescue 1.23
  Press Ctrl-C to interrupt
       ipos:   65018 kB, non-trimmed:        0 B,  current rate:    1347 kB/s
       opos:   65018 kB, non-scraped:        0 B,  average rate:   48812 kB/s
  non-tried:        0 B,  bad-sector:     4096 B,    error rate:    4096 B/s
    rescued:   97624 kB,   bad areas:        1,        run time:          1s
  pct rescued:   99.99%, read errors:        9,  remaining time:          0s
                                time since last successful read:         n/a
  Finished
  [root@電脳 ~]#
  [root@電脳 ~]# ls -al /tmp/infile3b
  -rw-r--r--. 1 root root 20971520 Dec 27 10:04 /tmp/infile3b
  [root@電脳 ~]#

===== Initialization =====

Initialization can take a long time. Here it is for a 4TB disk, where the wipe ran for 1035 minutes (a bit over 17 hours):

  root@nexus:~# cryptsetup luksFormat --type luks2 /dev/sda --integrity hmac-sha256 --cipher aes-xts-plain64 -v
  WARNING: Device /dev/sda already contains a 'crypto_LUKS' superblock signature.
  WARNING!
  ========
  This will overwrite data on /dev/sda irrevocably.
  Are you sure? (Type uppercase yes): YES
  Enter passphrase for /dev/sda:
  Verify passphrase:
  Existing 'crypto_LUKS' superblock signature (offset: 0 bytes) on device /dev/sda will be wiped.
  Existing 'crypto_LUKS' superblock signature (offset: 16384 bytes) on device /dev/sda will be wiped.
  Key slot 0 created.
  Wiping device to initialize integrity checksum.
  You can interrupt this by pressing CTRL+c (rest of not wiped device will contain invalid checksum).
  Finished, time 1035:13.252, 3590933 MiB written, speed 57.8 MiB/s
  Command successful.
  root@nexus:~#

===== How much overhead do dm-integrity and XFS impose? =====

^                    ^  4TB external^  4TB internal^  480GB partition^
|                    |          luks|    luks+integ|       luks+integ|
|/proc/partitions    |    3906967327|    3907018584|        446442520|
|/dev/mapper/...     |    3906934559|    3677116264|        420103208|
|  overhead|         32768|     229902320|         26339312|
|                    |     (0.0008%)|        (5.9%)|           (5.9%)|
|xfs, usable per df  |    3905026876|    3675320800|        419898080|
|  overhead|       1907683|       1795464|           205128|
|                    |       (0.05%)|       (0.05%)|          (0.05%)|

  * a typical overhead for luks+dm-integrity: 5.9%
  * xfs overhead: 0.05%

===== more details =====

  # more options
  PARAMS0="--integrity poly1305 --cipher chacha20-random"
  PARAMS1="--integrity hmac-sha256 --cipher aes-xts-plain64"
  PARAMS2="--integrity hmac-sha1" # should no longer be used
  PARAMS3="--integrity hmac-sha512"
  PARAMS4="--integrity aead --cipher aegis128-random --key-size 128" # 202002, experimental
  # pick one of the PARAMS variables defined above, e.g. PARAMS1
  cryptsetup luksFormat --type luks2 /dev/sda1 $PARAMS1

  * [[https://archive.fosdem.org/2018/schedule/event/cryptsetup/attachments/slides/2506/export/events/attachments/cryptsetup/slides/2506/fosdem18_cryptsetup_aead.pdf|FOSDEM 2018, Data integrity protection with cryptsetup tools]]
  * [[https://gist.github.com/MawKKe/caa2bbf7edcc072129d73b61ae7815fb|mdadm raid1 & dm-integrity]]
  * [[https://www.unixsheikh.com/articles/battle-testing-data-integrity-verification-with-zfs-btrfs-and-mdadm-dm-integrity.html|An attempt to compare ZFS/btrfs/dm-integrity+mdadm+fs features and failure resilience]]
  * [[https://www.redhat.com/en/blog/what-bit-rot-and-how-can-i-detect-it-rhel|What is bit rot, and how can I detect it on RHEL?]]
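
The overhead percentages in the overhead table can be re-derived with a few lines of shell. This is only a minimal sketch: the helper name ''overhead_pct'' is made up for illustration, and it assumes that both sizes passed in are given in the same unit (as they are in the table, where each overhead is the difference between a row and the row it sits on).

```shell
#!/bin/sh
# overhead_pct OUTER INNER
# Hypothetical helper: prints how much of OUTER is lost when only INNER
# remains usable, as a percentage with two decimals.
overhead_pct() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale
    LC_ALL=C awk -v outer="$1" -v inner="$2" \
        'BEGIN { printf "%.2f\n", (outer - inner) * 100 / outer }'
}

# Numbers copied from the overhead table
overhead_pct 3907018584 3677116264   # 4TB internal, luks+integ        -> 5.88
overhead_pct 446442520 420103208     # 480GB partition, luks+integ     -> 5.90
overhead_pct 3677116264 3675320800   # xfs on the 4TB internal mapping -> 0.05
```

To check your own setup, feed it the size of the raw device and the size of the opened mapping, both taken from the same source so the units match.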