women will literally build a virtual machine orchestration service for github ci before going to therapy

shuppy (@delan) [archived]

waow this one small trick will make your zpool scrub faster!!

in openzfs dsl_scan.c, changing `scn->scn_phys.scn_min_txg` from 0 to 7000000 in dsl_scan_setup_sync

zpool status after scrubbing with stock zfs 2.2.4, taking over seven minutes

zpool status after scrubbing with patched zfs 2.2.4, taking just over two minutes and issuing reads for only 316GiB of the 970GiB pool

(not really. i’m just patching zfs to reproduce a bug that happens near the end of a scrub)
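
(if you want to pick a min txg near the top of a pool's history like this, the current txg is visible in zdb's uberblock dump. a minimal sketch, with a placeholder pool name:)

# minimal sketch, placeholder pool name "tank": the active uberblock's txg is
# the pool's current transaction group, i.e. the top of any scrub range
zdb -u tank | grep -w txg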

shuppy (@delan) [archived]

we’ve done it. blazingly fast, zero-cost scrubbing

zpool status reporting “scan: scrub repaired 0B in 00:00:00 with 0 errors on Fri May 17 00:56:15 2024”
shuppy (@delan) [archived]

one small problem. i have no idea what range of txgs i want to scrub

in openzfs dsl_scan.c, adding some dataset name checks in `dsl_scan_visitds` that set one new flag `is_target_or_descendant` if the dataset is cuffs/code (or a snapshot or descendant), or another new flag `is_potential_ancestor` if the dataset is cuffs (or a snapshot or descendant). this gets used later to skip scrubbing and/or recursing the dataset (not pictured)

zpool status after scrubbing with this new patch, showing that the scrub finished in 00:01:17 after issuing reads for only 105G of the 971G pool. zfs list in another terminal shows that cuffs/code is only 106G

friendship ended with world’s most scuffed reimplementation of openzfs/zfs#15250. now world’s most scuffed reimplementation of openzfs/zfs#7257 is my best friend
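
(a minimal sketch of where a per-dataset txg range could come from, which is not what the patch above does: datasets record the txg they were created in, so for a non-clone dataset like cuffs/code, createtxg is a floor on the birth txg of its blocks, and the pool's current txg is the ceiling:)

# minimal sketch, assuming the pool is named cuffs like the dataset path suggests
zfs get -Hp -o value createtxg cuffs/code   # floor: blocks of a non-clone dataset are born after it is
zdb -u cuffs | grep -w txg                  # ceiling: the pool's current txg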

zpool_expansion time

two new WD80EFPX disks, with four labels resting on top

for these two plus the last two disks i added handwritten labels, partially cut from a page out of a notebook like a flyer for a lost dog, showing that ocean5x0 used to be ocean4x1

those four disks, now with labels printed by a label printer

those four disks, now installed in a define 7 xl
  1. two new disks! unlike the WD80EFZZ, the WD80EFPX doesn’t seem to let you set the idle timer (wdidle3, idle3ctl). will need to keep an eye on those load cycle counts

  2. gonna interleave them with the last pair of disks i added, so we don’t end up with two mirror vdevs having both of their disks bought at the same time (rough sketch of the shuffle after this list)

  3. started writing my usual “lost dog, responds to ocean” labels by hand, but @ariashark reminded me we have a label printer now, so i redid them (and then redid ocean4x1 again, because it needs to be ocean4x2 now)

  4. installed! define 7 xl now at 14 disks and two ssds :3
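
(a rough sketch of the interleaving in 2, with hypothetical device names — OLD1/OLD2 are the previous pair, NEW1/NEW2 the new WD80EFPXs — not the exact commands run on ocean:)

# make the existing mirror 3-wide with one of the new disks
zpool attach ocean OLD1 NEW1
zpool wait -t resilver ocean    # let it finish resilvering before pulling anything
# take one of the older disks out of that mirror...
zpool detach ocean OLD2
# ...and pair it with the other new disk as the new mirror vdev,
# so neither mirror has both disks from the same batch
zpool add ocean mirror OLD2 NEW2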

--i-am-a-cron-job-fuck-me-up-and-delete-without-asking

screenshot of github commit titled “zfs-sync-snapshots: rename --delete-yes and update sync scripts”

diff of the script, showing a “--delete” option accepting values “none”, “old”, “all”, or “this”, a mandatory “dry” or “wet” argument, and a “--delete-yes” option being renamed to “--i-am-a-cron-job-fuck-me-up-and-delete-without-asking”

[sheldon smith voice] shuppyco had multiple safeguards in place that could have prevented a data loss incident, such as a dry-and-wet-run system and a safer deletion interface that prompts the operator to confirm each snapshot slated for deletion.

the csb found that delan, the operator on shift at the time, systematically disabled each of those safeguards, citing the pedestrian and familiar nature of the task at hand. it said, “priming my incremental backups is simple, i’ve done this countless times!”

unfortunately, this time it was not simple.

the csb concludes that shuppyco should (a) make the consequences of disabling key data loss safeguards impossible for operators to miss, (b) design and implement a process safety management system,
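
(a minimal sketch of the safeguards in question, not the actual zfs-sync-snapshots script: a mandatory dry|wet argument, a per-snapshot prompt, and the cron-job flag that skips it:)

#!/bin/sh
mode=${1?usage: sync-sketch dry|wet [--i-am-a-cron-job-fuck-me-up-and-delete-without-asking]}
assume_yes=false
[ "${2:-}" = --i-am-a-cron-job-fuck-me-up-and-delete-without-asking ] && assume_yes=true

maybe_destroy() {
    snapshot=$1
    if [ "$mode" = dry ]; then
        echo "would destroy $snapshot"     # dry run: never touches the pool
        return
    fi
    if ! $assume_yes; then
        printf 'destroy %s? [y/N] ' "$snapshot"
        read -r answer
        [ "$answer" = y ] || return        # anything but an explicit y keeps the snapshot
    fi
    zfs destroy "$snapshot"
}

maybe_destroy ocean/dump/jupiter/home@some-old-snapshot    # hypothetical snapshot name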

oops i wrote my own zfs snapshot thinning

terminal showing shell script “zfs-thin-snapshots” on the left, and the result of running it against dataset “ocean/dump/jupiter/home” on the right, where daily snapshots are kept for a week, weekly snapshots are kept for a month, and monthly snapshots are kept thereafter
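
(not the actual zfs-thin-snapshots, just a minimal sketch of that policy — dry run only, GNU date assumed:)

#!/bin/sh
# minimal sketch of the policy above: keep everything for a week, then the
# newest snapshot per week for a month, then the newest snapshot per month
set -eu
dataset=${1?usage: thin-sketch <dataset>}
now=$(date +%s)
week_ago=$((now - 7*86400))
month_ago=$((now - 30*86400))

# newest first, so the first snapshot seen in each bucket is the one we keep
zfs list -Hp -d 1 -t snapshot -o name,creation -S creation "$dataset" |
while read -r name creation; do
    if [ "$creation" -ge "$week_ago" ]; then
        continue                                      # under a week old: keep
    elif [ "$creation" -ge "$month_ago" ]; then
        bucket=$(date -d "@$creation" +%G-w%V)        # one bucket per iso week
    else
        bucket=$(date -d "@$creation" +%Y-%m)         # one bucket per month
    fi
    case " ${seen:-} " in
        *" $bucket "*) echo "would destroy $name" ;;  # older snapshot in a bucket we already kept one from
        *) seen="${seen:-} $bucket" ;;                # newest in its bucket: keep
    esac
done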

daily automated zfs backups

screenshot of discord channel with four messages, each with a log file attached:

“sync jupiter ok”
“sync colo ok”
“sync venus ok”
“sync jupiter failed! @everyone”

just set up daily automated zfs backups for three of my machines, including discord logging and ping on failure :D

(sauce)
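
(and a minimal sketch of the discord half — not the linked sauce — assuming a webhook url in $WEBHOOK and a hypothetical sync-jupiter command:)

log=$(mktemp)
if sync-jupiter >"$log" 2>&1; then                  # hypothetical sync command
    msg="sync jupiter ok"
else
    msg="sync jupiter failed! @everyone"            # the ping on failure
fi
# discord webhooks take multipart uploads: payload_json for the message,
# files[0] for the attached log
curl -sS -F "payload_json={\"content\": \"$msg\"}" -F "files[0]=@$log" "$WEBHOOK"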

one weird trick to speed up your zfs metadata by 10x

venus, my home server, now with an intel 730 240GB in addition to the existing intel 730 480GB, so we can put the special vdev on redundant flash

the trick is adding a “special” vdev 🚄⚡

$ time du -sh /ocean/private
2.5T    /ocean/private
11:16.33 total

(send, add special, recv, remove l2arc, reboot)

$ time du -sh /ocean/private
2.5T    /ocean/private
1:17.16 total

https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954
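
(roughly what “(send, add special, recv, remove l2arc, reboot)” looks like as commands, with hypothetical device and path names; the send/recv is needed because a special vdev only takes metadata written after it’s added:)

zfs snapshot -r ocean/private@move
zfs send -R ocean/private@move > /backup/private.zfs    # park a copy off the pool first
zpool add ocean special mirror ssd480p2 ssd240p2        # mirrored special vdev across the two intel 730s
zfs destroy -r ocean/private                            # the copy above is the safety net
zfs receive ocean/private < /backup/private.zfs         # rewritten on the way back in, so its metadata lands on flash
zpool remove ocean ssd480p1                             # drop the old l2arc (cache) device
reboot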