I'm glad I looked at the SMART data for my new NVMe disk.

$ sudo smartctl -a -l devstat /dev/nvme0n1
…
SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        27 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    103,545 [53.0 GB]
Data Units Written:                 11,292,659 [5.78 TB]
Host Read Commands:                 1,410,441
Host Write Commands:                152,110,199
Controller Busy Time:               1,266
Power Cycles:                       6
Power On Hours:                     470
Unsafe Shutdowns:                   5
Media and Data Integrity Errors:    0
Error Information Log Entries:      11
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               27 Celsius
Temperature Sensor 2:               26 Celsius

Wait... it's written over 5 TB in under 3 weeks?!? What the?!

Doing some back-of-the-napkin maths, that's equivalent to a steady 3.4 MB/s over the time the SSD has been powered on. The disk is "rated" [1] for 300 TB written over 5 years, so at that rate it would be out of spec in about 2 years 9 months.
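
For anyone wanting to redo the napkin maths on their own drive, here's a rough Python sketch. It assumes the NVMe convention that one "data unit" is 1,000 blocks of 512 bytes (which matches the 5.78 TB smartctl prints above), and it treats the write rate as constant over the powered-on time. The function names are mine, purely for illustration.

```python
# One NVMe "data unit" = 1000 * 512 bytes (assumption; matches smartctl's 5.78 TB).
DATA_UNIT_BYTES = 512 * 1000
SECONDS_PER_YEAR = 365 * 24 * 3600


def write_rate_bytes_per_s(data_units_written, power_on_hours):
    """Average write rate over the time the drive has been powered on."""
    total_bytes = data_units_written * DATA_UNIT_BYTES
    return total_bytes / (power_on_hours * 3600)


def years_to_rated_endurance(rated_tb_written, rate_bytes_per_s):
    """Years of writing at a constant rate to reach the rated TB written."""
    seconds = rated_tb_written * 1e12 / rate_bytes_per_s
    return seconds / SECONDS_PER_YEAR


# Values taken from the SMART output above; 300 TB is the warranty figure.
rate = write_rate_bytes_per_s(11_292_659, 470)
years = years_to_rated_endurance(300, rate)

print(f"Average write rate: {rate / 1e6:.1f} MB/s")
print(f"Time to reach 300 TB written: {years:.1f} years")
```

It counts from zero bytes written, so the time actually remaining is a little shorter once you subtract what's already on the clock.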

Turns out I've run into a bug in MicroK8s where the k8s-dqlite process suddenly goes bonkers, using up all of the CPU and spewing transaction logs to the disk. It seems that under heavy disk I/O it's possible to exhaust the kernel's async resources, which causes huge latency spikes that just make the issue worse.

The nasty bit was that I wasn't even using MicroK8s for anything. I'd just checked it as part of the installer options.


  1. The manufacturer's warranty is 5 years or 300 TB written, which equates to roughly 1/3 of a drive write (about 165 GB) per day for a 500 GB drive across 5 years.