#solid state drive
I'm glad I looked at the SMART data for my new NVMe disk.
$ sudo smartctl -a -l devstat /dev/nvme0n1
…
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 27 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 103,545 [53.0 GB]
Data Units Written: 11,292,659 [5.78 TB]
Host Read Commands: 1,410,441
Host Write Commands: 152,110,199
Controller Busy Time: 1,266
Power Cycles: 6
Power On Hours: 470
Unsafe Shutdowns: 5
Media and Data Integrity Errors: 0
Error Information Log Entries: 11
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 27 Celsius
Temperature Sensor 2: 26 Celsius
Wait... it's written over 5 TB in under 3 weeks?!? What the?!
Doing some back-of-the-napkin maths, that's equivalent to a steady 3.4 MB/s over the time the SSD has been powered on. The disk is "rated"[1] for 300 TB written over 5 years, so at that rate it would be out of spec in about 2 years 9 months.
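For anyone who wants to redo the napkin maths, here is a rough Python sketch using the figures from the smartctl output above. The function and constant names are my own, and it assumes the usual NVMe convention that one "data unit" is 1,000 blocks of 512 bytes, which matches the 5.78 TB that smartctl prints. It also assumes the write rate stays constant, which is a simplification.

```python
# Figures copied from the smartctl output above.
DATA_UNITS_WRITTEN = 11_292_659
POWER_ON_HOURS = 470
RATED_TB_WRITTEN = 300  # manufacturer's endurance figure, in TB

# Assumption: NVMe reports data units in thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 512 * 1000


def bytes_written(data_units: int) -> int:
    """Convert NVMe data units to bytes."""
    return data_units * BYTES_PER_DATA_UNIT


def average_write_rate(data_units: int, hours: float) -> float:
    """Average bytes per second over the powered-on time."""
    return bytes_written(data_units) / (hours * 3600)


def years_to_rated_limit(rate_bytes_per_s: float, rated_tb: float) -> float:
    """Years of constant writing at this rate to reach the rated TB written."""
    seconds = rated_tb * 1e12 / rate_bytes_per_s
    return seconds / (365 * 24 * 3600)


total = bytes_written(DATA_UNITS_WRITTEN)
rate = average_write_rate(DATA_UNITS_WRITTEN, POWER_ON_HOURS)
years = years_to_rated_limit(rate, RATED_TB_WRITTEN)

print(f"{total / 1e12:.2f} TB written")
print(f"{rate / 1e6:.1f} MB/s average")
print(f"{years:.1f} years to reach {RATED_TB_WRITTEN} TB")
```

The result is measured from a factory-fresh drive, so the time already on the clock counts towards it.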
Turns out I've run into a bug in MicroK8s where the k8s-dqlite process suddenly goes bonkers, using up all of the CPU and spewing transaction logs to the disk. It seems that under heavy disk I/O it's possible to exhaust the kernel's async resources, which causes huge latency spikes that just make the issue worse.
The nasty bit was that I wasn't even using MicroK8s for anything. I'd just checked it as part of the installer options.
-
[1] The manufacturer's warranty is 5 years or 300 TB written, which equates to roughly 1/3 of a drive-write (165 GB) per day for a 500 GB drive across 5 years.
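The drive-writes-per-day figure can be checked the same way. This is only a sketch with names I made up, and it assumes 365-day years and decimal units (1 TB = 1,000 GB).

```python
# Warranty terms quoted above.
RATED_TB_WRITTEN = 300
WARRANTY_YEARS = 5
DRIVE_CAPACITY_GB = 500


def daily_write_budget_gb(rated_tb: float, years: float) -> float:
    """GB per day that uses up the rated TB written exactly at end of warranty."""
    return rated_tb * 1000 / (years * 365)


budget_gb = daily_write_budget_gb(RATED_TB_WRITTEN, WARRANTY_YEARS)
drive_writes_per_day = budget_gb / DRIVE_CAPACITY_GB

print(f"{budget_gb:.0f} GB/day, {drive_writes_per_day:.2f} drive-writes per day")
```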