It seems I have it worse than almost anyone. 2.5 month old, 1 TB, M1 MBA: SMART/...

therouwboat · on Feb 24, 2021

I have an almost 3-month-old 1TB KINGSTON SA2000M81000G as my Linux root drive, and the data written includes transferring 400GB of old stuff to it.

  SMART/Health Information (NVMe Log 0x02)
  Critical Warning:                   0x00
  Temperature:                        44 Celsius
  Available Spare:                    100%
  Available Spare Threshold:          10%
  Percentage Used:                    0%
  Data Units Read:                    627,718 [321 GB]
  Data Units Written:                 1,845,489 [944 GB]
  Host Read Commands:                 5,302,076
  Host Write Commands:                204,065,596
  Controller Busy Time:               107
  Power Cycles:                       117
  Power On Hours:                     765
  Unsafe Shutdowns:                   0
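
For anyone comparing these numbers: the bracketed sizes follow from the raw "Data Units" counters. A minimal sketch of the conversion, assuming one NVMe data unit is 1000 blocks of 512 bytes (the helper names here are made up for illustration):

```python
# Assumption: one NVMe SMART "data unit" = 1000 * 512 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512


def data_units_to_bytes(units):
    """Convert a raw Data Units Read/Written counter to bytes."""
    return units * BYTES_PER_DATA_UNIT


def format_decimal(num_bytes):
    """Format a byte count with decimal (SI) prefixes, one decimal place."""
    for unit, size in (("TB", 10**12), ("GB", 10**9), ("MB", 10**6)):
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {unit}"
    return f"{num_bytes} B"


# Counters taken from the output above.
print(format_decimal(data_units_to_bytes(1_845_489)))  # written: 944.9 GB
print(format_decimal(data_units_to_bytes(627_718)))    # read: 321.4 GB
```

The tool appears to round slightly differently (944 GB rather than 944.9 GB), but the scale matches.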

formerly_proven · on Feb 24, 2021

That's an impressive amount of rip and tear for a drive in a laptop.

RcrdBrt · on Feb 24, 2021

I put my stats here for the sake of reporting.

3.5 years old Samsung 970 evo 500GB

  SMART/Health Information (NVMe Log 0x02)
  Critical Warning:                   0x00
  Temperature:                        28 Celsius
  Available Spare:                    100%
  Available Spare Threshold:          10%
  Percentage Used:                    0%
  Data Units Read:                    9,208,241 [4.71 TB]
  Data Units Written:                 21,345,634 [10.9 TB]
  Host Read Commands:                 132,245,782
  Host Write Commands:                346,141,507
  Controller Busy Time:               924
  Power Cycles:                       312
  Power On Hours:                     2,453
  Unsafe Shutdowns:                   114

seba_dos1 · on Feb 24, 2021

Ouch. This is my only 2TB drive, in a laptop running a rolling-release GNU/Linux distro with heavy swap usage, encryption, and plenty of overnight compilations, after a bit more than a year:

  SMART/Health Information (NVMe Log 0x02)
  Critical Warning:                   0x00
  Temperature:                        36 Celsius
  Available Spare:                    100%
  Available Spare Threshold:          10%
  Percentage Used:                    1%
  Data Units Read:                    56 263 143 [28,8 TB]
  Data Units Written:                 36 077 380 [18,4 TB]
  Host Read Commands:                 1 252 403 456
  Host Write Commands:                1 018 672 820
  Controller Busy Time:               15 360
  Power Cycles:                       234
  Power On Hours:                     10 255
  Unsafe Shutdowns:                   47
  Media and Data Integrity Errors:    0
  Error Information Log Entries:      0
  Warning  Comp. Temperature Time:    0
  Critical Comp. Temperature Time:    0

Your values are indeed insane.

cozzyd · on Feb 24, 2021

That's... close to 2TB / day?

tomxor · on Feb 24, 2021

> Data Units Written: 303,071,863 [155 TB]

That's insane, I don't trust it though:

  155 TB / (75 days *24*60*60) * 2**20 = ~25 MB / s

If we are more reasonable and say it's running for 12 hours out of the day then that works out at a continuous 50MB/s of writes.

For comparison, my daily Linux laptop (XPS 13, 512GB Samsung NVMe SSD) has a total of 3.7 TB of writes over 3 years. This is my work dev and home laptop, and it's in constant use (although no video editing):

  3.7 TB / (1095 days *24*60*60) * 2**20 = 0.041 MB / s

There are nearly three orders of magnitude of difference there (roughly 600x). I can only think of three explanations: 1. the SMART reporting is wrong, 2. macOS or M1 SSD controllers have serious write amplification issues, or 3. you are actually doing something that does need serious write throughput, like lots of video editing (your stats aren't impossible after all).
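
The back-of-envelope rates above can be reproduced with a short script. This is only a sketch of the same arithmetic, treating 1 TB as 2**20 MB as the comment does; the function name is made up for illustration:

```python
def avg_write_rate_mb_s(total_tb, days, hours_per_day=24.0):
    """Average write rate in MB/s, using 1 TB = 2**20 MB as in the comment."""
    active_seconds = days * hours_per_day * 60 * 60
    return total_tb * 2**20 / active_seconds


m1_always_on = avg_write_rate_mb_s(155, 75)       # about 25 MB/s
m1_half_days = avg_write_rate_mb_s(155, 75, 12)   # about 50 MB/s
xps = avg_write_rate_mb_s(3.7, 1095)              # about 0.041 MB/s

print(f"M1, 24 h/day: {m1_always_on:.1f} MB/s")
print(f"M1, 12 h/day: {m1_half_days:.1f} MB/s")
print(f"XPS 13:       {xps:.3f} MB/s")
print(f"ratio:        {m1_always_on / xps:.0f}x")
```

The ratio comes out at roughly 600x, so a little under three orders of magnitude.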

yarcob · on Feb 24, 2021

There's an option 4: you are running out of RAM and macOS is doing a lot of swapping.

These SSDs can probably write on the order of 3000MB/s = 0.003TB/s, so you could end up with 155 TB total writes after just 155/0.003/3600 = 14 hours of RAM heavy workload.
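
The same estimate as a small sketch, assuming a sustained 3000 MB/s (a guess from the comment, not a measured figure for this drive) and decimal units:

```python
def hours_to_write(total_tb, rate_mb_s):
    """Hours needed to write total_tb at a sustained rate_mb_s (decimal units)."""
    total_mb = total_tb * 1_000_000
    return total_mb / rate_mb_s / 3600


# 155 TB at an assumed 3000 MB/s sustained write rate.
print(f"{hours_to_write(155, 3000):.1f} hours")  # 14.4 hours
```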

tomxor · on Feb 24, 2021

It would have to be a legitimately RAM heavy workload though... running out of RAM due to leaving applications open (or browser tabs open), might result in swapping out those to disk until they are used again. But the level of swapping required to generate this much write usage would essentially be using the disk as RAM, as in the running program doesn't have enough RAM, either due to needing more than physically available or due to buggy memory management.

yarcob · on Feb 24, 2021

Right, I was thinking of things like large simulations where you need to keep a large dataset in RAM and update every datapoint at every iteration.

Or maybe multiple VMs running simultaneously doing CI jobs could also do it.

I don't think simple memory errors like leaks would cause it, since that would just end up filling the disk once, but wouldn't go through writes quite as fast.

cozzyd · on Feb 24, 2021

The symmetry between reads and writes precludes a lot of options (like logging gone mad). I'm not even sure most SWAP ends up read back in...

tomxor · on Feb 25, 2021

That reminds me, when SSDs first came out we'd have to set noatime on *nix stuff including Macs, to prevent the OS from writing file access times every time a file is read, otherwise it would cause significant write amplification. Modern SSD controllers are clever enough to make this unnecessary these days, but it could be something similar... either with the SSD controllers themselves or a file system behavior spaced just right in time to turn 1 byte into 4096 bytes of effective NAND writes. That's only 24GB of requested writes turning into 100TB in the worst case.
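
A quick sketch of that worst case, assuming every 1-byte update forces a full 4096-byte write (the 4096 figure is the one from the comment, not a known property of this SSD; the function name is hypothetical):

```python
def requested_bytes_for(nand_bytes, bytes_per_write=1, unit_bytes=4096):
    """Host-requested bytes that would produce nand_bytes of NAND writes
    if each small write were amplified to a full unit_bytes write."""
    amplification = unit_bytes / bytes_per_write  # 4096x in the worst case
    return nand_bytes / amplification


requested = requested_bytes_for(100e12)  # 100 TB of NAND writes
print(f"{requested / 1e9:.1f} GB requested")  # 24.4 GB requested
```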

nickflood · on Feb 24, 2021

Yeah, that's a lot. I've got a 5-year-old 512 GB Samsung PM951 that's been a system drive in 3 Windows systems with 16-64 gigs of RAM, and over the course of 6200 power cycles and 16500 hours it has only 35 TB written and 24 TB read.

megous · on Feb 24, 2021

How do you even get an unsafe shutdown in a notebook with a battery, unless you yank it out, or have a buggy OS that will not shut down properly/preemptively on low power?

gank41 · on Feb 24, 2021

I've got a number of unsafe shutdowns listed on my M1 MBP, too. Most likely due to the kernel panics I get when rebooting and connected to my dock in clamshell mode.

johnklos · on Feb 24, 2021

Not a joke - are you running Chrome?

sinak · on Feb 24, 2021

I am.

serge-ivamov · on Feb 24, 2021

Just run Activity Monitor.app, click on the Disk tab, and look for a disk-hungry process.

batterylow · on Feb 24, 2021

"kernel_task" ... wrote 142GB in 80 minutes

serge-ivamov · on Feb 24, 2021

looks like swap... can u show

  # vm_stat
  # top -o faults

just 1.5TB for my 2-month-old M1 mini.. xcode/firefox

beckler · on Feb 24, 2021

Not OP, but I'll post what I've found:

  > vm_stat
  Mach Virtual Memory Statistics: (page size of 16384 bytes) 
  Pages free:                               28209.
  Pages active:                            114788.
  Pages inactive:                          111624.
  Pages speculative:                         1301.
  Pages throttled:                              0.
  Pages wired down:                         99067.
  Pages purgeable:                          11340.
  "Translation faults":                 340573628.
  Pages copy-on-write:                    8586888.
  Pages zero filled:                    184635739.
  Pages reactivated:                     62859723.
  Pages purged:                          12373577.
  File-backed pages:                        88759.
  Anonymous pages:                         138954.
  Pages stored in compressor:              645107.
  Pages occupied by compressor:            127459.
  Decompressions:                        62433647.
  Compressions:                          82841473.
  Pageins:                               10295559.
  Pageouts:                                177162.
  Swapins:                               16692043.
  Swapouts:                              17670838.

  > top -o faults
  PID    COMMAND      %CPU TIME     #TH    #WQ  #PORT MEM    PURG   CMPRS  PGRP  PPID  STATE    BOOSTS           %CPU_ME %CPU_OTHRS UID  FAULTS    COW     MSGSENT    MSGRECV    SYSBSD     SYSMACH
  533    WindowServer 10.5 06:21:59 21     5    2792- 876M-  209M+  298M   533   1     sleeping *0[1]            0.17876 1.03550    88   24771383+ 131690  354813610+ 136121938+ 237020876+ 554000668+
  2862   Safari       0.0  62:48.95 10     3    7051  480M   6400K  316M   2862  1     sleeping *0[37332]        0.00000 0.00000    501  13919243  76306   67003396   20901240   54405367+  167230473
  2883   com.apple.We 0.0  16:36.04 92     3    926   393M   384K   301M   2883  1     sleeping *19137[3734]     0.00000 0.00000    501  4032470   132     7204377    3215767    45641766+  32182159
  491    mds          0.0  11:00.86 5      2    422   66M    0B     52M    491   1     sleeping *0[1]            0.00000 0.00000    0    3296694   154     6396987    1450677    27494382   5594171
  792    mds_stores   0.0  14:00.57 4      2    93-   72M-   16K    61M    792   1     sleeping *0[1]            0.00000 0.00000    0    2786911   1956    4619344+   1304164+   15524959+  4296749+
  2694   Terminal     7.6  03:17.73 8      2    311   303M-  37M+   104M-  2694  1     sleeping *0[6352]         0.85412 0.15218    501  2034254+  338     1151091+   265791+    1517195+   269400

I'm at 1.32TB on my almost 1 month old MBA (8GB RAM/512GB DISK).
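
To put the swap counters in perspective, here is a rough sketch that parses `vm_stat` output and turns Swapins/Swapouts into bytes. It assumes those counters are in pages and accumulate since boot; swapped pages go through the compressor first, so treat the result as an upper bound on what actually hit the disk. The sample text is trimmed from the output above.

```python
import re

# Trimmed copy of the vm_stat output posted above.
SAMPLE = """\
Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pageins:                               10295559.
Pageouts:                                177162.
Swapins:                               16692043.
Swapouts:                              17670838.
"""


def parse_vm_stat(text):
    """Return (page_size, counters) parsed from vm_stat output."""
    page_size = int(re.search(r"page size of (\d+) bytes", text).group(1))
    counters = {}
    for line in text.splitlines():
        # Lines look like: Name:   12345.   (the name may be quoted)
        m = re.match(r'\s*"?([^":]+)"?:\s+(\d+)\.\s*$', line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return page_size, counters


page_size, counters = parse_vm_stat(SAMPLE)
# Assumption: the counters are page counts, so bytes = pages * page size.
swapped_out_gb = counters["Swapouts"] * page_size / 1e9
swapped_in_gb = counters["Swapins"] * page_size / 1e9
print(f"swapped out: up to {swapped_out_gb:.1f} GB")  # up to 289.5 GB
print(f"swapped in:  up to {swapped_in_gb:.1f} GB")   # up to 273.5 GB
```

Under those assumptions, swap would account for at most a few hundred GB since the last boot.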

serge-ivamov · on Feb 24, 2021

my info:

  Mach Virtual Memory Statistics: (page size of 16384 bytes)
  Pages free:                               14959.
  Pages active:                            405692.
  Pages inactive:                          374907.
  Pages speculative:                        28768.
  Pages throttled:                              0.
  Pages wired down:                         91521.
  Pages purgeable:                           3805.
  "Translation faults":                  24292564.
  Pages copy-on-write:                     729689.
  Pages zero filled:                     14619003.
  Pages reactivated:                       712889.
  Pages purged:                            265424.
  File-backed pages:                       361509.
  Anonymous pages:                         447858.
  Pages stored in compressor:              270669.
  Pages occupied by compressor:             93710.
  Decompressions:                          405581.
  Compressions:                            790223.
  Pageins:                                1093581.
  Pageouts:                                  3406.
  Swapins:                                      0.
  Swapouts:                                     0.

  > top -o faults

  PID   COMMAND      %CPU TIME     #TH   #WQ  #PORT MEM    PURG   CMPRS PGRP PPID STATE    BOOSTS          %CPU_ME %CPU_OTHRS UID  FAULTS   COW   MSGSENT   MSGRECV   SYSBSD    SYSMACH    CSW        PAGEIN
  596   firefox      1.8  25:24.36 101   3    4607  1050M- 24M    197M  596  1    sleeping *0[2679]        0.61835 0.00000    501  1801083+ 15982 30520748+ 10145476+ 37431119+ 74055473+  25506295+  10499
  136   WindowServer 3.8  34:09.24 17    5    2570- 1069M- 3008K+ 84M   136  1    sleeping *0[1]           0.04109 0.69237    88   1554555+ 29134 88175354+ 30598736+ 69071968+ 130099954+ 19109478+  1791
  734   Textual      0.0  01:16.23 9     1    858   144M   16K    25M   734  1    sleeping  0[2051]        0.00000 0.00000    501  833293   1717  745088    111931    478143    1563730    470159     2023
  2195  Xcode        0.0  01:43.32 14    1    856   278M   144K   114M  2195 1    sleeping  0[1168]        0.00000 0.00000    501  701255   3278  1626845   546420    1139890+  1959450    732276+    148341
  2650  Simulator    0.0  01:05.27 4     2    267   27M    0B     8976K 2650 1    sleeping *0[1402]        0.00000 0.00769    501  695662+  211   755586+   680393+   1078007+  2178634+   1065718+   134
  539   Terminal     0.6  01:01.28 8     2    304   107M   25M    16M   539  1    sleeping *0[1375+]       0.06887 0.02264    501  582200+  463   520635+   71597+    365523+   1208051+   360370+    1522
  2209  SourceKitSer 0.0  00:25.22 2     1    21    714M   0B     468M  2209 1    sleeping  0[672]         0.00000 0.00000    501  473966   51040 3676      1103      1047453   5748       17114      12740
  653   Microsoft Re 1.6  13:51.17 33    7    462   536M   12M    148M  653  1    sleeping *0[845]         0.00000 0.00000    501  454751   21517 9976456+  1770226+  11942684+ 47364219+  15135688+  82
  602   plugin-conta 0.0  01:01.70 39    1    278   362M   0B     87M   596  596  sleeping *1[3]           0.00000 0.00000    501  407089   2481  32035     12156     2744155   110109     996847     155
  3576  GarageBand   2.7  09:19.30 23    2    822   1088M  16K    664M  3576 1    sleeping *0[288]         0.01295 0.00000    501  266176   2365  7887976+  76435     9436303+  19319436+  18860820+  930
  292   mds_stores   0.2  01:18.32 5     3    95    24M+   16K    9056K 292  1    sleeping *0[1]           0.00000 0.17381    0    252741+  93    158368+   58900+    1418145+  178062+    301682+    42331
  2305  lldb-rpc-ser 0.0  00:18.56 4     1    62    1049M  0B     201M  2305 2195 sleeping *0[3]           0.00000 0.00000    501  213604   1380  575628    287827    710177    307114     244010     32467

No swap at all, but I don't use sleep mode on my system; uptime is ~10h. No Rosetta apps either. 16GB/500GB M1 Mac mini.

Looks like Safari and WindowServer use swap all the time. Try shutting down at least once per day; maybe it will help.

jonplackett · on Feb 24, 2021

I'm curious: is there any correlation with having chosen the 8GB RAM version?

There must be a lot of swapping happening to make that work.