Now let's take a look at the impact of having 50% of the disk occupied by a static file while the rest is written to repeatedly. What happens to write amplification - does the rate go up as the controller has to move data around to ensure the flash wears evenly? And by how much - do the figures indicate a high degree of tolerance for differing wear levels on individual blocks before background data movement takes place? We're getting into the nitty-gritty of the controller's behaviour and will see just how well it copes under reasonable stress. We'll start by looking at two sets of SMART stats, taken either side of a 1% drop in the wear levelling count:
ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE     RAW_VALUE
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail 0
  9 Power_On_Hours          099   099   000    Old_age  1371
 12 Power_Cycle_Count       099   099   000    Old_age  160
177 Wear_Leveling_Count     091   091   000    Pre-fail 485
179 Used_Rsvd_Blk_Cnt_Tot   100   100   010    Pre-fail 0
181 Program_Fail_Cnt_Total  100   100   010    Old_age  0
182 Erase_Fail_Count_Total  100   100   010    Old_age  0
183 Runtime_Bad_Block       100   100   010    Pre-fail 0
187 Reported_Uncorrect      100   100   000    Old_age  0
190 Airflow_Temperature_Cel 063   060   000    Old_age  37
195 Hardware_ECC_Recovered  200   200   000    Old_age  0
199 UDMA_CRC_Error_Count    100   100   000    Old_age  0
235 Unknown_Attribute       099   099   000    Old_age  28
241 Total_LBAs_Written      099   099   000    Old_age  1030221277903
ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE     RAW_VALUE
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail 0
  9 Power_On_Hours          099   099   000    Old_age  1436
 12 Power_Cycle_Count       099   099   000    Old_age  160
177 Wear_Leveling_Count     090   090   000    Pre-fail 546
179 Used_Rsvd_Blk_Cnt_Tot   100   100   010    Pre-fail 0
181 Program_Fail_Cnt_Total  100   100   010    Old_age  0
182 Erase_Fail_Count_Total  100   100   010    Old_age  0
183 Runtime_Bad_Block       100   100   010    Pre-fail 0
187 Reported_Uncorrect      100   100   000    Old_age  0
190 Airflow_Temperature_Cel 065   060   000    Old_age  35
195 Hardware_ECC_Recovered  200   200   000    Old_age  0
199 UDMA_CRC_Error_Count    100   100   000    Old_age  0
235 Unknown_Attribute       099   099   000    Old_age  28
241 Total_LBAs_Written      099   099   000    Old_age  1155002071365
The volume of data written to cause this drop (taken from the difference in Total_LBAs_Written, where each LBA is 512 bytes) is:
(1155002071365 - 1030221277903) * 512 = 63.89TB.
This is a little less than the 66.25TB measured previously, but still gives an estimated life for the disk of nearly 6.4PBW (extrapolating the 1% step linearly across the full range of the wear levelling value).
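The arithmetic is simple enough to script. Here's a minimal Python sketch of the calculation - the counter values are hard-coded from the snapshots above, and the 512-byte LBA size and the straight-line extrapolation to 100% are assumptions carried over from the text:

# Total_LBAs_Written raw values from the two SMART snapshots above.
lbas_before = 1030221277903
lbas_after  = 1155002071365
LBA_SIZE    = 512  # bytes per LBA (assumed, as in the text)

# Host data written across the 1% drop in the wear levelling value.
bytes_written = (lbas_after - lbas_before) * LBA_SIZE
print(f"Written for 1% wear: {bytes_written / 1e12:.2f} TB")  # ~63.89 TB

# Straight-line extrapolation over the 100 points of the normalised value.
print(f"Estimated endurance: {bytes_written * 100 / 1e15:.1f} PBW")  # ~6.4 PBW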
Turning now to write amplification, we'll look at two more sets of SMART stats.
After run 511:
ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE     RAW_VALUE
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail 0
  9 Power_On_Hours          099   099   000    Old_age  1347
 12 Power_Cycle_Count       099   099   000    Old_age  160
177 Wear_Leveling_Count     092   092   000    Pre-fail 464
179 Used_Rsvd_Blk_Cnt_Tot   100   100   010    Pre-fail 0
181 Program_Fail_Cnt_Total  100   100   010    Old_age  0
182 Erase_Fail_Count_Total  100   100   010    Old_age  0
183 Runtime_Bad_Block       100   100   010    Pre-fail 0
187 Reported_Uncorrect      100   100   000    Old_age  0
190 Airflow_Temperature_Cel 063   060   000    Old_age  37
195 Hardware_ECC_Recovered  200   200   000    Old_age  0
199 UDMA_CRC_Error_Count    100   100   000    Old_age  0
235 Unknown_Attribute       099   099   000    Old_age  28
241 Total_LBAs_Written      099   099   000    Old_age  985300191097
After run 806:
ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE     RAW_VALUE
  5 Reallocated_Sector_Ct   100   100   010    Pre-fail 0
  9 Power_On_Hours          099   099   000    Old_age  1503
 12 Power_Cycle_Count       099   099   000    Old_age  160
177 Wear_Leveling_Count     089   089   000    Pre-fail 609
179 Used_Rsvd_Blk_Cnt_Tot   100   100   010    Pre-fail 0
181 Program_Fail_Cnt_Total  100   100   010    Old_age  0
182 Erase_Fail_Count_Total  100   100   010    Old_age  0
183 Runtime_Bad_Block       100   100   010    Pre-fail 0
187 Reported_Uncorrect      100   100   000    Old_age  0
190 Airflow_Temperature_Cel 067   060   000    Old_age  33
195 Hardware_ECC_Recovered  200   200   000    Old_age  0
199 UDMA_CRC_Error_Count    100   100   000    Old_age  0
235 Unknown_Attribute       099   099   000    Old_age  28
241 Total_LBAs_Written      099   099   000    Old_age  1279782869530
Each test run writes 476GiB, so we can calculate the total data written during this period of the test from the number of test runs completed:
(806 - 511) * 476GiB / 1024 = 137.13TiB.
This matches exactly what the SSD itself has recorded in 'Total LBAs Written':
(1279782869530 - 985300191097) * 512 / (1024 * 1024 * 1024 * 1024) = 137.13TiB.
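The same cross-check can be expressed in a few lines of Python - the run numbers, the 476GiB-per-run figure and the LBA counters are all taken from above:

# Host writes computed from completed test runs (476GiB per run).
runs = 806 - 511
GiB, TiB = 1024**3, 1024**4
from_runs = runs * 476 * GiB / TiB
print(f"From run count: {from_runs:.2f} TiB")  # ~137.13 TiB

# Host writes as recorded by the drive's Total_LBAs_Written counter.
lba_delta  = 1279782869530 - 985300191097
from_smart = lba_delta * 512 / TiB
print(f"From SMART:     {from_smart:.2f} TiB")  # ~137.13 TiB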
Over the same period, the wear levelling count has gone up from 464 to 609, an increase of 145. From the results in our previous post, 'Estimating Reserved Space in the SSD', each count corresponds to roughly 0.9885TiB written to the flash, so this equates to:
145 * 0.9885TiB = 143.3325TiB
And from this we can work out the write amplification factor:
143.3325TiB / 137.13TiB = 1.045.
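Putting the last two steps together, here is the whole write amplification calculation as a Python sketch - the 0.9885TiB-per-count figure is the estimate from the earlier reserved-space post, so it inherits that post's assumptions:

# NAND writes inferred from the wear levelling counter.
wear_delta    = 609 - 464     # increase in raw Wear_Leveling_Count
TIB_PER_COUNT = 0.9885        # TiB of flash writes per count (earlier estimate)
nand_writes   = wear_delta * TIB_PER_COUNT                    # ~143.33 TiB

# Host writes over the same period, from Total_LBAs_Written.
host_writes = (1279782869530 - 985300191097) * 512 / 1024**4  # ~137.13 TiB

print(f"Write amplification factor: {nand_writes / host_writes:.3f}")  # ~1.045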
This is a tiny level of write amplification, and indicates that the controller is doing a fantastic job of minimising the impact of a large static file. The controller is clearly tolerating widely varying levels of wear across the SSD, moving static data only when it really needs to in order to keep wear even. It will be interesting to compare this result with the behaviour when the disk is 85% full - coming soon!