Re: WD Blue 510 SSD and strange write performance

From: mike tancsa <mike_at_sentex.net>
Date: Sun, 17 Mar 2024 12:03:44 UTC

On 3/17/2024 4:32 AM, Andrea Venturoli wrote:
> On 3/15/24 19:17, mike tancsa wrote:
>
>> (da5:mpr0:0:15:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, 
>> reset, or bus device reset occurred)
>
> Hello.
> I know I'm probably blaming the wrong component, but is your PSU up to 
> the task?
> How many drives do you have? Are they power-hungrier than the others 
> you tried (Samsung ???)?
> Do you have a spare PSU to test/add?
>
> Probably this is not the cause... still, before you bid farewell to 
> 400 bucks...
>

hehe, thanks Andrea :)  I too don't want to be out the money. The power 
supply is certainly a good thing to check. In this case, the main server 
chassis has a pair of redundant 1000W power supplies sized to handle 12 
full HDDs, so I am pretty sure 6 SSDs should not stress it. I also had 
2 other test boxes on the bench, and the one common variable seems to 
be the WDs.

I feel like this is a sunk cost I am pushing myself into, but I did do 
some more testing.  My co-worker came across this post, which was 
interesting:

https://forum.hddguru.com/viewtopic.php?f=10&t=43284

The very last entry says

"For WD BLUE SA 510 there are some problems with this type of SSD. This 
YODA model
To fix the SSD if it is still recognized, use the firmware update tools.
And then do a secure erase or full wipe of the SSD. After this it will 
work well. I can give you a link to this utility if it necessary. Also 
possible download it from manufacture FTP.
If it is not recognized by the computer or is identified as a SSD 
device, there only one way, use production tools with new firmware to 
begin the production process by testing the controller and NAND chip and 
forming a translator. The SSD will be like brand new.
"

After I did the erase, the tests worked for a good 5 cycles and 
performance was much smoother and more consistent. But then the drives 
started to fail again.  So I really wonder if TRIM has something to do 
with it, as my test is essentially writing a 250G data set with about 
28 million txt files, destroying the dataset, and then copying it again.
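
A minimal Python sketch of that kind of write/destroy/copy cycle follows. 
It is not the script used for the tests above. The function names 
(make_dataset, run_cycles), the file sizes and the directory layout are 
all hypothetical, and it removes the copy with a plain recursive delete 
rather than destroying a filesystem dataset. Whether the deletes reach 
the drive as TRIM depends on the filesystem and its settings, and 
nothing here forces a sync to disk.

```python
import shutil
import tempfile
import time
from pathlib import Path


def make_dataset(root: Path, n_files: int, size: int = 1024) -> None:
    """Create n_files small .txt files under root, spread over 16 subdirectories."""
    payload = "x" * size  # placeholder content; real data would vary
    for i in range(n_files):
        sub = root / f"d{i % 16:02d}"
        sub.mkdir(parents=True, exist_ok=True)
        (sub / f"f{i:08d}.txt").write_text(payload)


def run_cycles(src: Path, dst: Path, cycles: int) -> list[tuple[float, float]]:
    """Copy src to dst, then delete dst, `cycles` times.

    Returns one (copy_seconds, delete_seconds) pair per cycle so that a
    slowdown across cycles is easy to spot.
    """
    timings = []
    for _ in range(cycles):
        t0 = time.monotonic()
        shutil.copytree(src, dst)   # the "copy the data set" step
        t1 = time.monotonic()
        shutil.rmtree(dst)          # stand-in for destroying the dataset
        t2 = time.monotonic()
        timings.append((t1 - t0, t2 - t1))
    return timings


if __name__ == "__main__":
    # Tiny demo in a temporary directory; point src/dst at the drive
    # under test and raise n_files for a real run.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        make_dataset(src, n_files=1000)
        for n, (c, d) in enumerate(run_cycles(src, Path(tmp) / "copy", cycles=3), 1):
            print(f"cycle {n}: copy {c:.3f}s, delete {d:.3f}s")
```

Comparing the per-cycle timings is one way to see whether performance 
degrades after a fixed number of cycles, as it did here after about 5.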

I noticed these 2 commits for other drives. I wonder if the WD is having 
similar issues.

https://cgit.freebsd.org/src/commit/?h=stable/14&id=bf11fee6a5cf97102f87695185cadb63d5a2a7de
and
https://cgit.freebsd.org/src/commit/?h=stable/14&id=50aa22323424ccea00ef5d8f24e729a480cc77eb

I hope you don't mind my bcc'ing you, Andriy.  I noticed you only added 
the NCQ quirks for CAM ata and not for CAM scsi. I am running into odd 
issues with some WD drives and am wondering whether these WD SA 510 
drives share the same root limitation as the Samsungs. However, in my 
use of the Samsungs I have not been able to trigger these bugs so far.

     ---Mike