I'm experiencing a rare issue of data corruption on my newly-purchased GA-970A-D3 v1.1.
For roughly every 5 GB I copy, I occasionally get around 1-4 miscopied bytes.
The strange thing is that the mismatched bytes are always off by one from their intended value. Say, if it was supposed to write hex F3, it may write F4 or F2.
Also strange is that it doesn't always happen. Yesterday I managed to copy about 30 GB with no errors.
Occasionally, when performing intensive storage operations, like several simultaneous copy operations or copy+verify, the system may BSOD or crash altogether.
On the other hand, apart from the storage issues the system seems stable, and I'm even able to play demanding 3D games with no stability issues.
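A mismatch pattern like the one described above can be pinned down by comparing the source and the copy byte by byte and listing each differing offset with its expected and actual value. The following is only a minimal Python sketch: the function name `diff_bytes` and the `limit` parameter are hypothetical, and it assumes both files have the same length (a length difference is not reported).

```python
CHUNK = 1024 * 1024  # read 1 MiB at a time so large files fit in memory


def diff_bytes(src, dst, limit=100):
    """Return up to `limit` (offset, expected, actual) tuples where dst differs from src.

    Assumption: both files have the same length; trailing extra bytes are ignored.
    """
    mismatches = []
    offset = 0
    with open(src, "rb") as a, open(dst, "rb") as b:
        while len(mismatches) < limit:
            chunk_a = a.read(CHUNK)
            chunk_b = b.read(CHUNK)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                # Only walk the chunk byte by byte when it actually differs.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        mismatches.append((offset + i, x, y))
                        if len(mismatches) >= limit:
                            break
            offset += max(len(chunk_a), len(chunk_b))
    return mismatches
```

Printing each tuple as `f"{off:#x}: {exp:02X} -> {act:02X} (delta {act - exp:+d})"` would show at a glance whether every error really is a +1 or -1 difference, and whether the offsets share any alignment.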
My setup is:
- GA-970A-D3 v1.1
- AMD FX-4170 (125W)
- 2 x GSkill DDR3 1600 2 GB modules in dual channel (slots 1 and 3)
- Samsung Spinpoint F3 1 TB SATA2 HDD
- Crucial M4 128 GB SATA3 SSD
- GeForce GTX560
- Thermaltake TR2 600W PSU
I don't overclock. Anything.
Having an SSD and an NCQ-enabled HDD, I'm obviously running the SATA controllers in AHCI mode, which I'm coming to think may be the culprit.
Now, here's stuff I've ruled out:
- Hard drives: the same thing happens on both my SSD and HDD. USB drives seem unaffected.
- SATA cables: replaced the cables for both the SSD and HDD.
- SATA connectors: tried switching the cables around the six SATA connectors in the motherboard.
- Memory: ran Memtest86 for a whole night, no issues. Also tried swapping all of my memory with other modules just in case.
- Processor: ran Prime95 for a whole night, no issues, temperature stays below 55 °C.
- Operating system: did clean install of Windows 7 and Windows 8, installing no software whatsoever before doing the tests.
- Software issues: corruption has been verified by WinRAR, uTorrent, CDCheck, and commandline FC /B.
- AHCI drivers: tried both the Microsoft built-in ones and the AMD-provided drivers.
- BIOS: tried reverting to version F10, as F11c is marked as "beta BIOS" and coincidentally updated the AHCI BIOS.
- CMOS settings: reset several times.
- PSU: tried a brand new one.
- Video: tried a GT210 to make sure the GTX560 wasn't too much of a power hog.
- Power connections: triple-checked all motherboard, video, processor and drives' power connectors to make sure they were properly seated.
- Case: tried running the PC without it to rule out interference or any short-circuits in the mountings or whatever.
- Memory running at 1600 MHz: ran the modules at 1333 MHz, the default after CMOS reset.
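The copy-and-verify tests in the list above could be repeated unattended with a short script, which makes it easier to compare failure rates between configurations (for example, AHCI enabled versus disabled). This is a minimal Python sketch using only the standard library; the names `sha256_of` and `copy_and_verify` are hypothetical, and it assumes the test file is larger than the installed RAM, since otherwise the verification read may be served from the OS file cache rather than from the drive.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst, passes=1):
    """Copy src to dst `passes` times; return how many passes produced a different hash.

    Assumption: the first read of src is taken as the reference. If reads can also be
    corrupted, hash src several times first and check that the results agree.
    """
    expected = sha256_of(src)
    failures = 0
    for _ in range(passes):
        shutil.copyfile(src, dst)
        if sha256_of(dst) != expected:
            failures += 1
    return failures
```

For example, `copy_and_verify("D:/test/big.bin", "C:/test/big.bin", passes=10)` (both paths are placeholders) returns the number of bad copies out of ten.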
I tried to RMA the motherboard. The technician tested it in front of me and seemed able to copy a 16 GB load with no errors. When I got home, however, I saw that he had reset the CMOS and had been running with AHCI disabled, which would support my suspicion that I'm a victim of an AHCI bug in the chipset.
I have yet to perform conclusive testing with AHCI disabled (will do it later today), but I'm running out of things to try.
If this were the case, what would my options be?
Running without AHCI is obviously not a viable solution.
In any case, any suggestions or further things to try will be greatly welcome.