If I'm being strictly accurate, I guess that I should acknowledge that I'm not talking about S3 (STR or whatever) but S4 or S5 (depending on how you define those). That is, I am trying to resume from hibernate and a disk-stored memory image, so there are elements of the disk subsystem involved as well.
As far as memory is concerned, I do not see any difference between the 3x2GB OCZ memory I was using and the 3x4GB Corsair that's now in there. Same level of unreliability. In all cases, I have been running 1600MHz memory at 1066 but keeping the memory timings at the recommended "1600" values, so the settings should be very conservative at 1066.
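To put a rough number on "conservative": a timing is set in clock cycles, so the same cycle count at a lower clock gives the memory more real time. Here is a quick sketch of that arithmetic. The CAS value of 9 is only an assumed example (I haven't quoted my actual timings), and latency_ns is just a helper name made up for this illustration.

```python
def latency_ns(cycles: int, data_rate_mts: float) -> float:
    """Convert a timing given in clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so the actual clock
    in MHz is half of the advertised data rate (e.g. "1600").
    """
    clock_mhz = data_rate_mts / 2.0
    return cycles * 1000.0 / clock_mhz


# ASSUMPTION: CL9 is a typical rating for a 1600 kit, not a value from my system.
CAS_CYCLES = 9

for rate in (1600, 1066):
    print(f"CL{CAS_CYCLES} at {rate}: {latency_ns(CAS_CYCLES, rate):.2f} ns")
```

Under that assumption the memory gets roughly half as much time again per timing interval at 1066 as it is rated for at 1600, which is why I'd expect timing margins not to be the problem.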
I have also been tracking the beta BIOSes; I believe that I'm running the latest from Tweaktown (FH5, with the latest Intel RAID firmware included) but I haven't seen any difference with any BIOS since the one that was delivered with the system originally. So to me that says that either it's not BIOS, or it's a BIOS issue that hasn't been fixed yet. I think that I've run FG, FH1, FH3, FH4, and now FH5. I ran some others earlier than that but I don't have my notes in front of me.
I can confirm that disabling ErP in the BIOS (to do with low-power behaviour in S5) doesn't help. Worked OK for a dozen or so hibernate/resume cycles over 2-3 days, then hung again.
Trouble is, of course, that we all have slightly different configurations. I run a GTX460, plus 4 disks on Intel RAID5, plus system disk on the JMB(?) controller.
If we're down to comparing gut feelings, my money is on some kind of memory corruption, either while the memory image is being written back from disk or maybe from a hardware glitch during initialisation. This seems consistent with the various stopcodes that get thrown up, and with the hangs at various times during resume (e.g. while the "resuming..." window is displayed) or shortly afterwards. It depends on when the randomly corrupted memory location concerned gets executed. Hence, sometimes the hang or crash happens before the hibernate file-to-memory restore has finished, so the hibernate file is still valid (and the fact that I can usually resume from it suggests that the actual disk image is good), but occasionally the crash doesn't happen until after the restore is flagged as complete, so a Windows restart is needed. I really did like that idea of memory voltage issues, but now that I'm still seeing the problem with genuine 1.5V-rated memory, I guess that one is a non-starter.
I'm inclined to discount hardware initialisation, as hangs sometimes happen before the resume has finished and so, presumably, before Windows has let the drivers reinitialise their respective hardware. But this is making rather a lot of assumptions about internal Windows driver behaviour that might not be valid.
Of course, the mechanism might be completely different for the case of S3 resume problems...