Official GIGABYTE Forum

Help my understanding please: GA-870A-UD3

« on: March 21, 2011, 07:54:15 pm »
I am trying to understand the behavior of my computer cooling system. I have the GA-870A-UD3 motherboard with an AMD Phenom II 965BE quad core CPU on stock voltages and frequencies. This computer was built for distributed computing so it runs flat out at 100% 24/7. I just recently upgraded the CPU cooler to a Coolermaster Hyper212+. The case is the Coolermaster HAF922 with two 200mm fans and one 120mm fan. The CPU cooler sits right underneath the top 200mm fan so I oriented the new cooler to blow upward toward the top fan. I also have two GTX460 GPUs in the computer.

Now on to the behavior I am trying to understand. I have used OpenHardware Monitor, CoreTemp and CPUID HWMonitor to monitor the temperature readouts from the motherboard CPU Temp sensor and also the DTS Core sensor on the CPU. Generally the DTS CPU sensor runs in the 42 - 45 degree range at full load all the time. The thing I am trying to understand is the readout from the motherboard CPU sensor. Depending on the mix of distributed work currently running on the CPU and GPUs, the CPU temp sensor can either track the DTS sensor very closely, within 1-2 degrees, or differ from it by as much as 20-25 degrees. I just recently watched it hit 71 degrees for a bit while the CPU was working on MilkyWay@Home tasks. Meanwhile the DTS sensor was still running at only about 45 degrees. I have the BIOS temp warning set at 70 degrees and of course it went off while the CPU temp was at the 70 degree level.

The question I have is just which temperature readout controls the CPU Fan header. Based on observing the CPU cooler fan racing up to its max speed, I have to assume the CPU Fan header is controlled by the motherboard CPU Temp sensor and not the CPU DTS sensor. Given the supposed accuracy of the DTS sensor in tracking how much headroom is left in the chip's total power dissipation, I would have thought the CPU fan control would track the DTS sensor. This appears not to be the case. Can I really believe the readings from the DTS sensor? The CPU is supposed to be kept under 62 degrees from what I read in the specs. I am worried that the CPU really is hitting 70 degrees when it gets occupied by a very compute intensive workload. So which sensor do I believe? The motherboard CPU temp sensor or the DTS sensor? It's not that one sensor is always wildly off from the other, which would make it easy to discount one reading or the other. Right now as I type, both sensors are within a half degree of each other at 43.5 degrees.

Has anyone else seen this type of behavior on their distributed computing platforms? Is it normal and of no consequence? How do you have the fan control and warnings set in the BIOS for this motherboard? What do you recommend?

Cheers, and thanks for taking the time to read this and reply.

Keith :-\
« Last Edit: March 23, 2011, 01:00:16 pm by runn3R »

Re: Help my understanding please
« Reply #1 on: March 21, 2011, 07:58:51 pm »
A follow-up question to my own post. Which sensor controls the automatic CPU throttling in the BIOS when the CPU gets into power dissipation trouble? Motherboard or DTS sensor?

Keith

absic

Re: Help my understanding please
« Reply #2 on: March 21, 2011, 08:50:32 pm »
Hi there,

There do seem to be some issues with different software and the way it reports the CPU temps. Officially Gigabyte say to use EasyTune6 for monitoring temps whilst AMD say to use AMD Overdrive so no conflicts there!  :P

Normally the most accurate temps being reported are through BIOS but even this can be less than perfect, which leaves the problem of what to do and what to trust. Although I don't have your CPU/Mobo combination, I have compared many different options, both hardware and software, for monitoring CPU temps and have not actually found anything that I trust totally. For the most part I use Core Temp as this seems to be the best of the bunch but even this can be glitchy.

The running temps, under load, that your software seems to be reporting are actually the optimum for the Phenom II CPUs but if you really are peaking at over 71C then you could have issues. If you are running your CPU at stock settings you can try dropping the voltage a little as that would help lower temps but it might introduce instability if you pull it back too far.

With regard to your second question you might do better actually asking GGTS here: http://www.gigabyte.com/support-downloads/technical-support.aspx just choose motherboard and your Country of residence from the drop down list. It could take them 7 - 10 days to answer your question so don't be surprised if you don't hear anything immediately.

In the meantime I will do some more digging and see if I can come up with a definitive answer for you but hopefully someone else might know and reply!
Remember, when all else fails a cup of tea and a good swear will often help! It won't solve the problem but it will make you feel better.

Re: Help my understanding please
« Reply #3 on: March 21, 2011, 09:55:52 pm »
Thanks for the reply absic. I tried to use the EasyTune6 software but it just messed up all the fans and temps with incorrect control. As I am not trying to overclock or tune up the computer (stock is fine for stability), I mostly just use OpenHardware because it has a nice gadget that lets me see all the pertinent temps, fan speeds and CPU/GPU utilization at a glance.

Not sure I made myself clear when I said I hit 71 degrees on the motherboard sensor. I have never seen DTS temps higher than 45 degrees, even when the motherboard sensor was hitting 70 degrees and the CPU cooler fan was maxed out at 2000 rpm. At least with this new cooler I am not bothered by the racing noise like I was with the stock cooler when the CPUs get cranked by MW work. The two sensors usually track each other unless I get a specific mix of work, mostly the MilkyWay@Home 0.50 separation tasks. For some reason they work the CPU cores very heavily compared to other projects' work.

I also have never seen temps higher than about 45 degrees on the CPU cooler base when I take a reading with my IR temp gun. That reading is taken while the motherboard CPU temp sensor is reporting 70 degrees but the DTS sensor is reporting 45 degrees. As an aside, I have seen very good accuracy and repeatability of readings with the IR gun against the sensor readings, except for the Northbridge sensor of course, which inaccurately reports a constant 81 degrees. I believe this is a well known symptom of this motherboard and since I don't measure more than about 55 degrees on the Northbridge heat sink, I don't worry about it.

I'll go and post the question about which sensor controls the throttling on the Tech Support page as you suggest. I have only once seen a throttling event and I couldn't pin it on any abnormal temp. I think the system got confused for a while, as I simply rebooted and continued on with the same workload and the CPU speed stayed at maximum frequency as normal. Thanks again for the reply.

Keith
 ;D

bytheway_r

Re: Help my understanding please
« Reply #4 on: March 21, 2011, 10:07:27 pm »
Your situation is pretty strange. If you can live without the CPU for a while it'd probably be replaced through RMA. Otherwise, if you don't mind that your temp sensors are most likely faulty then it's fine. With that case and cooler you won't exceed recommended temps running stock settings ( as long as your ambient stays within reason ).

Normally, your CPU temp should be lower than your Core temps by up to about 10 degrees under load ( the difference can be even bigger sometimes ). Now, the 62 degrees limit is just a precaution on AMD's part. Firstly, temp sensors can be terribly inaccurate and secondly, some people will push their chips way beyond limits, so it's best to set the limits conservatively. The truth is, there should be no damage to the chip below about 90 degrees and it'll most likely shut down by itself before reaching that temperature. Even if the 71C reported were real, taking inaccuracy into account, it should still be below 90 degrees.

I say faulty sensor, though. I'm not about to believe that CPU temp could be over 20 degrees more than cores.

As for fan speed and probably all CPU related behaviour, it's controlled by CPU temp.

Lastly, I'm rather certain that the CPU sensor is located on the CPU itself and not the motherboard ( word is, it's just below the heatsink contact area ).

Re: Help my understanding please
« Reply #5 on: March 21, 2011, 11:19:41 pm »
Hi bytheway_r. I'm confused by your commentary. You mostly say that you think the motherboard sensor is faulty, yet recommend that the CPU chip be RMA'd. My logic says that the motherboard should be RMA'd. Also, from what I have read from many sources in my attempt to understand what is going on with my computer, the CPU Temp sensor reported in the BIOS and exposed through the I/O monitoring chip is reading a thermal sensor usually embedded directly under the CPU socket. When I installed the backplate for the new CPU cooler, I had a good look at this area since the backplate window had to be centered around the various surface mount chips. I am pretty familiar with chip technology and componentry as I have worked with the hardware for 30 odd years or so. I thought I identified the likely SMD thermistor under the socket area. Maybe not.

I understand that the Core DTS (Digital Thermal Sensor) temp sensors are actually on the chip die and have their own register. It's the behavior of these two disparate sensors that I am trying to grasp. From what I have read, the AMD Phenom II chips have only one DTS sensor on die, so even though monitoring programs will list Core temps for all cores, they will all read the same. I would most suspect the inaccuracy in the temp readings to be from the motherboard thermistor under the socket. Since it is an analog device, it is easily damaged and its thermal coefficient can change over time or through extreme temperature events. Unless you have the ability to calibrate the temp readout through use of an offset, all one can hope for is that the original chip calibration mostly stays the same as at assembly time.

I would agree with your assessment that my system is pretty strange. Thus the question to the experts. For 75% of the time, the two different CPU temp sensors exposed by monitoring software agree within a degree of each other. I wonder if your commentary about the CPU temp being normally lower than core temps is based on the typical user, whose CPU is at 100% usage only when they are playing games, and then just for small periods of time. My temps are very stable because from the first instant the computer is turned on, it goes immediately to 100% utilization and just stays there. Within about 2 minutes or so the temps have stabilized, and I believe the system has reached thermal equilibrium. It's just when I hit this particular workload from the MilkyWay project that I get the divergent temps. Thanks for the confirmation of my deduction that fan speed is controlled by the motherboard CPU temp sensor. That certainly is what I observe when the motherboard CPU temp sensor takes off northward of 45 degrees and the CPU fan speed scales accordingly.
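
For anyone wanting to put a number on the "agree within a degree most of the time" observation, here is a minimal sketch in Python. It assumes you have logged pairs of readings (motherboard CPU temp, DTS temp) from your monitoring software. The function name, the 1 degree tolerance and the sample values are all made up for illustration and are not measurements from this system.

```python
def split_by_agreement(samples, tolerance_c=1.0):
    """Split (board_c, dts_c) pairs into agreeing and divergent readings.

    Hypothetical helper: a pair "agrees" when the two sensors are within
    tolerance_c degrees of each other.
    """
    agree, diverge = [], []
    for board_c, dts_c in samples:
        if abs(board_c - dts_c) <= tolerance_c:
            agree.append((board_c, dts_c))
        else:
            diverge.append((board_c, dts_c))
    return agree, diverge


# Illustrative values only, loosely shaped like the readings described above.
samples = [(43.5, 43.0), (44.0, 44.5), (45.0, 45.0), (71.0, 45.0)]
agree, diverge = split_by_agreement(samples)
share = len(agree) / len(samples)
print(f"{share:.0%} of samples agree; divergent pairs: {diverge}")
```

Logging a timestamp and the running project alongside each pair would make it easier to tie the divergent readings to a particular workload.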

Cheers,  Keith
 :D

P.S. My idle temps are only about a degree or so above ambient room temps, usually around 20-22 degrees.  The new CPU cooler lowered my full load temps about 6 degrees from my previous cooler.

bytheway_r

Re: Help my understanding please
« Reply #6 on: March 22, 2011, 12:33:34 am »
I'm confused myself. I'm trying to find something even remotely definitive regarding sensors on Phenom II's and how they work. No such luck. There are as many ideas out there as you can think of. The only piece of info I've found so far is this:

Quote
1) The previous errata for Phenom IIs reported that several of their AM3 chips (including the X6) had faulty temperature sensors. The latest errata "corrected" that and now states that only AM2 chips have faulty temp sensors. Huh.

2) There's no way to get core temps on AMD chips. The temperature reported by coretemp is what AMD calls "tctl". Here's what AMD says about tctl:
http://support.amd.com/us/Processor_TechDocs/41256.pdf
Quote:
Tctl is a non-physical temperature on an arbitrary scale measured in degrees. It does not represent an actual physical temperature like die or case temperature.
For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is [Tctl_max - Tctl] degrees under the temperature for which maximum cooling is required.
For Tctl = Tctl_max to 255.875: the temperature of the part is [Tctl - Tctl_max] degrees over the worst-case expected temperature under normal conditions.
The default value of the HTC temperature threshold (Tctl_max) is specified in the Power and Thermal Datasheet.
Now, in the Power and Thermal Datasheet, Tctl Max is the same as Tcase Max for most processors. However, it can vary by up to +/-15C for some processor models.
http://support.amd.com/us/Processor_TechDocs/43375.pdf

So, we know that Tcase Max is the maximum operating temperature for an AMD CPU, as measured on the CPU casing (aka integrated heat spreader). We know that tctl is a totally arbitrary, non-physical value for the current CPU temperature. And we know that Tctl Max is typically equal to Tcase Max, the totally real, physical max temp of the CPU.

It's clear as mud to me.
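
Read literally, the quoted AMD definition is just a signed distance from Tctl_max. A minimal Python sketch of that reading follows. The function name is hypothetical, and the Tctl_max of 70.0 used in the example is a placeholder, since the real value comes from the Power and Thermal Datasheet for the specific processor.

```python
TCTL_RANGE_MAX = 255.875  # upper end of the Tctl scale in the quoted text


def tctl_margin(tctl, tctl_max):
    """Return (degrees, "under" | "over") relative to Tctl_max.

    Follows the two cases in the quoted definition:
      Tctl below Tctl_max -> [Tctl_max - Tctl] degrees under
      Tctl at or above it -> [Tctl - Tctl_max] degrees over
    """
    if not 0 <= tctl <= TCTL_RANGE_MAX:
        raise ValueError("Tctl outside the 0 to 255.875 scale")
    if tctl < tctl_max:
        return tctl_max - tctl, "under"
    return tctl - tctl_max, "over"


# Placeholder Tctl_max of 70.0, an assumption and not a datasheet value.
print(tctl_margin(45.0, 70.0))
print(tctl_margin(71.0, 70.0))
```

Note that this only tells you how far the reading sits from the control threshold. By AMD's own wording it says nothing about the physical die or case temperature.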

From a quick look at these documents ( especially the first one ) I still have no idea whether the sensor is somewhere in the CPU itself or in the motherboard. The core temp issue is somewhat sketchy, too.

Frankly, it's almost impossible to figure this kind of info out. Unless you have extensive knowledge about electronics used in CPUs and know quite a bit about how CPUs work ( in terms of hardware operation ) and are willing to take one of these apart to take a look inside, then we're not very likely to get a proper answer.

What's killing me is that asking AMD or Gigabyte won't help. Ask 2 different people and you'll get 2 different answers. Both could be incorrect, too. You'd have to get your hands on one of the engineers that worked on Phenom II sensors to be sure.

Bottom line, as I've stated previously, with that case and that cooler at stock settings nothing will be overheating. If anything, you could try reseating the heatsink to be sure it's mounted properly. I doubt it's that but it's the easiest thing to check.

Re: Help my understanding please
« Reply #7 on: March 22, 2011, 12:51:05 am »
Yes, color me confused also. Those links are some of the same docs I have been reading. The core temps for the AMD Phenom II processors revision RB-C3 (aka 125W TDP) seem to be based on a formula described as VALUE/8. What VALUE might be is apparently known only to AMD. As you said, with my case air flow directionality and the high air flow and cooler real estate, I am really not too worried about overheating the CPU. The IR temp gun is my sanity check also. My finger tells me mostly what I need to know. As I said, I've been working on electronics for over 30 years ..... if you can keep your hand or finger or upper lip on an electronic device indefinitely, it is running well within thermal limits and should live forever if it gets past the usual 30 day infant mortality syndrome and doesn't suffer any unusual temp or voltage events. I was just really curious about the behavior I observe with my distributed computing workload. I think I will next post this same question on the forums for the distributed computing projects I belong to, particularly MilkyWay@Home since it is that work that shows the temp discrepancy the most. Thanks for your help.
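
One reading of VALUE/8 that fits the AMD text quoted earlier: the Tctl scale runs from 0 to 255.875 in 0.125 degree steps, which is exactly what you get by dividing an 11-bit count (0 to 2047) by 8. The Python sketch below works under that assumption only. The function name is hypothetical, and how a monitoring tool actually obtains the raw count is not covered here.

```python
RAW_MAX = 2047  # largest 11-bit count, assumed from the 255.875 upper limit


def raw_to_tctl(raw):
    """Convert an assumed raw sensor count to a Tctl value (count / 8)."""
    if not 0 <= raw <= RAW_MAX:
        raise ValueError("raw count outside the assumed 11-bit range")
    return raw / 8  # each count is 0.125 on the Tctl scale


print(raw_to_tctl(360))      # a count of 360 maps to 45.0
print(raw_to_tctl(RAW_MAX))  # top of the scale, 255.875
```

The result is still Tctl, so it carries the same caveat as the AMD document: it is a control value, not a physical temperature.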

Cheers,  Keith  8)

bytheway_r

Re: Help my understanding please
« Reply #8 on: March 22, 2011, 01:07:48 am »
We have the same approach, then ;D. Kind of why I'm starting to wonder about heatpipe connected motherboard heatsinks. From what my fingers are telling me, my NB temp went down compared to budget boards but the heatsink covering my mosfets is quite warm. Honestly, I think that heat from the NB is being transferred to mosfet heatsink causing them to run hotter than ones without a heatsink at all. Which would make my day what with people telling you on overclocking forums that you'll blow your mosfets if you don't have a heatsink on them.

Re: Help my understanding please
« Reply #9 on: March 22, 2011, 06:36:32 am »
Interesting observation on your part. My motherboard doesn't have any kind of heat sinks on the mosfets. There is a poorly shaped heat sink on the northbridge with a gaudy Gigabyte badge on the top of the fins that certainly obstructs air flow through the heat sink. I can easily put my fingers on the mosfets and hold them there without discomfort of any kind. The hottest thing on the motherboard is the northbridge heat sink base that is in contact with the chip. It is definitely hotter than anything else, at 55 degrees C or about 130 degrees F, and I can still hold my finger on the heat sink as long as necessary. With my case and fans I easily flow enough air around the surface of the motherboard near the mosfet area. Also the northbridge heat sink is directly in front of the CPU cooler fan air intake and benefits from the air flow through the area. After first seeing the high northbridge temp readout, I second guessed myself, thinking I might have made a better choice in motherboards and should have chosen one with heatpipe coolers. I've looked around and now think I made a pretty good decision based on the feature set that my board offers. Lots of fan headers made it easy to hook up all my case fans and CPU fans without adapters. I could have gone with an 890FX chipset in order to get two PCIe 2.0 x16 channels, something desirable for SLI or Crossfire fans, but I don't really lose much throughput running my older GPU in an x4 channel. I don't see much difference in processing times for my distributed computing tasks.  ;D

Keith

Dark Mantis

Re: Help my understanding please
« Reply #10 on: March 22, 2011, 03:50:47 pm »
Yes, I quite agree. Often when replacing the stock cooler with an aftermarket version the CPU gets a nice drop in temperature but the surrounding chipsets actually get hotter. I am watercooling my northbridge along with the rest of the board and GPU but it has the same effect on the mosfets and they were getting quite hot. I made and installed a dedicated fan just to cool them. You can see it in my photos here:

http://forum.giga-byte.co.uk/index.php/topic,2373.0.html
Gigabyte X58A-UD7
i7 920
Dominators 1600 x6 12GB
6970 2GB
HX850
256GB SSD, Sam 1TB, WDB320GB
Blu-Ray
HAF 932

Gigabyte Z68X-UD5-B3
i7 3770K
Vengeance 1600 16GB
6950 2GB
HCP1200W
Revo Drive x2, 1.5TB WDB RAID0
16x DLRW
StrikeX S7
Full water cooling
3 x 27" Iiy

Re: Help my understanding please
« Reply #11 on: March 22, 2011, 07:44:51 pm »
Dark Mantis, that's a nice looking rig! I don't think I've stumbled upon that specific water-cooling product before and will have to look it up. Thanks for the link. ;D

Dark Mantis

Re: Help my understanding please: GA-870A-UD3
« Reply #12 on: March 22, 2011, 08:33:38 pm »
Thanks Keith, I'm glad you like it but it is in a constant state of flux and is always changing. I have a couple more bits that I am reviewing at the moment and will then be fitting into the loop.  8)
« Last Edit: March 23, 2011, 01:01:05 pm by runn3R »

Re: Help my understanding please: GA-870A-UD3
« Reply #13 on: March 23, 2011, 08:24:53 pm »
Just a quick observation. Since I never really got any kind of definitive answer as to why I observe what I observe with my system, I also posted basically the same observation and question over in the MilkyWay@Home forums. Mainly I asked what others observe when processing the kind of work that makes my system freak out. Sadly, no responses of any kind so far. I don't know whether that means no one else observes my questionable behavior or whether the question itself is so far off the usual thread content that no one has found it interesting. It might just turn out to be one of those mysteries of the universe that will never get answered to anyone's satisfaction.

Cheers, Keith
 ???

absic

Re: Help my understanding please: GA-870A-UD3
« Reply #14 on: March 24, 2011, 08:58:07 am »
Hi again Keith,

Trying to get to the bottom of these kinds of issues can be really difficult and quite often people just give up on them.

I understand that you have raised a query with GGTS and hopefully they will give you some insight and point you in the right direction. So far, my own digging into this issue hasn't thrown up any practical answers but I will keep looking.