Joined: 01 Aug 2006
Location: Richland, Washington State, USA
Posted: Fri Oct 20, 2017 8:02 pm
Macro Photog wrote:
This is a follow-up to Rik's offer to help me with low utilization and the resulting slow processing of my stacks and, as it turned out, slow everything else. Rik very kindly offered to do a screen share, because the other information I sent him and the tests he suggested had not yielded any clues as to what was wrong with the system. For the screen share he sent me a program that gives a more granular look into the system (CPUID Hardware Monitor) to help us along.
Almost immediately he spotted the issue. My processor was running extremely hot (over 100 °C) and throttling the cores to prevent a thermal meltdown. The culprit was the heat sink and fan (HS/fan) for the CPU. Though it was not visually apparent, the HS/fan had become slightly unseated and was providing the CPU with little to no thermal protection. The HS/fan combo is attached to the motherboard with nylon "pylons" that are splayed open when plastic pins running through the pylon cores are pushed down. The weight of the HS/fan when the system is upright, coupled with the heat of the CPU, may have caused the pylons to deform over time. Also, the thermal compound had become dry and hard.
After the call I re-seated the HS/fan and temps came down to acceptable levels. However, when running PMax with all cores at near 100%, there were extremely brief periods when some of them were being throttled to 99%. To be clear, these periods were likely milliseconds long and the setup would have worked, but I wanted to be able to run PMax full out with no throttling. I purchased an all-in-one liquid cooling kit and installed it. I am extremely pleased with the result. Running all cores at 100%, no core runs above 54 °C. The old processing time (PMax align and stack), image to image, was approximately 31 seconds (211 MB files). The processing time with liquid cooling is approximately 6 seconds. Faster times may be achieved with some additional tuning in the future.
FYI, the processor is an Intel i7 4770K. Let me know if you want any other system specs.
I want to thank Rik for his time and expertise. Without him I would still be slogging along trying to resolve the issue with Resource Monitor or, worse, spending a lot of money on a new system to solve it.
I'm glad I could help!
Adding some information to Macro Photog's account...
Of course the root cause in this case was the cooling problem. But the diagnosis was made much harder by some behavior of Windows Resource Monitor (WRM) that I completely did not expect.
Consider these displays (cropped from a WRM window that appeared in the TeamViewer session that I recorded and reviewed using Camtasia):
The implication seems obvious: each processor is spending a lot of time waiting on something. But what?
On the other hand, the CPUID HWMonitor utility (https://www.cpuid.com/softwares/hwmonitor.html) shows the following picture for about the same situation.
(The columns are current value, followed by min and max seen in this HWMonitor session.)
Aha! In fact the processors are not waiting for any events. They're running almost flat out at 95-98% utilization. The problem is that their clocks are only running at 1/4 the speed they ought to be. And that in turn is caused by the processors throttling their own clocks in a successful effort to avoid self-destruction by overheating.
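To make the arithmetic concrete, here is a minimal sketch of how a busy core can look lightly used. It assumes, purely for illustration, that the WRM figure is the busy fraction scaled by the ratio of current clock to nominal clock. That model and the function name `displayed_utilization` are my assumptions, not documented Windows behavior.

```python
def displayed_utilization(busy_fraction: float, clock_ratio: float) -> float:
    """Hypothetical model: utilization as a share of full-speed capacity.

    busy_fraction: share of time the core is actually executing (0..1).
    clock_ratio:   current clock divided by nominal clock (> 0).
    """
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    if clock_ratio <= 0.0:
        raise ValueError("clock_ratio must be positive")
    return busy_fraction * clock_ratio


if __name__ == "__main__":
    # Cores 95-98% busy (as HWMonitor showed) with clocks at 1/4 speed.
    for busy in (0.95, 0.98):
        shown = displayed_utilization(busy, 0.25)
        print(f"busy {busy:.0%} at 1/4 clock -> displayed as about {shown:.0%}")
```

Under that assumed model, a core that never idles would still be reported at roughly a quarter of its capacity, which is easy to misread as a core that is waiting on something.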
The thing I am most annoyed about, now that I understand the problem, is that Windows Resource Monitor clearly knew that the low "utilization" was due to the clocks running slow, but nothing in the system raised an alarm about that. If we carefully turn our eyes to the right place in the WRM window, we can see the key bit of info:
But still no indication about why the clocks are slow.
Oh well, now we know. One more thing to put on the standard list of things to check.
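For anyone who wants that check written down, here is a rough sketch of the rule of thumb, using readings typed in by hand from a tool like HWMonitor. The thresholds, the `CoreReading` type, and the 3500 MHz nominal clock are illustrative assumptions, not values taken from the session above. Only the temperatures (over 100 °C before, 54 °C after) and the 1/4 clock ratio come from this thread.

```python
from dataclasses import dataclass


@dataclass
class CoreReading:
    """One core's readings, copied by hand from a monitoring tool."""
    temp_c: float         # core temperature in degrees Celsius
    clock_mhz: float      # current core clock
    busy_fraction: float  # share of time the core is executing (0..1)


def throttling_suspected(reading: CoreReading, nominal_mhz: float,
                         hot_c: float = 90.0, slow_ratio: float = 0.5,
                         busy_min: float = 0.9) -> bool:
    """Flag the pattern seen here: hot, busy, and clocked far below nominal.

    All three thresholds are assumed defaults; adjust them for your CPU.
    """
    hot = reading.temp_c >= hot_c
    slow = reading.clock_mhz <= slow_ratio * nominal_mhz
    busy = reading.busy_fraction >= busy_min
    return hot and slow and busy


if __name__ == "__main__":
    nominal = 3500.0  # assumed nominal clock in MHz, for illustration only
    before = CoreReading(temp_c=100.0, clock_mhz=nominal / 4, busy_fraction=0.96)
    after = CoreReading(temp_c=54.0, clock_mhz=nominal, busy_fraction=1.0)
    print("before fix:", throttling_suspected(before, nominal))
    print("after fix: ", throttling_suspected(after, nominal))
```

Requiring all three conditions together is deliberate: a core that is slow but cool and idle is probably just saving power, not protecting itself.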