- The problem was solved with an iDRAC firmware update provided by Dell (contact Dell to get the right version)
- The odds that you, the reader, are affected are extremely low unless you have a PowerEdge M610 with dual Intel Xeon X5650 Westmere processors, six 4GB quad-ranked 1066 MHz DDR3 RDIMMs, no mezzanine cards, and only one 10K SAS hard disk. Even then, there may be another mitigating factor that ensures you aren't affected.
- If you think you are affected, Dell support should be able to quickly tell you whether that is the case
- We consider the issue to be resolved
The initial blaming of Westmere
We had initially thought that Intel processors were to blame because of a flaw in either the Westmere processor or its documentation (see "Contradictions" in Flaws with Intel Westmere X5650?). The processor kept claiming that it was throttled due to being over temperature. This was definitely not true, which seemed to indicate a processor flaw. As far as I could determine, the possible conclusions were that (a) the processor was flawed in that it throttled itself because it thought it was over temperature when it was not, (b) the processor was throttled for other reasons but reported the cause as temperature, or (c) the documentation was completely incorrect for the values I read.
We have ruled out (a). That leaves (b) and (c): if (c) is false then (b) must be true, and if (c) is true then (b) may or may not also be true. In simpler terms, there is a minor bug in the processor's performance counters and/or in its documentation. That bug only seems to affect statistics and not function (which I wish I had known months ago...).
So there was a flaw with Westmere (or the documentation for Westmere). It just turned out to be a minor bug that made us think the other problem was also a Westmere problem. As of now, I haven't gotten the bug confirmed, but I have seen nothing that even hints that my conclusion is wrong.
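For readers who want to check a similar "throttled due to temperature" claim on their own hardware, here is a minimal sketch in Python. It rests on assumptions: this post does not say which registers or counters were actually read, so the sketch uses the IA32_THERM_STATUS model-specific register (0x19C) as one place Intel processors report thermal state, and it reads it through the Linux `msr` driver. The function names `decode_therm_status` and `read_msr` are hypothetical helpers made up for this illustration.

```python
import os
import struct

# Assumption: the post does not name the register that was read. This uses
# IA32_THERM_STATUS (MSR 0x19C) with the bit layout given in the Intel SDM.
IA32_THERM_STATUS = 0x19C


def decode_therm_status(value: int) -> dict:
    """Split a raw IA32_THERM_STATUS value into its thermal fields."""
    return {
        "thermal_status": bool(value & (1 << 0)),      # over temperature right now
        "thermal_status_log": bool(value & (1 << 1)),  # sticky: set since last cleared
        "prochot_event": bool(value & (1 << 2)),       # PROCHOT#/FORCEPR# asserted now
        "prochot_log": bool(value & (1 << 3)),         # sticky log of the above
        "critical_temp": bool(value & (1 << 4)),
        "critical_temp_log": bool(value & (1 << 5)),
        "reading_valid": bool(value & (1 << 31)),
        # Digital readout: degrees C below the throttle point (TjMax), bits 22:16.
        "degrees_below_tjmax": (value >> 16) & 0x7F,
    }


def read_msr(cpu: int, msr: int) -> int:
    """Read a 64-bit MSR on Linux (needs root and the `msr` kernel module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Synthetic value for illustration only, not data from the affected machines.
    sample = (1 << 31) | (40 << 16) | (1 << 1)
    print(decode_therm_status(sample))
```

On a real machine you would call `decode_therm_status(read_msr(0, IA32_THERM_STATUS))` as root. A sticky log bit that is set while the live status bit is clear and the readout shows a comfortable margin below TjMax is the kind of mismatch worth comparing against the documentation before blaming the processor.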
Fixed with a firmware upgrade
Without going into too much detail, there was a problem in how our very specific hardware configuration interacted with the iDRAC, and it came down to the iDRAC firmware we were running at the time. Dell has since fixed the bug and provided updated firmware. We hit a very small corner case which they now know to check for.
The problem is now completely fixed as far as we can tell. If you think you are experiencing problems due to this same issue, contact Dell support. They should be able to quickly determine if you are affected and get you the right iDRAC build. It is extremely unlikely that you are affected by this.