So I turned on the monitor screen for my server today and saw this message on the login screen:
kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 3851 )
kernel: CPU0: Temperature / speed normal
The only thing that came to my mind was "what the hell...??" I monitor my hard drive temperatures religiously, but because I can never get CPU temperature monitoring to work right, I never bother with it (with my pick-of-the-litter luck I always seem to get hardware that isn't supported). So I just keep my systems well cooled and I always set the BIOS option of "when temp gets above XX degrees C turn it off".
When I saw this message, I was quite shocked. So I did some digging. Sure enough /var/log/kern.log has some more info (it throttled and then 5 minutes later returned to normal) but nothing about how hot the temperature was nor about where I could get such info. So off to Google. It turns out that the Linux kernel monitors such things for you now (learn something new every day). However, I can't find out where such information would be stored.
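For anyone who wants to pull the same events out of their own log, here is a rough Python sketch that picks apart the two message formats quoted above. The function name parse_throttle_events is just something I made up for illustration, and the patterns only assume the exact wording shown in my log; other kernel versions may word it differently.

```python
import re

# Patterns modelled on the two kernel messages quoted in this post.
# Assumption: other kernels may phrase these messages differently.
THROTTLED = re.compile(
    r"CPU(\d+): Temperature above threshold, cpu clock throttled "
    r"\(total events\s*=\s*(\d+)\s*\)"
)
NORMAL = re.compile(r"CPU(\d+): Temperature\s*/\s*speed normal")


def parse_throttle_events(lines):
    """Return a list of (kind, cpu, total_events) tuples.

    kind is "throttled" or "normal"; total_events is None for
    "normal" lines because that message carries no counter.
    """
    events = []
    for line in lines:
        match = THROTTLED.search(line)
        if match:
            events.append(("throttled", int(match.group(1)), int(match.group(2))))
            continue
        match = NORMAL.search(line)
        if match:
            events.append(("normal", int(match.group(1)), None))
    return events


# The two lines from my login screen, used here as sample input.
sample = [
    "kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 3851 )",
    "kernel: CPU0: Temperature / speed normal",
]
for kind, cpu, total in parse_throttle_events(sample):
    print(kind, cpu, total)
```

To run it against the real log, pass an open file instead of the sample list, e.g. parse_throttle_events(open("/var/log/kern.log")). It only tells you when and how often the throttle kicked in, not how hot the CPU actually got.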
I dug around in /proc/acpi and it did have some information about the CPU but nothing about the temperature. The processor is a P4 2.2 on a standard Intel motherboard, and as far as I can tell there is nothing special about it. So I caved and started installing more programs. (Hooray! Just what you want to do with your server, kids! Installing random packages helps you clutter your drive, which is important for servers. Remember, if it's not broke, you're not trying!)
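For reference, this is roughly the kind of thing I was hoping to find under /proc/acpi. It is only a sketch, and it assumes a board whose ACPI tables expose thermal zones as /proc/acpi/thermal_zone/<zone>/temperature files containing a line like "temperature: 45 C", which mine apparently does not. The function names are my own, not part of any tool.

```python
import glob
import os
import re


def parse_acpi_temperature(text):
    """Extract the integer degrees C from a 'temperature:  45 C' line."""
    match = re.search(r"temperature:\s*(-?\d+)\s*C", text)
    return int(match.group(1)) if match else None


def read_thermal_zones(root="/proc/acpi/thermal_zone"):
    """Map each zone name under root to its temperature (or None).

    Assumption: zones appear as <root>/<zone>/temperature. Returns an
    empty dict when the directory is missing or empty.
    """
    readings = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "temperature"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as handle:
                readings[zone] = parse_acpi_temperature(handle.read())
        except OSError:
            # Unreadable zone file: record it but carry on.
            readings[zone] = None
    return readings


print(read_thermal_zones() or "no ACPI thermal zones found")
```

If this prints "no ACPI thermal zones found", the board simply isn't reporting any zones through ACPI, which matches what I saw when poking around by hand.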
The BIOS sees the fans, knows their speed, and has the three temperature zones listed. It shouldn't be too difficult to just read what it is already capturing right?
First up: lm-sensors. Did the install, configured and compiled the modules, then modprobe'd the modules and ran it! ...or not... because even though it finds 1 of the 6 fans in the case, it doesn't know what to do with it, and of course my motherboard is not supported for anything else...
Next up: MBmon. This is the one program that I have had the best results with in the past... and once again my motherboard is not recognized.
OK, let's try gkrellm. Can we guess the results? Not supported. Dah...
OK, if the kernel can figure out that the temp is too hot, why can't at least one of the three more popular programs for monitoring CPU temperatures figure it out? More importantly, does anyone know how to get this info out of the kernel? It has apparently figured it out but just isn't saying (or I am looking in the wrong spot).
OK, well, it's dinner time, so I will start searching Google again when I get back. In the meantime, if anyone has any ideas, please let me know.
Thanks!
~Stack~