|
Greetings Kind Regards
May I please inquire of your kind selves, who of course are much more knowledgeable than myself, as to the meaning of the CPU utilization percentage reported by Task Manager. I wish to know what it is in fact a measure of. All the answers I have found online are along the lines of "The percentage you see (e.g., 25%) represents the proportion of the CPU's capacity that is actively in use. It indicates how much of the CPU's processing power is currently allocated to running tasks and processes." A web search for the term "capacity" seems to equate it with speed, id est bits or perhaps instructions per second, and similarly for "processing power".
I am motivated to this inquiry because the percentage reported for my Intel i7 is usually in the mid 60s, even whilst performing several simultaneous tasks, exempli gratia a build and two console programs each running random testing.
Other than not knowing what in fact is being measured, I do not understand why the CPU would not run at 100%, as I assume doing so would complete assigned tasks sooner rather than later. Perhaps it is a matter of thermal throttling; my temperatures are typically in the mid 60s.
So, to paraphrase the inquiry: if the reported percentage is, exempli gratia, 50, does this mean the CPU is sitting idle half the time? If so, why?
Thank You Kindly
|
|
|
|
|
In large part, because your computer has multiple cores.
You cannot just break a task down to use all cores.
Add 2 + 2 using two threads to divide the task between two cores. You can't.
So if you add 2+2, your CPU will not spike to 100%. With two cores, the best a single-threaded task can show is 50%; with 4 cores, 25%; and so on.
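To make that concrete, here is a rough C++ sketch (mine, not from this thread, and assuming a standard C++11-or-later toolchain): one spinning thread can only ever keep one core busy, so overall utilization tops out near 100/N percent until the work is split across more threads.

#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

int main()
{
    std::atomic<bool> stop{false};

    // A single busy thread: on a 4-core machine this alone shows roughly 25%.
    std::thread one([&] { while (!stop) { /* spin */ } });

    // To push overall utilization higher, the work itself has to be divisible
    // so that every core gets its own thread.
    unsigned n = std::thread::hardware_concurrency();
    std::vector<std::thread> many;
    for (unsigned i = 1; i < n; ++i)
        many.emplace_back([&] { while (!stop) { /* spin */ } });

    std::this_thread::sleep_for(std::chrono::seconds(10)); // watch Task Manager here
    stop = true;
    one.join();
    for (auto& t : many) t.join();
}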
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
At this moment Task Manager is reporting:
Utilization: 23%
Speed: 3.84 GHz
Base speed: 3.41 GHz
Cores: 4
Logical processors: 8
Processes: 356
Threads: 4960
Handles: 189319
The graph for each of the eight logical processors shows more or less the same 23%.
With so much to do, why is the CPU deciding to utilize only 23% of itself?
|
|
|
|
|
Because it is performing tasks that cannot be broken across multiple cores.
You have 4 cores, so for each task that cannot be broken up, the most it can use is 25%.
|
|
|
|
|
Thank You Kindly for your assistance. At this moment Utilization is reported as 54%.
|
|
|
|
|
In simple terms:
There's always a ton of stuff going on with modern OSes; even those who designed them would have a hard time keeping it all in their heads, and probably couldn't present a complete and accurate picture of what might be going on at any given time.
With that out of the way:
What problem are you trying to resolve? Does the exact CPU utilization figure bother you?
|
|
|
|
|
Thank you for your kind reply. I am not attempting to solve any problem in particular; I merely wish to learn. In particular, I am curious why CPU utilization is so low, as I am inclined to naively assume it is more beneficial for the user for the CPU to utilize 100% of its resources, so as to complete its tasks sooner rather than later. Would that not be the case?
-Cheerios
|
|
|
|
|
You're not entirely wrong...but it's all about spreading the load.
Back in the days of single-core systems, it was very common to see Task Manager show a single task/process using up 100% of the CPU. Windows went out of its way to remain responsive, but you knew that when this happened - for more than a few seconds at a time - the system started to get wonky. That's a technical term.
With 2 CPUs, a single-threaded process using up all the processing power it can get will be shown as using 50% of the processor. You don't get the task split up across both CPUs "for free", it has to be coded to take advantage of multi-threading. That being said, if a process can spawn 2 threads, each hitting the CPU as hard as it can, you might see CPU usage go beyond that 50% and closer to 100% (but it'll hardly ever remain exactly at 100% no matter what).
The same pattern repeats for systems with 4 or 8+ cores. With 4 cores, a single-threaded process hitting the CPU hard will show 25% usage; with 8 cores, 12.5%. If the process starts spawning more threads, then you'll see the CPU usage climb.
I'm oversimplifying, but I generally find most people are happy with this sort of explanation and leave it at that.
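As a concrete illustration of that climb (my own sketch, not anything posted in the thread; the sizes and the burn() helper are arbitrary), the same CPU-bound loop is timed once on a single thread and once split across hardware_concurrency() threads with std::async:

#include <chrono>
#include <cstdint>
#include <future>
#include <iostream>
#include <thread>
#include <vector>

// Arbitrary CPU-bound busywork over a half-open range.
static std::uint64_t burn(std::uint64_t from, std::uint64_t to)
{
    std::uint64_t s = 0;
    for (std::uint64_t i = from; i < to; ++i) s += i * i % 7;
    return s;
}

int main()
{
    const std::uint64_t n = 400'000'000;

    auto t0 = std::chrono::steady_clock::now();
    volatile std::uint64_t single = burn(0, n);             // one core busy: ~25% on 4 cores
    auto t1 = std::chrono::steady_clock::now();

    const unsigned hc = std::thread::hardware_concurrency();
    const unsigned workers = hc ? hc : 1;
    std::vector<std::future<std::uint64_t>> parts;
    for (unsigned w = 0; w < workers; ++w)                   // all cores busy: closer to 100%
        parts.push_back(std::async(std::launch::async, burn,
                                   n * w / workers, n * (w + 1) / workers));
    std::uint64_t total = 0;
    for (auto& f : parts) total += f.get();
    auto t2 = std::chrono::steady_clock::now();

    std::cout << "single thread: " << std::chrono::duration<double>(t1 - t0).count() << " s\n"
              << "split threads: " << std::chrono::duration<double>(t2 - t1).count() << " s\n";
    (void)single; (void)total;
}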
|
|
|
|
|
Thank you kindly. I do not fall in the happy group. If the machine were dealing with no more processes than cores, then of course it would be easy to be so happy. However, such is not the case: there are tens, hundreds, perhaps even thousands of processes on my 4-core machine.
-Kind Regards
|
|
|
|
|
Right...but you won't ever get a single core dedicated to a single process, even if you could get a 1:1 process-to-core ratio.
And not all running processes have something to do all the time, so even though you have more processes than cores, and each core gets to do some work that a process needs to have taken care of...you'll hardly ever see all cores busy at 100%.
In theory, if you were to add up the CPU utilization from all processes at a given time, plus the System Idle Process entry shown in Task Manager...you might get close to 100%.
Are you thinking about this in terms of "caching is good, because unused memory is wasted memory"? I'd say it's not so with CPUs...pushing CPUs to their limit, all the time, would draw a lot of power, and generate a lot of heat.
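If it helps to see where a number like that comes from, here is a rough sketch of the arithmetic (mine, not the poster's; it assumes Windows and the documented GetSystemTimes() call). Kernel time already includes idle time, so the busy fraction is one minus idle over (kernel + user):

#include <windows.h>
#include <chrono>
#include <iostream>
#include <thread>

// Convert a FILETIME (100-nanosecond units) to a plain 64-bit tick count.
static unsigned long long toTicks(const FILETIME& ft)
{
    ULARGE_INTEGER u;
    u.LowPart  = ft.dwLowDateTime;
    u.HighPart = ft.dwHighDateTime;
    return u.QuadPart;
}

int main()
{
    FILETIME idle0, kern0, user0, idle1, kern1, user1;
    GetSystemTimes(&idle0, &kern0, &user0);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    GetSystemTimes(&idle1, &kern1, &user1);

    auto idle  = toTicks(idle1) - toTicks(idle0);
    auto total = (toTicks(kern1) - toTicks(kern0)) + (toTicks(user1) - toTicks(user0));
    std::cout << "overall CPU utilization ~ " << 100.0 * (total - idle) / total << "%\n";
}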
|
|
|
|
|
Yes, I am thinking in terms of "if 23% is good, 99.999% is better." I am aware of thermal limitations but do not know if such is the reason for the 23%. Should I assume the CPU is twiddling its thumbs the remaining 77% of the time? As for my system, Task Manager at this moment shows 0% for almost all processes, and a quick guess of the sum of the others matches what is reported, id est roughly 30%.
Kind Regards
|
|
|
|
|
BernardIE5317 wrote: Yes I am thinking in terms of "if 23% is good 99.999% is better."
That's describing a 4-core system where one process is running a single core at 100%, and whoever wrote that program didn't bother to write it as multi-threaded. It may or may not have been possible to do so. Or maybe it was decided it was just not worth it, given the overall time expected for the task to complete versus the complexity involved in writing a well-behaved multi-threaded application.
|
|
|
|
|
Software is hard. Good software is harder. Good parallel software is harder than that.
If you want high CPU loads that mostly represent useful work, you have to do a lot of work to figure out how to avoid system calls, memory allocation, and even random accesses into gigantic memory maps, as well as how to communicate efficiently among the threads. Any inefficiency in any of these areas can slow down processing to the extent that the multiple threads don't proceed much faster than a single thread would.
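A small C++ sketch of that last point (illustrative only, not code from the thread): both loops below compute the same sum on the same number of threads, but the first funnels every update through one shared atomic, so the threads spend their time fighting over a cache line instead of doing useful work; the second keeps each thread on its own accumulator and only merges at the end.

#include <atomic>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

int main()
{
    const std::uint64_t n = 100'000'000;
    const unsigned hc = std::thread::hardware_concurrency();
    const unsigned workers = hc ? hc : 1;

    // Contended: every increment is a read-modify-write on shared state.
    std::atomic<std::uint64_t> shared{0};
    {
        std::vector<std::thread> ts;
        for (unsigned w = 0; w < workers; ++w)
            ts.emplace_back([&, w] {
                for (std::uint64_t i = w; i < n; i += workers)
                    shared.fetch_add(i % 3, std::memory_order_relaxed);
            });
        for (auto& t : ts) t.join();
    }

    // Scalable: each thread works on private data; the only sharing is the final merge.
    std::vector<std::uint64_t> partial(workers, 0);
    {
        std::vector<std::thread> ts;
        for (unsigned w = 0; w < workers; ++w)
            ts.emplace_back([&, w] {
                std::uint64_t local = 0;
                for (std::uint64_t i = w; i < n; i += workers) local += i % 3;
                partial[w] = local;
            });
        for (auto& t : ts) t.join();
    }
    const std::uint64_t combined = std::accumulate(partial.begin(), partial.end(), std::uint64_t{0});
    (void)combined; // both runs produce the same sum; only the scaling differs
}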
|
|
|
|
|
Indeed. That's why I've been saying all along that making some software multi-threaded isn't something you get for free, and many developers will forego the benefits unless there are significant, measurable gains to be had.
In other words...stop worrying about processes not pinning your CPU at 100%. In fact, that is when you should start worrying about what's going on...
|
|
|
|
|
Many operations on the motherboard do not require direct CPU involvement: reading and writing disk files (HD drives have rotational delays; SSDs have bandwidth limitations depending on whether you are reading or writing), pulling data from the Internet, and moving data to and from your video card. Many of these use hardware DMA (Direct Memory Access) to move the data, and the CPU sits idle while the transfers take place. Another thing that can leave the CPU idle is exceeding your physical memory and starting to use virtual memory: the operating system then gets involved in swapping data between disk and memory to give you the illusion of more main memory. Some opcode-level instructions also don't like it when the data they reference is beyond a certain physical distance; this can stall the CPU's pre-execution decoding and flush opcodes that have already been decoded and queued in the execution pipeline. The same goes for generated code with an abundance of branches: branch prediction can falter and cause CPU stalls as the pipeline has to be reloaded with opcodes from the target location.
It depends on what your computer is processing.
Are you experiencing slow response (stuttering pointer movement, keyboard lag)?
That's all I can think of off the top of my head; I am sure there are more reasons for low CPU utilization.
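To put a number on the branch-prediction part, here is the classic demonstration in C++ (an illustrative sketch, not from this post): the loop does the same work either way, but once the data is sorted the branch becomes predictable and the run gets noticeably faster because the pipeline stops being flushed.

#include <algorithm>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// Time a conditional sum over the vector and return elapsed seconds.
static double timeSum(const std::vector<int>& v)
{
    const auto t0 = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (int x : v)
        if (x >= 128) sum += x;   // hard to predict on random data, trivial on sorted data
    const auto t1 = std::chrono::steady_clock::now();
    volatile std::uint64_t sink = sum; (void)sink;
    return std::chrono::duration<double>(t1 - t0).count();
}

int main()
{
    std::vector<int> data(50'000'000);
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> dist(0, 255);
    for (auto& x : data) x = dist(rng);

    const double unsortedTime = timeSum(data);
    std::sort(data.begin(), data.end());
    const double sortedTime = timeSum(data);
    std::cout << "unsorted: " << unsortedTime << " s, sorted: " << sortedTime << " s\n";
}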
|
|
|
|
|
honey the codewitch wrote: Add 2 + 2 using two threads to divide the task between two cores. You can't.
What about:
1+1=A
1+1=B
A+B=Answer
(Seems like a joke, but I worked with a VP who seemed to think throwing threads at a problem would always solve everything.)
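For what it's worth, the joke does compile; a toy C++ sketch of it (obviously not something you would ever do, since spawning and joining the threads costs vastly more than the addition itself, which is rather the point):

#include <future>
#include <iostream>

int main()
{
    // 1 + 1 on one thread, 1 + 1 on another, then combine the halves.
    auto a = std::async(std::launch::async, [] { return 1 + 1; });
    auto b = std::async(std::launch::async, [] { return 1 + 1; });
    std::cout << "2 + 2 = " << a.get() + b.get() << "\n";
}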
|
|
|
|
|
Hehe
|
|
|
|
|
There's another weird bit where some chips can't run all cores at full speed. I think the point is to have some low-power efficiency cores to dedicate to certain kinds of tasks.
Not really sure how common that is; it's a newer deal.
|
|
|
|
|
Actually it's getting more common. Both Intel and ARM chips use multiple different classes of core in their CPUs.
Intel uses two, and calls them "P-cores" (Performance-cores) and "E-cores" (Efficient-cores).
The reason is heat, size and power consumption vs usage habits.
The idea is that people don't use each core the same way. This way you have more powerful cores that kick in when needed, but you can run things off the e-core(s) most of the time.
ARM pioneered it** because phone advancements made it almost necessary. Intel caught on to what ARM was doing and was like "Excellent! I'll take four!"
** they may not technically be the first - i don't know, but they're the first major modern CPU vendor I've seen that does it.
|
|
|
|
|
That all seems to mesh very well with my reality and understandings.
I've just not had the expendable moola to grab new silicon in a bit, and keeping Intel's offerings straight in one's head is an exercise in futility.
Nothing against AMD (though ARM, because of its sordid history with Windows, can kick rocks). I just like Intel because it's literally all I've ever had, and there's a degree of comfort/security (likely a false sense).
I have built machines for others with AMD Ryzen though and they seem to have worked out just fine.
All this does seem to make calculating total processor usage a rather more complex algorithm though, if not a near-useless metric in context?
|
|
|
|
|
Most of those metrics are useless by themselves. As hardware gets more complicated, so do the numbers, and the circumstances we find those numbers presenting themselves in.
I've found if I want to bench a system, I find what other people are using to bench, and then I bench my own using a baseline. The ones I use right now are:
Running Cyberpunk Bench (DLSS and Raytracing benchmark)
Running TimeSpy (General DirectX 12 and CPU gaming perf bench)
And Cinebench R23 - for CPU performance
That won't tell you everything, and the first two of those benches are very gaming oriented and focus on GPU performance. What running them tells me is that my desktop and laptop are pretty comparable at the resolutions I play at on each, but my lappy slightly beats my desktop in multicore performance.
What I'd like is for other people to compile the same large C++ codebase I am, on other machines, which would give me a nice real-world metric for the purpose I built this machine for (C++ compile times).
As it is, I would buy an AMD laptop (power efficiency), but Intel is my go-to for desktops at this point, primarily due to general-purpose single-core performance. My laptop is also an Intel, but if I bought again, I'd have waited for the AMD version of it and got better battery life.
|
|
|
|
|
I can tell you that, if you aren't already, making sure all the related bits ride on an SSD would be one of the biggest things I think might speed up linkage.
|
|
|
|
|
That's why I run two Samsung 990 Pro NVMe drives - fastest on the market.
I also run my RAM at 6000MHz/CL32 - the stock spec for Intel on DDR5 is like 4800 or something.
My CPU on this machine is an i5-13600K. I would have gone with the i9-13900K, but I built this to be an air-cooled system, and 250W was too rich for my blood - at least with this cooler - and this i5 is a sleeper with single-core performance comparable to the i9. I have the i9-13900HX in my laptop - which is basically the same as the desktop version, but retargeted to 180W instead of 250W.
|
|
|
|
|
What you may want to try regarding the RAM...
It's a real PITA, because you'll lock up/bluescreen your machine, but rather than aiming for just the highest clock rate possible, try to tighten the latency timings, or look for some sticks with the most excellent latency timings.
These things tend to be somewhat inversely related (clock speed : CASL/others).
I won't be so upset that I just got a corptop with an i5, then (coming from an i7).
|
|
|
|
|
I don't play the silicon lottery, because I've lost too much time to intermittent RAM failures.
I use an XMP profile because the stick was tested at those timings, and CL32 is pretty tight.
|
|
|
|