Brendan Long writes: "I deal with a lot of servers at work, and one thing everyone wants to know about their servers is how close they are to being at max utilization. It should be easy, right? […]
And yet, whenever people actually try to project these numbers, they find that CPU utilization doesn't quite increase linearly. But how bad could it possibly be?
To answer this question, I ran a bunch of stress tests and monitored both how much work they did and what the system-reported CPU utilization was, then graphed the results."
-
2/ ohh, and btw, see also this related post from @brendangregg, which was written a few years ago but is likely still news to some people:
https://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
See also this short video from Brendan about the topic: https://www.youtube.com/watch?v=QkcBASKLyeU
-
@kernellogger What about memory exhaustion and I/O bandwidth oversubscription? I'm a big fan of the "PSI" (Pressure Stall Information) metrics, which come in 3 delicious flavors for good reason:
$ ls /proc/pressure/
cpu io memory
That listing is from:
$ uname -a
Linux schallkreis 6.12.21 #1 SMP PREEMPT_DYNAMIC
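For anyone who has not looked inside those files: each one holds a "some" line and usually a "full" line, with stall percentages averaged over 10, 60 and 300 seconds plus a cumulative total in microseconds. Below is a minimal Python sketch of reading them. The function name `parse_psi` and the `SAMPLE` numbers are made up for illustration, and the files only exist on kernels built with PSI enabled.

```python
from pathlib import Path
from typing import Dict


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse the contents of one /proc/pressure/<resource> file.

    Returns e.g. {"some": {"avg10": 1.5, ..., "total": 123456.0}, "full": {...}}.
    """
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        # First token is "some" or "full"; the rest are key=value pairs.
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


# Hypothetical sample contents, invented for illustration only.
SAMPLE = """\
some avg10=1.50 avg60=0.75 avg300=0.20 total=123456
full avg10=0.50 avg60=0.25 avg300=0.10 total=65432
"""

if __name__ == "__main__":
    for resource in ("cpu", "io", "memory"):
        path = Path("/proc/pressure") / resource
        try:
            print(resource, parse_psi(path.read_text()))
        except OSError as exc:
            # Missing file, PSI disabled at boot, or insufficient permissions.
            print(resource, "unavailable:", exc)
```

A rising "some" avg10 means at least one task was recently stalled waiting on that resource; "full" means all non-idle tasks were stalled at once.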
-
@kernellogger @brendangregg
Getting back to the previous comment about I/O- and memory-related failures, note that Gregg's _System Performance_ has multiple chapters on these topics, and on networking too.
For example, a fantastic talk at the Southern California Linux Expo a few years ago by Frits Hoogland (https://www.socallinuxexpo.org/scale/19x/speakers/frits-hoogland) highlighted how low memory can limit writeback of dirty pages and make a system very slow indeed.
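One rough way to keep an eye on dirty pages is the `Dirty` and `Writeback` fields in `/proc/meminfo`, both reported in kB. This is only a sketch and is not taken from the talk; the helper name `parse_meminfo_kb` and the `SAMPLE` values are invented for illustration.

```python
from pathlib import Path
from typing import Dict, Iterable


def parse_meminfo_kb(text: str,
                     keys: Iterable[str] = ("Dirty", "Writeback")) -> Dict[str, int]:
    """Extract selected fields (in kB) from /proc/meminfo-style text."""
    wanted = set(keys)
    values: Dict[str, int] = {}
    for line in text.splitlines():
        # Lines look like "Dirty:              1234 kB".
        name, _, rest = line.partition(":")
        if name in wanted and rest.split():
            values[name] = int(rest.split()[0])
    return values


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
MemTotal:       16384000 kB
Dirty:              1234 kB
Writeback:            56 kB
WritebackTmp:          0 kB
"""

if __name__ == "__main__":
    try:
        print(parse_meminfo_kb(Path("/proc/meminfo").read_text()))
    except OSError as exc:
        # Not on Linux, or /proc is not mounted.
        print("meminfo unavailable:", exc)
```

Sampling these values every few seconds while the system is under memory pressure shows whether dirty data is actually draining to disk or piling up.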