Need some tips for tuning the web server

#1
Hi guys,

We are currently using LSWS to power a number of WordPress installations on a 2GB VPS slice. The VPS slice is dedicated to serving LSWS requests (MySQL is served remotely). LSWS is the latest version (4.0.10 Enterprise), and PHP is 5.2.10 with APC and Memcache enabled. The WordPress installations all have W3 Total Cache enabled too.

However, we are still seeing relatively high server load. From 'top', it appears that lsphp5 processes are using quite a bit of CPU power.

top - 00:11:46 up 2 days, 3 min, 1 user, load average: 0.62, 1.00, 1.06
Tasks: 107 total, 1 running, 106 sleeping, 0 stopped, 0 zombie
Cpu(s): 16.8%us, 5.3%sy, 0.0%ni, 73.1%id, 1.2%wa, 0.0%hi, 0.2%si, 3.3%st
Mem: 2097372k total, 978416k used, 1118956k free, 49788k buffers
Swap: 4194296k total, 44k used, 4194252k free, 540348k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24484 apache 19 -1 146m 60m 32m S 48.6 3.0 0:05.46 lsphp5
24494 apache 19 -1 134m 32m 15m S 30.0 1.6 0:00.90 lsphp5
24495 apache 19 -1 142m 45m 20m S 18.0 2.2 0:00.54 lsphp5
24493 apache 19 -1 136m 33m 14m S 10.7 1.6 0:00.32 lsphp5
24496 apache 19 -1 133m 32m 16m S 8.7 1.6 0:00.26 lsphp5
24361 apache 1 -19 20364 5324 1492 S 1.3 0.3 0:03.06 lshttpd
1 root 20 0 10356 652 556 S 0.0 0.0 0:03.92 init
2 root 15 -5 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root RT -5 0 0 0 S 0.0 0.0 0:01.44 migration/0
4 root 15 -5 0 0 0 S 0.0 0.0 0:00.68 ksoftirqd/0
5 root RT -5 0 0 0 S 0.0 0.0 0:01.00 watchdog/0
6 root 15 -5 0 0 0 S 0.0 0.0 0:09.24 events/0
7 root 15 -5 0 0 0 S 0.0 0.0 0:00.06 khelper
18 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 xenwatch
19 root 15 -5 0 0 0 S 0.0 0.0 0:00.12 xenbus
27 root RT -5 0 0 0 S 0.0 0.0 0:00.56 migration/1
28 root 15 -5 0 0 0 S 0.0 0.0 0:00.20 ksoftirqd/1
29 root RT -5 0 0 0 S 0.0 0.0 0:00.52 watchdog/1
30 root 15 -5 0 0 0 S 0.0 0.0 0:09.34 events/1
31 root RT -5 0 0 0 S 0.0 0.0 0:00.72 migration/2
32 root 15 -5 0 0 0 S 0.0 0.0 0:00.24 ksoftirqd/2
33 root RT -5 0 0 0 S 0.0 0.0 0:00.56 watchdog/2
34 root 15 -5 0 0 0 S 0.0 0.0 0:09.14 events/2
35 root RT -5 0 0 0 S 0.0 0.0 0:00.60 migration/3
36 root 15 -5 0 0 0 S 0.0 0.0 0:00.14 ksoftirqd/3
37 root RT -5 0 0 0 S 0.0 0.0 0:00.60 watchdog/3
38 root 15 -5 0 0 0 S 0.0 0.0 0:09.14 events/3
64 root 15 -5 0 0 0 S 0.0 0.0 0:00.50 kblockd/0
Below are some of the configuration parameters:







I've tried a few different tweaks, and nothing I did seemed to bring the CPU usage down. :|

It is also unlikely that the WordPress installations are causing the issue, because these sites were previously hosted elsewhere using Apache + mod_php, and the CPU usage was fine.

Some interesting stuff from the server log:



And the requests/second graph (during this time, server load averages at around 0.8 to 1.3):




Any suggestions?
 

robfrew

Well-Known Member
#3
High CPU usage stats are normal, and they do not affect performance or any other area of the server. We average between 1.8 and 3.5 during busy times, and our server is blazing fast with MySQL, PHP and many other services running. Nothing to worry about with your CPU usage.
 
#4
I see, thanks guys.

The reason for my post is that the posted server load was taken during off-peak time, and the current server only hosts a small portion of our sites. We are planning to migrate our larger sites to the same server in the upcoming weeks, and I was a bit worried that the server might not cope very well once the larger sites are up and running. :|

I guess I'll slowly migrate those sites over and see how it goes.
 

mistwang

LiteSpeed Staff
#5
Max idle time and connection keepalive timeout should not be set. Initial request timeout should be around 60.

What you see is probably normal. LSWS uses a minimum number of persistent PHP processes to serve all requests, so CPU utilization may look high for an individual process, but the overall system-level CPU utilization is lower.
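
To put the per-process numbers in context, here is a minimal Python sketch that sums the %CPU column from the top snapshot in post #1 by command name and scales it by the core count. It assumes top is in its default mode, where 100% equals one full core, and that the machine has 4 cores (inferred from the migration/0 to migration/3 kernel threads in that listing). The function names are made up for illustration.

```python
from collections import defaultdict

# Process rows copied from the top snapshot in post #1.
TOP_ROWS = """\
24484 apache 19 -1 146m 60m 32m S 48.6 3.0 0:05.46 lsphp5
24494 apache 19 -1 134m 32m 15m S 30.0 1.6 0:00.90 lsphp5
24495 apache 19 -1 142m 45m 20m S 18.0 2.2 0:00.54 lsphp5
24493 apache 19 -1 136m 33m 14m S 10.7 1.6 0:00.32 lsphp5
24496 apache 19 -1 133m 32m 16m S 8.7 1.6 0:00.26 lsphp5
24361 apache 1 -19 20364 5324 1492 S 1.3 0.3 0:03.06 lshttpd
"""

CORES = 4  # assumption: inferred from migration/0..3 in the listing


def cpu_by_command(rows):
    """Sum the %CPU column (9th field) per command name (12th field)."""
    totals = defaultdict(float)
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 12:
            continue  # skip anything that is not a full process row
        totals[fields[11]] += float(fields[8])
    return dict(totals)


def share_of_machine(percent, cores):
    """Convert a per-core percentage into a share of the whole machine."""
    return percent / cores


totals = cpu_by_command(TOP_ROWS)
for command, percent in sorted(totals.items()):
    print(f"{command}: {percent:.1f}% of one core, "
          f"{share_of_machine(percent, CORES):.1f}% of the machine")
```

For that snapshot the lsphp5 rows add up to about 116% of one core, which is roughly 29% of a four-core machine.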
 

mistwang

LiteSpeed Staff
#6
Another note on W3 Total Cache: it looks like everything is still served by PHP, just cached by memcache or something similar. That is slower than WP Super Cache. If you do not need to keep everything dynamic and have no specific need for features in W3 Total Cache, WP Super Cache is a better choice for a busy blog site. Serving a static page is 100X faster than serving a PHP page.
 
#7
Thank you mistwang! I've unset max idle time and connection keepalive timeout and set the initial request timeout to 60 per your instructions. On a busy site though, wouldn't that cause unnecessary overhead, i.e. too many idle/keep-alive connections?

It's interesting that you mentioned the two WordPress caching plugins. I was always under the impression that because memcache caches in memory, it would be faster than serving static pages. I briefly benchmarked WP Super Cache and W3 Total Cache before rolling the latter out to all our WordPress installations, and the result was in favour of W3 Total Cache. I guess I should do a more comprehensive benchmark later on, just to make sure we are on the right track. :)
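
A more comprehensive benchmark could start from something like the Python sketch below. It times repeated sequential fetches of a page and reports the median and worst latency, so the same run can be repeated with each caching plugin enabled. The URL, the run count, the timeout and the helper names are all placeholders, not anything taken from this server, and sequential requests say nothing about behaviour under concurrent load.

```python
import statistics
import time
import urllib.request


def fetch(url):
    """Download the page body; the 10-second timeout is an arbitrary choice."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def time_requests(fetch_page, url, runs=20):
    """Return per-request latencies in seconds for `runs` sequential fetches."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch_page(url)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """The median is less sensitive to one slow request than the mean."""
    return {"median": statistics.median(latencies), "max": max(latencies)}


# Example usage (hypothetical URL, run once per caching plugin):
#   print(summarize(time_requests(fetch, "http://example.com/blog/")))
```

Passing the fetch function in as an argument keeps the timing loop separate from the HTTP client, so it can be swapped for another client or a stub.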
 
#8
We have just experienced an unusually high load. There were lots of lsphp5 processes, and I'm not sure how to tackle this issue. Here's a snapshot of the processes:

 
#10
Thanks mistwang. I had looked into the possibility of a small-scale DoS attack, but at the time of the high load (which was sustained for about 10-15 minutes), requests per second were only about 50-60, which the server should have handled without such a high load. There were no cron jobs running at the time, and no errors in either the server log or the PHP log.

I will investigate more. :|
 
#11
Hi mistwang,

I think I need some more help.

Our server has just had a huge load spike. lsphp5 was taking up all the CPU and made the server unresponsive; we couldn't even ping or SSH into the server.



The server log just before the huge spike (domains are masked):

2009-09-07 03:01:52.427 [INFO] [70.32.68.11:54866-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #25, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:02:42.006 [INFO] [64.127.102.82:17873-0#APVH_xxx.xxx.xxx] Connection idle time: 11 while in state: 4 watching for event: 25,close!
2009-09-07 03:02:42.006 [NOTICE] [64.127.102.82:17873-0#APVH_xxx.xxx.xxx] Content len: 0, Request line:
2009-09-07 03:02:42.006 [NOTICE] [64.127.102.82:17873-0#APVH_xxx.xxx.xxx] Redirect: #1, URL: /blog/index.php
2009-09-07 03:05:14.699 [INFO] [62.25.109.195:4123-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #1, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:07:17.530 [INFO] [209.249.84.5:51438-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #8, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:08:19.849 [INFO] [119.107.82.136:52045-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #15, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:09:56.439 [INFO] [195.88.32.160:2184-0#APVH_xxx.xxx.xxx] Connection idle time: 11 while in state: 4 watching for event: 25,close!
2009-09-07 03:09:56.439 [NOTICE] [195.88.32.160:2184-0#APVH_xxx.xxx.xxx] Content len: 555, Request line:
POST /wp-comments-post.php HTTP/1.1


Any ideas what could have caused such a dramatic issue?
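
One way to look for a pattern in entries like these is to count the "Connection reset by peer" lines per associated lsphp5 process. The Python sketch below does that for some of the lines quoted above. The regular expression is based only on this excerpt, so it is an assumption that other LSWS log entries follow the same layout.

```python
import re
from collections import Counter

# Lines copied from the server log excerpt above (domains already masked).
LOG_EXCERPT = """\
2009-09-07 03:01:52.427 [INFO] [70.32.68.11:54866-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #25, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:02:42.006 [INFO] [64.127.102.82:17873-0#APVH_xxx.xxx.xxx] Connection idle time: 11 while in state: 4 watching for event: 25,close!
2009-09-07 03:05:14.699 [INFO] [62.25.109.195:4123-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #1, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:07:17.530 [INFO] [209.249.84.5:51438-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #8, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
2009-09-07 03:08:19.849 [INFO] [119.107.82.136:52045-0#APVH_xxx.xxx.xxx] connection to [/tmp/lshttpd/lsphp5.sock.582] on request #15, confirmed, 0, associated process: 5282, running: 1, error: Connection reset by peer!
"""

# Captures the process id on lines that end in a connection reset.
RESET_RE = re.compile(
    r"associated process: (\d+), .*error: Connection reset by peer"
)


def resets_per_process(log_text):
    """Count connection-reset entries, keyed by associated process id."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESET_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


print(resets_per_process(LOG_EXCERPT))
```

In this excerpt every reset points at the same process, 5282.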
 

auser

Super Moderator
#12
Have you tried disabling some WordPress plugins, e.g. W3 Total Cache, to see if it makes any difference?

The lsphp5 process (pid=25210) has 320% CPU usage; you can trace it to see whether it is blocked in a system call:

~>strace -c -p 25210

Run it for a few seconds, then press Ctrl+C and check the output.
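
The attach, wait, Ctrl+C sequence can also be scripted. Here is a minimal Python sketch, assuming strace is installed and the script runs as root or as the user that owns the process. The function names and the 10-second default are arbitrary choices for illustration.

```python
import signal
import subprocess
import time


def strace_command(pid):
    """Build the command shown above: -c prints a syscall summary, -p attaches."""
    return ["strace", "-c", "-p", str(pid)]


def sample_syscalls(pid, seconds=10):
    """Attach for `seconds`, send SIGINT (what Ctrl+C does), return the summary."""
    # strace writes its summary to stderr, so that is the stream captured here.
    proc = subprocess.Popen(
        strace_command(pid), stderr=subprocess.PIPE, text=True
    )
    time.sleep(seconds)
    proc.send_signal(signal.SIGINT)
    _, summary = proc.communicate()
    return summary


# Example usage with the pid from this post:
#   print(sample_syscalls(25210, seconds=5))
```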
 
#13
Hi auser, thanks for the tips. We are actually working with the author of W3 Total Cache to solve some potential issues.

Unfortunately we can't trace that pid because the server was completely dead and we had to switch it to rescue mode to bring it back. By then the process no longer existed.
 
#14
Looks like there wasn't enough memory allocated to lsphp5, I didn't realise APC shm size is counted towards the memory limit. I have increased the memory limit now and hopefully this problem will go away. :)
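
As a rough illustration of the arithmetic, assuming the APC shared memory segment counts in full against the lsphp5 memory limit as described above, the headroom left for PHP itself is the limit minus the shm size. All figures in this Python sketch are made-up examples, not the values from this server, and the function name is hypothetical.

```python
def php_headroom_mb(memory_limit_mb, apc_shm_size_mb):
    """Memory left for the PHP process once the APC segment is counted."""
    return memory_limit_mb - apc_shm_size_mb


# Hypothetical limits, each compared against a hypothetical 128 MB APC segment.
for limit in (128, 256, 512):
    headroom = php_headroom_mb(limit, apc_shm_size_mb=128)
    print(f"limit {limit} MB -> {headroom} MB left for PHP")
```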
 