Hung Process Help

GOT · Sep 24, 2016

I have over 100 LiteSpeed licenses, so this isn't my first rodeo, but I have this one setup that has given me fits for months and I would like another set of eyeballs on it.

We have two load-balanced dedicated servers running a single WordPress site, with a third as the database server.

What we are seeing is that index.php at times (once or twice a day worst case, a couple of times a week on average) gets hung up and LiteSpeed stops handling new requests. The process tree looks like this:

11899 ? S 0:05 litespeed (lshttpd)
11903 ? S 0:00 \_ httpd (lscgid)
11904 ? Sl 1:00 \_ litespeed (lshttpd)
1379 ? S 0:00 | \_ lsphp5
1387 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1490 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1529 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1542 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1556 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1560 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1564 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1610 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1622 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1710 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1758 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1771 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1772 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
1773 ? S 0:00 | \_ lsphp5:/home/ocfcom/public_html/index.php
11905 ? Sl 0:26 \_ litespeed (lshttpd)

They're running the latest version and are both 2 CPU licenses.

Here are our external app settings

http://d.pr/i/18oQO

You can see from there that I have escalated the php processes to a staggering 1500, though that seems to be getting ignored.

I'd appreciate any feedback

wanah · Sep 24, 2016

When this happens, can you still access PHP files on that account that don't contain MySQL commands (like a simple <?php echo 'hello world';)? Can you still access static files? You could also enable logging, filter by your IP, and send the result to LiteSpeed's bugs e-mail address to see if they can spot anything wrong.
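One way to run that check quickly during an incident is sketched below. It is only an illustration: the three URLs are placeholders (example.com, logo.png and hello.php are hypothetical names, not from this site), and the wording of the verdicts is my own reading of what each combination suggests.

```shell
#!/bin/sh
# Sketch: probe a static file, a plain PHP file and the WordPress front page.
# All URLs passed in are assumptions; substitute real ones for your site.

probe() {
    # Print the HTTP status code, or 000 if nothing comes back within 10 seconds.
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

verdict() {
    # $1 = status of static file, $2 = status of plain PHP file, $3 = status of WordPress page
    if [ "$1" != 200 ]; then
        echo "server is not serving static files"
    elif [ "$2" != 200 ]; then
        echo "PHP itself is not responding"
    elif [ "$3" != 200 ]; then
        echo "only WordPress is affected"
    else
        echo "all probes OK"
    fi
}

check_site() {
    # $1 = base URL, e.g. http://example.com (hypothetical)
    verdict "$(probe "$1/logo.png")" "$(probe "$1/hello.php")" "$(probe "$1/")"
}
```

Calling `check_site http://example.com` prints one line; "only WordPress is affected" would point away from the web server and toward something the WordPress code path depends on.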

Lauren · Sep 24, 2016

The ideal setup would be LSLB (LiteSpeed Load Balancer) + 2 LSWS, with the cache on LSLB. Can your WP site use the LSCache plugin? If the 2 LSWS sit together with LSLB, you can use LSLB for HTTPS termination. With the cache served from the load balancer side, you should see a big improvement.

If you need us to check your server, you can create a ticket from your account and provide login details.

GOT · Sep 24, 2016

We're not using LSLB in this case. DNSMadeEasy with fail-over. That works well.

Did you look at the settings? See anything obvious in there?

Why are there so few PHP processes? Shouldn't there be a LOT more?

I'm not the designer of the site, so I would have to talk to them about the cache plugin.

NiteWave · Sep 25, 2016

wanah said:

yes, this is what we want to know first.

GOT said:

A new PHP process will be started on demand.
If there are enough idle PHP processes available, there is no need to start a new one.

GOT · Sep 27, 2016

We had another event this morning and I have confirmed that non-WordPress PHP files are still accessible; it's just WordPress that won't load.

GOT · Sep 28, 2016

Anyone have anything for me on this?

mistwang · Sep 28, 2016

If both servers hang at the same time, it is likely that something is wrong with the MySQL DB server.
If one server hangs while the other is fine, it could be a deadlock in the PHP opcode cache.

What you can do is strace some PHP processes while the server is hung, to find the source of the hang.
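A rough sketch of how that could be scripted, so every worker gets traced at once instead of one PID at a time. Assumptions: it runs as root, the pattern `lsphp5` is taken from the process listing earlier in the thread, and the output directory and the `lsphp.<pid>.trace` file naming are made up for illustration (the directory must not contain spaces).

```shell
#!/bin/sh
# Sketch: attach strace to every process matching a pattern for a fixed time.

trace_cmd() {
    # $1 = PID, $2 = output directory, $3 = seconds to trace
    # Prints the command line so it can be reviewed before anything is run.
    printf 'timeout %s strace -f -tt -p %s -o %s/lsphp.%s.trace' "$3" "$1" "$2" "$1"
}

trace_workers() {
    # $1 = pattern for pgrep -f, $2 = output directory, $3 = seconds to trace
    mkdir -p "$2"
    for pid in $(pgrep -f "$1"); do
        # One strace per worker, all running in parallel.
        sh -c "$(trace_cmd "$pid" "$2" "$3")" &
    done
    wait
}
```

For example, `trace_workers lsphp5 /tmp/traces 15` would leave one trace file per worker. A worker that exits before strace attaches just produces an attach error for that PID. In the trace files, a process stuck on the same futex or lock call for the whole window is the thing to look for.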

GOT · Sep 28, 2016

It's definitely one server at a time. I had an incident today and I checked the database and it was fine; there were no hung queries. I also found that the processes were not actually hung. They were being terminated, with another taking each one's place almost immediately. I tried doing a trace, but each time the process was already gone before I could put a trace on it.

We are not using any opcode caching that I can find. The only thing I see is that APC is loaded but disabled:
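For anyone wanting to put a number on that kind of churn, here is a minimal sketch that compares two PID snapshots and reports how many workers disappeared in between. The function names are mine, and the pattern `lsphp5` is assumed from the process listing earlier in the thread.

```shell
#!/bin/sh
# Sketch: measure how quickly worker processes are being replaced.

snapshot() {
    # Space-separated PIDs of all processes whose command line matches $1.
    pgrep -f "$1" | tr '\n' ' '
}

gone_pids() {
    # $1 = earlier snapshot, $2 = later snapshot (both space-separated).
    # Prints each PID that was in the first snapshot but not in the second.
    for pid in $1; do
        case " $2 " in
            *" $pid "*) ;;                  # still running
            *) printf '%s\n' "$pid" ;;      # exited between snapshots
        esac
    done
}

watch_churn() {
    # $1 = pattern, $2 = seconds between snapshots; runs until interrupted.
    prev=$(snapshot "$1")
    while sleep "$2"; do
        cur=$(snapshot "$1")
        printf '%s exited=%s\n' "$(date +%T)" "$(gone_pids "$prev" "$cur" | wc -l)"
        prev=$cur
    done
}
```

Running `watch_churn lsphp5 1` during an incident and again during normal traffic gives two exit rates to compare.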

apc
APC Support => disabled

mistwang · Sep 28, 2016

Anything in error_log and stderr.log?
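A quick way to check is to count how often a suspect term shows up in each log. This is only a sketch: the log file names come from the question above, but their location depends on the install, so the paths in the usage line are placeholders.

```shell
#!/bin/sh
# Sketch: count case-insensitive mentions of a term in a log file.

count_mentions() {
    # $1 = search term, $2 = log file
    # grep -c prints the number of matching lines; it exits non-zero when that
    # number is 0, so "|| true" keeps the function from reporting a failure.
    grep -ci -- "$1" "$2" || true
}
```

Usage would look like `count_mentions apc /path/to/error_log` and `count_mentions apc /path/to/stderr.log`. A non-zero count is a cue to read the matching lines with plain `grep -i`.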

GOT · Oct 3, 2016

Well, I saw APC mentioned in the logs a number of times, and this weekend we disabled APC and did not have a single incident.

Is there something I can do/modify to make this more stable? Or is APC just inherently a problem?

mistwang · Oct 3, 2016

You can try a different opcode cache: opcache, xcache, etc.

GOT · Oct 3, 2016

Well, the problem is that they are using it mostly for its ability to cache database query data, not as an opcode cache.

If you don't know of any way to keep APC from deadlocking under LiteSpeed, I'll have to discuss it with my client.

mistwang · Oct 3, 2016

APC deadlocking is an internal APC bug, not directly related to LiteSpeed.
XCache has a data cache as well. Or use memcached.

GOT · Oct 3, 2016

I think the plugin they are using is geared toward APC. Would APCu have the same issues?

mistwang · Oct 3, 2016

You can give it a try. They share the same code base, but APCu is still maintained by its developers.
