LiteSpeed 4.2 + SuExec Daemon + CloudLinux + APC = 503 errors

#1
Having some serious troubles with the new updates. I wanted to switch over to APC & get everything squared away with the new version of LiteSpeed so we could see what kind of performance gains are in store. Unfortunately, we can't get 503 errors to stop running rampant after enabling APC.

We have tried all sorts of APC settings, LiteSpeed settings, disabling other PHP modules, etc. Nothing has worked. If APC is enabled, clients see 503 errors. This happens on heavily loaded systems as well as lightly loaded systems. After 1-2 page requests on a site, 90% of the page requests from that point forth are 503 errors.

I've tried -huge- memory settings for APC and for LSPHP. I simply can't get it to work, no matter the configuration I use. I've had to disable all opcode caching for now because of the 503 errors. Is there something I'm missing?

Here is a link to our LiteSpeed configuration (this is for PHP 5.2.17, but this same error occurs with 5.3.18 on all three of the systems we've tried it on):
https://dl.dropbox.com/u/158959/litespeed1.png

PHPInfo is at the link below, but APC is disabled for now for obvious reasons.
http://atlasmail.theservercompany.com/~atlas/phpinfo.php

The 503 errors show "Connection reset by peer!" when attempting to connect to uds://...lsphp52, with nothing else of any substance.

There is plenty of available RAM on the system when the 503 errors occur (>10GB), and I've tried huge settings (~16GB) for max RAM usage by LSPHP with no luck.

Any tips?
 
#3
So the solution is basically "turn off APC" I guess? I've done everything else on that list prior to making my post, even so far as to turn off APC until I can get some input elsewhere. I upgraded LiteSpeed & the LSAPI module for improved performance with opcode caching and the first recommendation is to disable opcode caching?
 

webizen

Well-Known Member
#4
You should have mentioned you tried that post already (to save everyone's time).

You can run it (with APC enabled) from the command line (e.g. /usr/local/lsws/fcgi-bin/lsphp52 -v) and see whether it prints any error messages. Also, in the lsphp52 external app config, lower the memory limit to 1-1.5 GB and increase the process limit to 200.
 
#5
I did basically explain everything in that post in my original post in the second paragraph. Command line PHP works just fine during the same time the 503 errors are popping up. The short of it is APC makes PHP 503 (and not 100% of the time, but -most- of the time), and without APC it works. I've tried up & down on the memory settings & connection settings with no luck (in LiteSpeed), and I've tried all sorts of APC settings, as well. This is across all of my systems, of which most are set far lower than the RAM settings you see in that screen shot. I suppose I'll try to figure this out on my own since I don't want to waste anyone's time.
 

Sindre

Well-Known Member
#7
I am struggling with the exact same problem.

LiteSpeed V6.3
PHP 5.4.19
CloudLinux
APC 3.1.13

Most of the time it works just fine, but some users report occasional 503 errors. I have also noticed this myself. At one moment "domain.com" loads just fine, then trying "www.domain.com" throws a 503 error. Restarting LS solves the problem.

I have set CloudLinux and LS limits high so that is not the problem. Nothing like max entry processes or the obvious.

I have narrowed the issue down to be either APC, suEXEC daemon mode or a combination of the two.

It is extremely frustrating but very hard to reproduce, as it only happens occasionally.
 

mistwang

LiteSpeed Staff
#8
Please check stderr.log to see if you get entries like:

Child process with pid: xxxx was killed by signal: 11, core dump: 0 (or 1)
Signal 11 is a segfault, which means PHP crashed with a memory-related problem.

Add the LSAPI_ALLOW_CORE_DUMP environment variable to the lsphp5 external app and see whether a core dump can be created; if so, "core dump: 1" will be logged. You can try to figure it out from the core dump or submit a bug report to the corresponding developers.
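For anyone who wants to tally those entries rather than eyeball them, here is a minimal Python sketch. It assumes the log lines follow the "Child process with pid: ... was killed by signal: ..., core dump: ..." format quoted above; the function name `summarize_kills` and the sample lines are made up for illustration and are not part of LiteSpeed.

```python
import re
from collections import Counter

# Pattern for the stderr.log entry format quoted above, e.g.
#   ... Child process with pid: 1234 was killed by signal: 11, core dump: 0
KILLED_RE = re.compile(
    r"Child process with pid: (\d+) was killed by signal: (\d+), core dump: ([01])"
)


def summarize_kills(lines):
    """Return (Counter of signal number -> count, entries that left a core dump)."""
    by_signal = Counter()
    with_core = 0
    for line in lines:
        match = KILLED_RE.search(line)
        if match is None:
            continue  # unrelated log line
        by_signal[int(match.group(2))] += 1
        with_core += int(match.group(3))
    return by_signal, with_core


# Made-up sample lines for illustration only (pids and timestamps are not real data).
sample = [
    "2013-01-01 00:00:01.000 [STDERR] Child process with pid: 1111 was killed by signal: 11, core dump: 0",
    "2013-01-01 00:00:02.000 [STDERR] Child process with pid: 2222 was killed by signal: 15, core dump: 0",
    "2013-01-01 00:00:03.000 [STDERR] Child process with pid: 3333 was killed by signal: 11, core dump: 1",
    "2013-01-01 00:00:04.000 [STDERR] some unrelated message",
]
by_signal, with_core = summarize_kills(sample)
for signum, count in sorted(by_signal.items()):
    print(f"signal {signum}: {count} killed")
print(f"entries with a core dump: {with_core}")
```

To run it against a real log, pass an open file object instead of the sample list, for example `summarize_kills(open(path, errors="replace"))`, where `path` is wherever stderr.log lives on your install.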
 

Sindre

Well-Known Member
#9
Thanks. I see some signal 15 entries and a lot of signal 11 entries:

Code:
2013-09-04 14:15:40.004 [STDERR] Child process with pid: 1002981 was killed by signal: 15, core dump: 0
2013-09-04 05:13:57.688 [STDERR] Child process with pid: 852318 was killed by signal: 11, core dump: 0
I also notice a lot of these:

Code:
2013-09-04 09:22:35.664 [STDERR] which: no shieldcc in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.667 [STDERR] which: no portsentry in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.670 [STDERR] which: no snort in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.673 [STDERR] which: no ossec in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.675 [STDERR] which: no lidsadm in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.678 [STDERR] which: no tcplodg in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.681 [STDERR] which: no tripwire in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.684 [STDERR] which: no sxid in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.686 [STDERR] which: no logcheck in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:22:35.689 [STDERR] which: no logwatch in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:23:37.748 [STDERR] which: no gcc in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:23:37.763 [STDERR] which: no cc in  (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:23:37.766 [STDERR] which: no ld in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:23:37.808 [STDERR] which: no ruby in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:23:37.830 [STDERR] which: no nc in (/bin:/usr/bin:/usr/local/bin)
2013-09-04 09:23:37.844 [STDERR] which: no locate in (/bin:/usr/bin:/usr/local/bin)
There are a lot more of these "which: no xxx in...". I have never seen these before. What are they? Looks a little suspicious.
 

Sindre

Well-Known Member
#11
I have confirmed I get the 503 even when APC is disabled for the particular site.

Is it possible that when a user hits the CloudLinux max CPU limit, LiteSpeed throws a 503 error? I know the way it is supposed to work is that the site should only slow down, and CloudLinux support confirmed that an error would never occur, but I am not totally convinced. When I check LVE history the accounts do reach the max CPU allowance around the same time the 503 error occurs.

mistwang: can you confirm that maxing out the CPU limit will not cause 503 errors from LiteSpeed?
 

mistwang

LiteSpeed Staff
#14
That's due to the LVE process limit being reached, not the CPU limit. That is still pending due to the way LiteSpeed communicates with CL LVE.

If you can reliably reproduce this issue, you can try to strace the PHP process that is handling the request. Since it runs for a while, it is not difficult to locate that process in the "ps" output.
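One rough way to shortlist candidate processes is sketched below in Python. It assumes the PHP processes have "lsphp" in their command line and uses the elapsed time reported by `ps` as a proxy; elapsed time is the age of the process, not of the request, so treat the result as a list of suspects rather than an answer. The helper names and the 30-second threshold are hypothetical.

```python
import subprocess


def parse_etime(etime):
    """Convert the ps 'etime' format ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def long_running(ps_output, name="lsphp", min_seconds=30):
    """Return (pid, seconds, command) for matching processes, oldest first."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # header or malformed line
        pid, etime, args = fields
        if name in args:
            seconds = parse_etime(etime)
            if seconds >= min_seconds:
                found.append((int(pid), seconds, args))
    return sorted(found, key=lambda item: item[1], reverse=True)


def snapshot():
    """Capture the current process list in 'pid etime args' form."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `long_running(snapshot())` while a request is hanging gives a list of pids; each can then be attached to with `strace -p PID` (as root or as the owning user).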
 

Sindre

Well-Known Member
#15
Well, it does specifically mention CPU limit as well:

"The main problem here is that with a 503 error, that currently happens when a user uses too much CPU or too many entry processes, lsws restarts automaticaly."

And on the second page:

"I've just had a couple of litespeed restarts because of an account going over it's CPU limit (Cloudlinux gives an error with the user's queue waiting for CPU gets too long - an error that their apache module detects and throughs a 508 serveur busy page)."
 

mistwang

LiteSpeed Staff
#16
Unless it is from Igor@CL, I won't take this comment seriously.

Whenever LVE returns an error, it will be logged to the stderr log like this:

Pid (xxxx): enter LVE (xxxx) : ressult: xxxx
You can check /usr/local/apache/logs/stderr.log

Your 503 error is likely due to PHP crashing for some reason. It is not hard to figure out with strace and gdb if you can reliably reproduce it.
 
#17
I've now spent countless hours trying to get some form of opcode cache installed. I've tried APC, Xcache, eAccelerator and now Zend Opcache. Every time I am met with constant 503 errors. I've searched the web and these forums up and down, I've also searched through the CloudLinux forums.

For one, CloudLinux will cause LiteSpeed to trigger a 503 error when a client reaches their resource limits. Mostly this is related to entry process limits; we had to disable memory limits to test the opcode cache anyway. In light of this, I disabled automatic restart on 503 errors in LiteSpeed, since it was restarting pretty much every 3-5 minutes because of this particular issue. This in itself is confusing, but anyway, on to the matter at hand.

I have tried setting memory limits across the board inside LiteSpeed to 60GB, I've also tried sane limits. I've tried limits that work without any opcode cache installed. I've tried disabling suhosin. I've tried increasing process limits in LiteSpeed, I've configured LSPHP for max connections = PHP_LSAPI_CHILDREN and instances = 1, I have SuExec Daemon in use, and aside from changing these settings (other than to increase max connections & php_lsapi_children) I have tried pretty much everything imaginable as far as LiteSpeed configuration is concerned.

I've tried high memory settings, regular memory settings, and low memory settings, and nothing allows an opcode cache to work for us without constantly throwing 503 errors. Keep in mind we have disabled all memory restrictions in CloudLinux (we did this after a few other failed attempts at enabling an opcode cache on the servers, and it helped a little bit). Maybe the opcode cache doesn't play nice with other plugins, but I've tried 4 different opcode caches with the same results at this point, so I don't believe it is a compatibility issue.

I need this to work. If someone can shed any insight on the issue, I would be grateful.
 

NiteWave

Administrator
#18
Is the server for shared hosting running cPanel, or is it a dedicated server?
Which PHP extensions are in use?

You can enable core dumps for lsphp so you can debug it further.
The LSAPI environment documentation follows:

Configuration

There are a few environment variables that can be tweaked to control the behavior of LSAPI application.

LSAPI_CHILDREN or PHP_LSAPI_CHILDREN (default: 0)
There are two ways to let PHP handle multiple requests concurrently: Server Managed Mode and Self Managed Mode. In Server Managed Mode, LiteSpeed web server dynamically spawns/stops PHP processes; in this mode "Instances" should match the "Max Connections" configuration for the PHP external application. To start PHP in Self Managed Mode, "Instances" should be set to "1", while the "LSAPI_CHILDREN" environment variable should be set to match the value of "Max Connections" and be >1. The web server will start one PHP process, and this process will start/stop child PHP processes dynamically on demand. If "LSAPI_CHILDREN" <=1, PHP will be started in Server Managed Mode.
Self Managed Mode is preferred because all PHP processes can share one shared memory block for the opcode cache.
Usually, there is no need to set value of LSAPI_CHILDREN over 100 in most server environment.

LSAPI_AVOID_FORK (default: 0)
LSAPI_AVOID_FORK specifies the policy of the internal process manager in "Self Managed Mode". When set to 0, the internal process manager will stop and start child processes on demand to save system resources. This is preferred in a shared hosting environment. When set to 1, the internal process manager will try to avoid frequently stopping and starting child processes. This might be preferred in a dedicated hosting environment.

LSAPI_EXTRA_CHILDREN (default: 1/3 of LSAPI_CHILDREN or 0)
LSAPI_EXTRA_CHILDREN controls the maximum number of extra child processes that can be started when some or all existing child processes are in a malfunctioning state. The total number of child processes will be reduced to the LSAPI_CHILDREN level as soon as service is back to normal. When LSAPI_AVOID_FORK is set to 0, the default value is 1/3 of LSAPI_CHILDREN. When LSAPI_AVOID_FORK is set to 1, the default value is 0.

LSAPI_MAX_REQS or PHP_LSAPI_MAX_REQUESTS (default value: 10000)
This controls how many requests each child process will handle before it exits automatically. Several PHP functions have been identified as having memory leaks. This parameter can help reduce the memory usage of leaky PHP functions.

LSAPI_MAX_IDLE (default value: 300 seconds)
In Self Managed Mode, LSAPI_MAX_IDLE controls how long an idle child process will wait for a new request before it exits. This option helps release system resources taken by idle processes.

LSAPI_MAX_IDLE_CHILDREN (default value: 1/3 of LSAPI_CHILDREN or LSAPI_CHILDREN)
In Self Managed Mode, LSAPI_MAX_IDLE_CHILDREN controls how many idle child processes are allowed. Excess idle child processes will be killed by the parent process immediately. When LSAPI_AVOID_FORK is set to 0, the default value is 1/3 of LSAPI_CHILDREN. When LSAPI_AVOID_FORK is set to 1, the default value is LSAPI_CHILDREN.

LSAPI_MAX_PROCESS_TIME (default value: 300 seconds)
In Self Managed Mode, LSAPI_MAX_PROCESS_TIME controls the maximum time allowed for processing a request. If a child process cannot finish processing a request in the given time period, it will be killed by the parent process. This option can help get rid of dead or runaway child processes.

LSAPI_PGRP_MAX_IDLE (default value: FOREVER )
In Self Managed Mode, LSAPI_PGRP_MAX_IDLE controls how long the parent process will wait before exiting when there is no child process. This option helps release system resources taken by an idle parent process.

LSAPI_PPID_NO_CHECK
By default an LSAPI application checks for the existence of its parent process and exits automatically if the parent process has died. This is to reduce orphan processes when the web server is restarted. However, it is sometimes desirable to disable this feature, for example when an LSAPI process was started manually from the command line. LSAPI_PPID_NO_CHECK should be set when you want to disable the check for the parent process. When PHP is started with the "-b" option, the check is disabled automatically.

LSAPI_ALLOW_CORE_DUMP
By default an LSAPI application will not leave a core dump file when it crashes. If you want LSAPI PHP to dump a core file, you should set this environment variable. If it is set, regardless of the value it has been set to, core files will be created under the directory that the PHP script is in.

LSAPI_ACCEPT_NOTIFY
By default a LSAPI application will send back a notification packet whenever a request has been received. Since PHP LSAPI 5.0 release, it can be changed to only notify server for newly established connection by setting this option to '1'. It is recommended for LiteSpeed Web Server 4.1 release and later to gain more performance.

LSAPI_SLOW_REQ_MSECS
By setting it to a non-zero number, LiteSpeed web server will log requests into standard error log file if a request takes longer than the specified milliseconds. It can help identify the script that slows down your server.
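As a rough aid for checking a configuration against the rules quoted above, here is a small Python sketch. It only encodes what the quoted documentation states (mode selection by LSAPI_CHILDREN, the "Instances" / "Max Connections" relationship, and the two defaults that depend on LSAPI_AVOID_FORK). The function names are hypothetical, and rounding "1/3" down with integer division is an assumption, since the documentation does not say how it rounds.

```python
def lsapi_mode(children):
    """Self Managed Mode requires LSAPI_CHILDREN > 1; otherwise Server Managed."""
    return "self-managed" if children > 1 else "server-managed"


def derived_defaults(children, avoid_fork=0):
    """Defaults for the two settings that depend on LSAPI_AVOID_FORK."""
    if avoid_fork:
        return {"LSAPI_EXTRA_CHILDREN": 0, "LSAPI_MAX_IDLE_CHILDREN": children}
    third = children // 3  # assumption: "1/3" is rounded down
    return {"LSAPI_EXTRA_CHILDREN": third, "LSAPI_MAX_IDLE_CHILDREN": third}


def check_extapp(instances, max_connections, children):
    """Return a list of mismatches between external app settings and the quoted rules."""
    problems = []
    if lsapi_mode(children) == "self-managed":
        if instances != 1:
            problems.append("Self Managed Mode: Instances should be 1")
        if children != max_connections:
            problems.append("Self Managed Mode: LSAPI_CHILDREN should match Max Connections")
    elif instances != max_connections:
        problems.append("Server Managed Mode: Instances should match Max Connections")
    return problems


# Example values chosen for illustration only.
print(lsapi_mode(35))
print(derived_defaults(30, avoid_fork=0))
print(check_extapp(instances=5, max_connections=35, children=35))
```

The check is deliberately narrow: it says nothing about whether a given memory or process limit is sensible, only whether the mode-related settings agree with each other.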
 
#19
Yes, shared hosting dedicated server with cPanel and CloudLinux. I guess the next time I want to spend a few hours cleaning up core dumps afterwards I will try what you have suggested.
 

wanah

Well-Known Member
#20
We tried APC some time ago and got 503 errors, same with XCache and even some (but fewer) with Zend OPcache.

We've also had 503 errors and LiteSpeed restarting when CloudLinux gives off an error because of either maximum processes being hit or when CPU goes well over the user's limit (when the CPU queue gets too long for that user).

Most often the users' process limits are hit first, but we have increased ours and in some cases users can trigger a 503 error without hitting their process limit. The issue here is not that there is an error, as CloudLinux is designed to give off an error in these situations, but that LiteSpeed doesn't recognise this error and restarts itself.

As for the opcode cache issues: from playing around with the different cache systems, it seems that too much cache slows things down and can cause PHP to crash when it takes too long to empty the cache pool.

However, if the cache is too small then you don't gain much either, as on a server with a large number of sites you end up not having the same pages cached and emptying the cache all the time.

We're going to give eAccelerator a trial, but last time PHP refused to load completely when it was enabled; I guess a setting was needed in CloudLinux CageFS that we missed.
 