Troubleshooting 503 Errors with LiteSpeed Web Server

503 errors are often caused by a malfunction in PHP. This wiki will go over basic steps to troubleshoot 503 errors, some common causes of 503 errors, and some examples that show the steps in practice.

For native LSWS setups, the error log is usually named error.log. It may be named error_log if you are using WHM/cPanel.

The error log helps you identify what kind of problem is occurring and when the problem is occurring.

Here is an example of LiteSpeed Web Server error log entries showing a 503 error:

2014-01-28 10:05:51.751 [INFO] [lsphp5] PID: 45143, add child process pid: 45175, procinfo: 0x15656f0
2014-01-28 10:05:51.790 [INFO] Remove pid: 45175, exitcode: 255
2014-01-28 10:05:51.790 [INFO] [66.249.64.1:14152-0#APVH_lsws.com] connection to [/tmp/lshttpd/lsphp5.sock] on request #0, confirmed, 0, associated process: -1, running: 0, error: Connection reset by peer!
2014-01-28 10:05:51.791 [INFO] [lsphp5] PID: 45143, add child process pid: 45177, procinfo: 0x15656f0
2014-01-28 10:05:51.829 [INFO] Remove pid: 45177, exitcode: 255
2014-01-28 10:05:51.829 [INFO] [66.249.64.1:14152-0#APVH_lsws.com] connection to [/tmp/lshttpd/lsphp5.sock] on request #0, confirmed, 0, associated process: -1, running: 0, error: Connection reset by peer!
2014-01-28 10:05:51.831 [INFO] [lsphp5] PID: 45143, add child process pid: 45179, procinfo: 0x15656f0
2014-01-28 10:05:51.869 [INFO] Remove pid: 45179, exitcode: 255
2014-01-28 10:05:51.869 [INFO] [66.249.64.1:14152-0#APVH_lsws.com] connection to [/tmp/lshttpd/lsphp5.sock] on request #0, confirmed, 0, associated process: -1, running: 0, error: Connection reset by peer!
2014-01-28 10:05:51.869 [NOTICE] [66.249.64.1:14152-0#APVH_lsws.com] Max retries has been reached, 503!
2014-01-28 10:05:51.869 [NOTICE] [66.249.64.1:14152-0#APVH_lsws.com] oops! 503 Service Unavailable^M

Interpreting these entries:

The time stamp - We can use this time stamp later to find related entries in other logs.
“Remove pid: 45175, exitcode: 255” - This line shows the PID of the process that died and its exit code. The presence of an exit code also tells us that the process did not crash. If it had crashed, we would have seen an entry telling us the process had been “killed by signal” instead.
“error: Connection reset by peer!” - This line tells you that the connection to the PHP process was closed from the PHP side before a response came back, usually because the process exited.
“confirmed” - The confirmed value tells you whether, in the error reported, the PHP process took the request. Possible values are 0 and 1. 0 means that the PHP process died before taking the request. This suggests that it ran into the problem while parsing the php.ini.
“associated process” - The associated process value gives you a PID. You can use this PID to find the process with a problem. If the value is -1, then LSWS does not have its PID, probably because it died before taking the request.
“Max retries has been reached, 503!” - In most cases, LSWS will retry when it gets an error. After three “Connection reset by peer!” errors, LSWS will then return a 503 error page. This rule of three retries generally holds true. With a POST request, however, if the confirmed value is 1 (meaning the error occurred after the request was taken), LSWS will return a 503 after one try.

From these entries, we can tell that the above error did not cause a crash. We know what time the error occurred and to which process. We also know that the problem is probably in the php.ini, since the error occurred before the process took the request.
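The fields described above can also be pulled out of the log programmatically. The following is a minimal sketch, assuming the log lines look exactly like the sample entries shown earlier; the function name summarize and the regular expressions are illustrative and not part of LSWS, and other LSWS versions may format these lines differently.

```python
import re

# Patterns derived only from the sample entries in this document.
EXIT_RE = re.compile(r"Remove pid: (\d+), exitcode: (-?\d+)")
CONN_RE = re.compile(
    r"confirmed, (\d+), associated process: (-?\d+), running: \d+, error: (.+)$"
)


def summarize(lines):
    """Collect process exits, connection errors, and 503s from error-log lines."""
    summary = {"exits": [], "errors": [], "got_503": False}
    for line in lines:
        timestamp = line[:23]  # "YYYY-MM-DD HH:MM:SS.mmm"
        m = EXIT_RE.search(line)
        if m:
            summary["exits"].append((timestamp, int(m.group(1)), int(m.group(2))))
        m = CONN_RE.search(line)
        if m:
            summary["errors"].append({
                "time": timestamp,
                "confirmed": int(m.group(1)),  # 0: died before taking the request
                "pid": int(m.group(2)),        # -1: LSWS does not know the PID
                "error": m.group(3).strip(),
            })
        if "503 Service Unavailable" in line:
            summary["got_503"] = True
    return summary


sample = [
    "2014-01-28 10:05:51.790 [INFO] Remove pid: 45175, exitcode: 255",
    "2014-01-28 10:05:51.790 [INFO] [66.249.64.1:14152-0#APVH_lsws.com] "
    "connection to [/tmp/lshttpd/lsphp5.sock] on request #0, confirmed, 0, "
    "associated process: -1, running: 0, error: Connection reset by peer!",
    "2014-01-28 10:05:51.869 [NOTICE] [66.249.64.1:14152-0#APVH_lsws.com] "
    "oops! 503 Service Unavailable",
]
result = summarize(sample)
print(result["exits"])
print(result["errors"][0]["confirmed"], result["errors"][0]["pid"])
```

A confirmed value of 0 together with an exit code (rather than a signal) is the combination that points toward php.ini, as explained above.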

The stderr.log logs errors from the standard error stream. This log can give you additional information about errors that occurred. Using the time stamps and PIDs you've gotten from the error log, you may be able to find relevant errors in the stderr.log. The stderr.log can usually be found in the same directory as the error log.

Here are some sample entries we might find in stderr.log:

2014-01-28 10:05:51.789 [STDERR] PHP Fatal error:  Directive 'register_globals' is no longer available in PHP in Unknown on line 0
2014-01-28 10:05:51.828 [STDERR] PHP Fatal error:  Directive 'register_globals' is no longer available in PHP in Unknown on line 0
2014-01-28 10:05:51.867 [STDERR] PHP Fatal error:  Directive 'register_globals' is no longer available in PHP in Unknown on line 0

In the entries above, the error is caused by the register_globals directive, which this PHP version no longer supports. Removing that directive from php.ini fixes the problem.

If you cannot find errors in stderr.log, you may have to resort to testing common causes of PHP errors (addressed below).
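Matching time stamps between the two logs by eye gets tedious on a busy server. Here is a rough sketch of the idea, assuming both logs start each line with the same "YYYY-MM-DD HH:MM:SS.mmm" time stamp as in the samples above. The helper names and the 10 ms default window are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' time stamp of a log line."""
    return datetime.strptime(line[:23], TS_FORMAT)


def stderr_near(stderr_lines, event_line, window_ms=10):
    """Return stderr.log lines logged within window_ms of an error-log entry."""
    event_time = parse_ts(event_line)
    window = timedelta(milliseconds=window_ms)
    return [l for l in stderr_lines if abs(parse_ts(l) - event_time) <= window]


exit_entry = "2014-01-28 10:05:51.790 [INFO] Remove pid: 45175, exitcode: 255"
stderr_log = [
    "2014-01-28 10:05:51.789 [STDERR] PHP Fatal error:  Directive 'register_globals' is no longer available in PHP in Unknown on line 0",
    "2014-01-28 10:05:51.828 [STDERR] PHP Fatal error:  Directive 'register_globals' is no longer available in PHP in Unknown on line 0",
    "2014-01-28 10:05:51.867 [STDERR] PHP Fatal error:  Directive 'register_globals' is no longer available in PHP in Unknown on line 0",
]
matches = stderr_near(stderr_log, exit_entry)
for line in matches:
    print(line)
```

With the sample data, only the 10:05:51.789 entry falls inside the window around the exit of PID 45175. Widening the window trades precision for a better chance of catching related messages.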

If you find that PHP has crashed (as demonstrated by the process being “killed by signal”), a core dump will allow you to look further into the cause of the crash.

To enable a core dump, add the environment value

LSAPI_ALLOW_CORE_DUMP=1

to your external application settings (WebAdmin console > Configuration > External App). Next time the application crashes, a core dump will be generated. The core file created can usually be found in the directory holding the PHP script affected.

Your Red Hat/CentOS system may already use ABRT (Automatic Bug Reporting Tool) to generate core dump files. In this case, core files may be located in /var/spool/abrt/. ABRT configuration files are normally located in /etc/abrt/. Make sure “ProcessUnpackaged = yes” is set in /etc/abrt/abrt-action-save-package-data.conf to enable the core dump. Otherwise, the core file may not be generated even if the web server error log says it was. To disable core dumps after debugging, change the “ProcessUnpackaged” setting back to “no”.

The kernel setting core_pattern specifies the naming pattern for core dump files. To see the current pattern, run:

cat /proc/sys/kernel/core_pattern

This prints the template used for the output filename.

If the first character of the pattern is a '|', the kernel will treat the rest of the pattern as a command to run. The core dump will be written to the standard input of that program instead of to a file.

By default, /proc/sys/kernel/core_pattern contains the string “core”, and the kernel writes core.* files to the crashed process's current directory.

ABRT's C/C++ hook overrides this with:

|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e

This results in the kernel calling abrt-hook-ccpp whenever a process dumps core.
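The rule about the leading '|' can be expressed in a few lines. This is only a sketch of how to interpret a pattern string. The function name describe_core_pattern is made up for this example, and the sample strings are the two values mentioned above.

```python
def describe_core_pattern(pattern):
    """Classify a core_pattern value as a pipe to a program or a file template."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # Everything after '|' is a command line; the first word is the program.
        parts = pattern[1:].split()
        return ("pipe", parts[0] if parts else "")
    return ("file", pattern)


# On a live system the value could be read like this (not executed here):
#     with open("/proc/sys/kernel/core_pattern") as f:
#         kind, target = describe_core_pattern(f.read())

print(describe_core_pattern("core"))
print(describe_core_pattern("|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e"))
```

If the result is a pipe, look for the core file wherever the receiving program stores it (for ABRT, /var/spool/abrt/ as noted above) rather than in the script's directory.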

Note: Opcode caches are frequently a cause of PHP crashes. If you find that PHP has crashed, you may want to try turning off any opcode caching you have. This is addressed further below.

GNU Debugger (GDB) uses the syntax gdb <path/to/lsphp/binary> <path/to/core/file>. Your LSPHP binary can usually be found in /usr/local/lsws/fcgi-bin. If you installed LSPHP through the LiteSpeed Repository, the LSPHP binary is usually located at /usr/local/lsws/lsphp5x/lsphp, where 'x' is the PHP 5 minor version (3, 4, 5, or 6).

Once you have opened the core file with GDB, use the bt command to print a backtrace of the stack (the chain of function calls the process made leading up to the crash). This will often reveal what caused the crash.
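If you need backtraces from several core files, GDB can be run non-interactively. The sketch below only builds the command line. It uses GDB's -batch and -ex options to run bt and exit. The binary path is taken from the lfd.log sample later on this page, and the core file path is a placeholder you would replace with your own.

```python
import shlex


def gdb_backtrace_command(lsphp_binary, core_file):
    """Build a gdb command that prints a backtrace from a core file and exits."""
    return ["gdb", "-batch", "-ex", "bt", lsphp_binary, core_file]


cmd = gdb_backtrace_command(
    "/usr/local/lsws/fcgi-bin/lsphp-5.4.42",  # example binary path
    "/path/to/core/file",                      # placeholder core file path
)
print(shlex.join(cmd))

# To actually run it on a server with gdb installed, something like:
#     import subprocess
#     output = subprocess.run(cmd, capture_output=True, text=True).stdout
```

Returning the command as a list rather than a single string avoids quoting problems if a path contains spaces.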

Here is a sample backtrace from a program that crashed:

Program received signal SIGSEGV, Segmentation fault.
0x000000000061c91b in ?? ()
(gdb) bt
#0 0x000000000061c91b in ?? ()
#1 0x0000000000641ac3 in zend_stack_push ()
#2 0x000000000060bbf9 in ?? ()
#3 0x0000000000611515 in lex_scan ()
#4 0x000000000061fb60 in ?? ()
#5 0x0000000000608223 in ?? ()
#6 0x0000000000614965 in compile_file ()
#7 0x00007fd3c99eda21 in ?? () from /opt/alt/php55/usr/lib64/php/modules/phar.so
#8 0x00007fd3d0993359 in ?? () from /opt/alt/php55/usr/lib64/php/modules/opcache.so
#9 0x00007fd3d0994187 in ?? () from /opt/alt/php55/usr/lib64/php/modules/opcache.so
#10 0x00000000006b54da in ?? ()
#11 0x00000000006b68e8 in execute_ex ()
#12 0x000000000064239c in zend_execute_scripts ()
#13 0x00000000005e2fb0 in php_execute_script ()
#14 0x00000000006f29ff in ?? ()
#15 0x00000000006f2c5c in ?? ()
#16 0x00000000006f2f85 in ?? ()
#17 0x00007fd3d37cdd1d in __libc_start_main () from /lib64/libc.so.6
#18 0x0000000000424bc9 in _start ()

This backtrace shows that the program crashed soon after passing through the opcode cache (the opcache.so frames), suggesting that the issue is with the opcode cache. The user can then try upgrading the opcode cache or changing their PHP version, or disabling the opcode cache if neither of those works.
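In a long backtrace, the quickest hint is often which shared modules appear in the frames. The following sketch counts them, assuming frames are formatted like the GDB output above. The function name and regular expression are illustrative, and a module appearing in the trace is a lead to investigate, not proof of the cause.

```python
import re
from collections import Counter

# Matches frames such as: "#8 0x00007fd3d0993359 in ?? () from /path/opcache.so"
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-f]+ in \S+ \(\)(?: from (\S+))?")


def modules_in_backtrace(text):
    """Count how many frames come from each shared library in a GDB backtrace."""
    counts = Counter()
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group(1):
            counts[m.group(1).rsplit("/", 1)[-1]] += 1
    return counts


backtrace = """\
#6 0x0000000000614965 in compile_file ()
#7 0x00007fd3c99eda21 in ?? () from /opt/alt/php55/usr/lib64/php/modules/phar.so
#8 0x00007fd3d0993359 in ?? () from /opt/alt/php55/usr/lib64/php/modules/opcache.so
#9 0x00007fd3d0994187 in ?? () from /opt/alt/php55/usr/lib64/php/modules/opcache.so
#17 0x00007fd3d37cdd1d in __libc_start_main () from /lib64/libc.so.6
"""
counts = modules_in_backtrace(backtrace)
print(counts.most_common())
```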

Faulty configurations or directives in your php.ini can cause a fatal error and make the process exit right at the beginning.

php.ini problems generally show an “error: Connection reset by peer!” error and have a confirmed value of 0, meaning that the process never took the request. These errors will often be explained in your stderr.log. They may require that you comment out bad directives or fix faulty configurations.

PHP will return an error if one of your modules uses an API that does not match your PHP version. The following is an example of an error shown when there is a binary-module mismatch:

Warning: PHP Startup: imap: Unable to initialize module
Module compiled with module API=20090626
PHP    compiled with module API=20100525
These options need to match in Unknown on line 0

To fix this, you will need to rebuild the module, PHP, or both, making sure that you are using a version of PHP that works with the module. Also, make sure that the correct extension path is used.
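When scanning many log files for this kind of mismatch, it can help to extract the two API numbers automatically. This is a small sketch that assumes the message has the same wording as the example above. The function name parse_api_mismatch is hypothetical.

```python
import re


def parse_api_mismatch(message):
    """Extract module name and both API numbers from a PHP startup warning."""
    module = re.search(r"PHP Startup: (\S+): Unable to initialize module", message)
    module_api = re.search(r"Module compiled with module API=(\d+)", message)
    php_api = re.search(r"PHP\s+compiled with module API=(\d+)", message)
    if not (module and module_api and php_api):
        return None  # not a mismatch message in the expected format
    return {
        "module": module.group(1),
        "module_api": int(module_api.group(1)),
        "php_api": int(php_api.group(1)),
    }


warning = """\
Warning: PHP Startup: imap: Unable to initialize module
Module compiled with module API=20090626
PHP    compiled with module API=20100525
These options need to match in Unknown on line 0
"""
info = parse_api_mismatch(warning)
print(info)
```

In the example, the imap module was built against API 20090626 while PHP itself uses 20100525, so the module needs to be rebuilt against the running PHP.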

The different opcode caches often have compatibility issues with PHP. These issues may be uncovered when looking through the backtrace of a PHP crash. Often, if you are getting a 503 error, and especially if you see PHP crashing, it may be a good idea to try turning off your opcode cache to see if it solves the problem.

To turn off opcode caching, look for a line with “apc.so”, “xcache.so”, or “eaccelerator.so” in your php.ini files and comment it out. Restart LSWS and try the page in question again.

Note that only one opcode cache can be loaded at a time. They don't work together.
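Commenting out the opcode cache line can be sketched as a text transformation on the php.ini contents. This example works on a string and does not touch any file. The function name is made up, and the list of file names is the one given above. Back up your real php.ini before editing it by any method.

```python
import re

OPCODE_CACHES = ("apc.so", "xcache.so", "eaccelerator.so")
LOAD_RE = re.compile(r"^\s*(zend_extension|extension)\s*=\s*(.+?)\s*$")


def disable_opcode_caches(ini_text):
    """Return php.ini text with opcode cache load lines commented out with ';'."""
    out = []
    for line in ini_text.splitlines():
        m = LOAD_RE.match(line)  # lines already commented with ';' do not match
        if m:
            target = m.group(2).strip("\"'")
            if target.rsplit("/", 1)[-1] in OPCODE_CACHES:
                line = ";" + line
        out.append(line)
    return "\n".join(out)


ini = "extension=apc.so\nextension=pdo.so\n;extension=xcache.so"
print(disable_opcode_caches(ini))
```

After changing the real file, restart LSWS as described above so that new PHP processes pick up the change.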

If you find that the opcode cache is causing the error, you can try upgrading your version of the opcode cache or using a different version of PHP. If that does not work, disable the opcode cache and possibly try a different opcode cache. You may also want to submit a bug report to the opcode cache developer.

Third-party modules often have compatibility issues as well. To turn off a module, comment out the line that loads the extension in your php.ini files. Restart LSWS and try the page in question again.

Just as with opcode caches, if you find that a third party module is causing the error, you can try upgrading your version of the module or using a different version of PHP. If that does not work, disable the module and consider submitting a bug report to the module's developer.

Sometimes, the module loading order makes a difference. Shuffling the order that modules are listed in your php.ini has been known to fix issues.

zend_extension is used for Zend's own extensions, such as frameworks or optimizers (like ionCube, ZendGuardLoader, or ZendOptimizer). extension is for everything else, such as PEAR, PECL, etc.

For Zend's own extensions, the syntax zend_extension=/path/to/extensions.so must be used in the php.ini.

For other extensions, a number of different possibilities might work:

Try zend_extension=/full/path/to/extension.so
Try extension=extension.so. “extension.so” is the extension to be loaded, such as pdo.so, or pdo_mysql.so. When you use this syntax, make sure extension_dir is defined in the php.ini with the full path to the directory where all the extensions are found.
Comment out the extension_dir line to let PHP pick a default.
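The combinations above can be checked mechanically before restarting the server. The sketch below is an assumption-laden lint and not an official tool. It flags zend_extension lines without a full path and bare extension lines when extension_dir is not set, following the guidance on this page. A flagged line is something to review and is not necessarily an error.

```python
import re

DIRECTIVE_RE = re.compile(
    r"^\s*(zend_extension|extension_dir|extension)\s*=\s*(.+?)\s*$"
)


def check_extension_lines(ini_text):
    """Return a list of notes about how php.ini loads its extensions."""
    has_extension_dir = False
    loads = []
    for line in ini_text.splitlines():
        m = DIRECTIVE_RE.match(line)  # commented lines (';') do not match
        if not m:
            continue
        key, value = m.group(1), m.group(2).strip("\"'")
        if key == "extension_dir":
            has_extension_dir = True
        else:
            loads.append((key, value))
    notes = []
    for key, value in loads:
        if key == "zend_extension" and not value.startswith("/"):
            notes.append(f"zend_extension={value}: expected a full path")
        if key == "extension" and "/" not in value and not has_extension_dir:
            notes.append(f"extension={value}: extension_dir not set, PHP default is used")
    return notes


ini = 'extension_dir = "/usr/lib/php/modules"\nzend_extension=ioncube_loader.so\nextension=pdo.so'
print(check_extension_lines(ini))
```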

If a PHP process is killed by an administrator or a process monitoring daemon while a script is executing, the request will simply result in a 503 error.

An Example:

On a WHM/cPanel server with the “ConfigServer Security & Firewall” plugin (CSF/LFD) installed, the plugin killed the lsphp5 process from time to time, resulting in 503 errors.

/etc/csf/csf.conf :

# This User Process Tracking option sends an alert if any cPanel user process
# exceeds the time usage set (seconds). To ignore specific processes or users
# use csf.pignore
#
# Set to 0 to disable this feature
PT_USERTIME = "1800"

So if an lsphp5 process has run for 1800 seconds (30 minutes), it may be caught by CSF and killed. In PHP suEXEC Daemon mode or ProcessGroup mode, it is normal for the parent lsphp5 process to keep running for more than 30 minutes. When CSF/LFD kills an lsphp5 process, it leaves an entry in /var/log/lfd.log, like:

Jun 19 16:29:16 evo lfd[18304]: *User Processing* PID:18264 Kill:1 User:xxxxx VM:538(MB) EXE:/usr/local/lsws/fcgi-bin/lsphp-5.4.42 CMD:lsphp5

The time stamp matches the 503 error in /usr/local/apache/logs/error_log:

2015-06-19 16:29:16.370 [NOTICE] [173.245.50.197:61317-0#APVH_pingje.org] oops! 503 Service Unavailable
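The comparison between the two logs can be sketched as follows. lfd.log time stamps carry no year, so the year is passed in explicitly. The regular expression is based only on the single lfd.log sample above, the function names are illustrative, and month abbreviations are assumed to be in English.

```python
import re
from datetime import datetime

LFD_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ lfd\[\d+\]: \*User Processing\* "
    r"PID:(\d+) Kill:(\d) User:(\S+) .*EXE:(\S+)"
)


def parse_lfd_kill(line, year):
    """Parse an lfd 'User Processing' entry; returns None if it does not match."""
    m = LFD_RE.match(line)
    if m is None:
        return None
    when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return {
        "time": when,
        "pid": int(m.group(2)),
        "killed": m.group(3) == "1",
        "user": m.group(4),
        "exe": m.group(5),
    }


def same_second(lfd_entry, error_log_line):
    """True if the error-log line was written in the same second as the lfd entry."""
    error_time = datetime.strptime(error_log_line[:19], "%Y-%m-%d %H:%M:%S")
    return lfd_entry["time"] == error_time


lfd_line = ("Jun 19 16:29:16 evo lfd[18304]: *User Processing* PID:18264 Kill:1 "
            "User:xxxxx VM:538(MB) EXE:/usr/local/lsws/fcgi-bin/lsphp-5.4.42 CMD:lsphp5")
error_line = ("2015-06-19 16:29:16.370 [NOTICE] [173.245.50.197:61317-0#APVH_pingje.org] "
              "oops! 503 Service Unavailable")
entry = parse_lfd_kill(lfd_line, 2015)
print(entry["pid"], entry["killed"], same_second(entry, error_line))
```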

The fix:

Add

pexe:/usr/local/lsws/fcgi-bin/lsphp.*

to the end of /etc/csf/csf.pignore, then restart CSF/LFD:

# csf -r

Or, in WHM:

Home » Plugins » ConfigServer Security & Firewall
lfd - Login Failure Daemon
"csf.pignore, Process Tracking" Edit "lfd ignore file"
Append the following line:
pexe:/usr/local/lsws/fcgi-bin/lsphp.*
"Restart Lfd"

When you define an lsphp external application, there are two LiteSpeed-specific settings:

Memory Soft Limit (bytes) (https://www.litespeedtech.com/docs/webserver/config/extapps#memSoftLimit)
Memory Hard Limit (bytes) (https://www.litespeedtech.com/docs/webserver/config/extapps#memHardLimit)

As stated in the document: “The main purpose of this limit is to prevent excessive memory usage because of software bugs or intentional attacks, not to impose a limit on normal usage. Make sure to leave enough head room, otherwise your application may fail and 503 error may be returned.”

For example, on a shared hosting server with memory soft/hard limits set, if a PHP script in one account consumes too much memory, it will fail (returning a 503 error) and other accounts will not be affected. The ideal solution is to optimize the PHP script to consume less memory. A quick, temporary workaround is to raise the soft/hard limits to a large value (for example, 8G) to see if the 503 error goes away. Here is a use case: https://www.litespeedtech.com/support/forum/threads/solved-xenforo-rebuild-attachment-thumbnails-503.12403/

Bad Directive in php.ini

The above example uses the first two troubleshooting steps — searching the error log and stderr.log.

ionCube Loader Path not Correct

The above example shows the kind of error you might get if your extension path is not correct.

Opcode Cache Crashes PHP

The above example shows the use of GDB to show that opcode cache was causing a crash.

Opcode Cache Crashes PHP 2

The above example shows another error that was solved by turning off the opcode cache.

ZendGuardLoader Bug

The above example shows a case where shutting off ZendGuardLoader solved an issue until Zend was able to fix the bug.

ionCube Wants eAccelerator Loaded as an Extension

The above example shows how switching zend_extension for extension can sometimes solve an issue.

PHP Figures Out Extension Locations by Itself

The above example shows a case where leaving extension_dir undefined solved an issue.

Basic Troubleshooting steps

1. Check the Error Log

Interpreting these entries:

2. Check for Corresponding Entries in stderr.log

3. Enable Core Dump (or just turn off opcode caching)

4. Analyze Core File with GNU Debugger

Common Causes of 503 Errors

1. Bad php.ini

2. PHP Binary-Module Mismatch

3. Opcode Caches (APC, XCache, eAccelerator)

4. Third Party Modules (ZendGuardLoader, Suhosin, ionCube, etc.)

5. Module Loading Order

6. "zend_extension" Instead of "extension" and Vice Versa

7. lsphp Process Is Killed Unexpectedly

8. lsphp Process Hits Memory Limit

Real World Examples