Occasional 500 Internal Server Errors

Discussion in 'Bug Reports' started by alex.b, Dec 31, 2013.

  1. alex.b

    alex.b New Member

    Hi,

    Not completely sure this is a bug as it's PHP-related, but to me it looks like LiteSpeed having trouble spawning additional lsphp5 processes from time to time.

    But let me start from the beginning.

    What happens on the frontend is that a couple of times per day requests start returning "500 Internal Server Error" with a "Request Timeout" message.

    In the backend, there are a lot of lsphp5 processes, but they all seem to be stuck (0 CPU usage) on a request (see below) and eventually time out. No new lsphp5 processes are spawned although there are requests coming in - they all time out eventually, and the only thing that helps is a manual lsws restart.

    Not sure what else might be of interest, but it's a VPS server (4 CPUs) and I'm hosting one live and a staging site on it. There's APC cache in use and LiteSpeed is in ProcessGroup mode - I've tried with small (7) and larger (35) number of workers per group, but it makes no difference. I'm using phpSuExec with phpSuExecMaxConn set to 35, connTimeout is at 180 now (was 120) but again, no visible difference.

    Otherwise it works well, but I'd really like not to have to worry about the server getting stuck each day :|

    Thank you :)

    Take care,
    Alex


    The required info:

    [1] OS type and version: CENTOS 6.5 x86_64
    [2] LiteSpeed version: LiteSpeed Web Server/Enterprise/4.2.6
    [3] PHP version and interface of choice: 5.4.22 LSAPI
    [4] Other related info:
    ---
    PHP Processes in the "stuck" mode:
    Code:
    undercur  3401  0.0  0.9 353600 20484 ?        S    00:53   0:00 lsphp5                         
    undercur  3542  0.0  0.3 354612  7620 ?        S    00:54   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3543  0.0  0.3 354612  7624 ?        S    00:54   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3545  0.0  0.3 354612  7624 ?        S    00:54   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3548  0.0  0.3 354612  7656 ?        S    00:54   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3571  0.0  0.3 354612  7624 ?        S    00:55   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3572  0.0  0.3 354612  7620 ?        S    00:55   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3580  0.0  0.3 354612  7624 ?        S    00:55   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3581  0.0  0.3 354612  7624 ?        S    00:55   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3787  0.0  0.3 354612  7620 ?        S    00:55   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3797  0.0  0.3 354612  7652 ?        S    00:55   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3826  0.0  0.3 354612  7620 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3944  0.0  0.3 354612  7652 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3946  0.0  0.3 354612  7624 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3951  0.0  0.3 354612  7624 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3953  0.0  0.3 354612  7624 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3954  0.0  0.3 354612  7632 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3955  0.0  0.3 354656  7668 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/apc-ucn.php
    undercur  3970  0.0  0.3 354612  7676 ?        S    00:57   0:00 lsphp5:/undercur/public_html/wp-admin/admin-ajax.php
    undercur  3971  0.0  0.3 354612  7620 ?        S    00:57   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3972  0.0  0.3 354612  7624 ?        S    00:57   0:00 lsphp5:/home/undercur/public_html/index.php
    undercur  3973  0.0  0.3 354612  7624 ?        S    00:57   0:00 lsphp5:/home/undercur/public_html/index.php
    ---
    Uptime at the time:
    Code:
     00:57:16 up 19 days, 16:31,  1 user,  load average: 0.08, 0.24, 0.35
    ---
    Memory at the time:
    Code:
                 total       used       free     shared    buffers     cached
    Mem:          2048       1281        766          0          0        122
    -/+ buffers/cache:       1158        889
    Swap:         2048        332       1715
    ---
    Process tree at the time:
    Code:
            |-litespeed(481)-+-httpd(482)
            |                `-litespeed(483)-+-lsphp5(3401)-+-lsphp5(3542)
            |                                 |              |-lsphp5(3543)
            |                                 |              |-lsphp5(3545)
            |                                 |              |-lsphp5(3548)
            |                                 |              |-lsphp5(3571)
            |                                 |              |-lsphp5(3572)
            |                                 |              |-lsphp5(3580)
            |                                 |              |-lsphp5(3581)
            |                                 |              |-lsphp5(3787)
            |                                 |              |-lsphp5(3797)
            |                                 |              |-lsphp5(3826)
            |                                 |              |-lsphp5(3944)
            |                                 |              |-lsphp5(3946)
            |                                 |              |-lsphp5(3951)
            |                                 |              |-lsphp5(3953)
            |                                 |              |-lsphp5(3954)
            |                                 |              |-lsphp5(3955)
            |                                 |              |-lsphp5(3970)
            |                                 |              |-lsphp5(3971)
            |                                 |              |-lsphp5(3972)
            |                                 |              `-lsphp5(3973)
            |                                 |-{litespeed}(484)
            |                                 `-{litespeed}(485)
    
    ---
    ProcessGroup conf ( /usr/local/apache/conf/includes/pre_virtualhost_global.conf ):
    Code:
    <IfModule LiteSpeed>
    LSPHP_ProcessGroup on
    LSPHP_Workers 35
    </IfModule>
    ---
    [5] I cannot reproduce it intentionally. I've tried running siege with different numbers of concurrent connections, but it did not trigger the bug.
    [6]
    stderr.log:
    ---
    Code:
    2013-12-31 00:54:00.010 [STDERR] Child process with pid: 3406 was killed by signal: 15, core dump: 0
    2013-12-31 00:54:22.001 [STDERR] Child process with pid: 3424 was killed by signal: 15, core dump: 0
    ... (more "Child process killed by signal" lines) ...
    ---

    error_log:
    ---
    Code:
    2013-12-31 00:53:55.633 [INFO] [107.22.57.61:56479-0#APVH_undercurrentnews.com] connection to [/tmp/lshttpd/APVH_undercurrentnews.com_Suphp.sock] on request #0, confirmed, 0, associated process: -1, running: 0, error: Connection reset by peer!
    2013-12-31 00:53:55.676 [INFO] [APVH_undercur_Suphp:] PID: 483, add child process pid: 3401, procinfo: 0x1596bc0
    2013-12-31 00:53:59.136 [INFO] [220.255.1.62:43447-0#APVH_undercurrentnews.com] Abort request processing by PID:3406, kill: 1, begin time: 2, sent time: 2, req processed: 0
    2013-12-31 00:54:00.000 [INFO] [CLEANUP] Send signal: 10 to process: 3406
    2013-12-31 00:54:21.792 [INFO] [60.249.204.55:4208-0#APVH_undercurrentnews.com] Abort request processing by PID:3424, kill: 1, begin time: 17, sent time: 17, req processed: 0
    ... (more "[CLEANUP] Send signal" and "Abort request processing" lines) ...
    ---
    [7] No core dumps in /tmp/lshttpd/
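A process listing like the one under [4] can be summarized with a short script rather than read by eye. The following is only a minimal sketch: it assumes `ps aux`-style columns as shown above, and the names `PS_LINE` and `summarize_idle_workers` are hypothetical helpers, not part of LiteSpeed or LSAPI. Note that a sleeping process with 0.0 %CPU is not necessarily stuck; the count is a hint, not proof.

```python
import re
from collections import Counter

# Matches one `ps aux`-style line for an lsphp5 process (column layout is an
# assumption based on the listing in the post):
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
PS_LINE = re.compile(
    r"^\s*(?P<user>\S+)\s+(?P<pid>\d+)\s+(?P<cpu>[\d.]+)\s+(?P<mem>[\d.]+)\s+"
    r"\d+\s+\d+\s+\S+\s+(?P<stat>\S+)\s+\S+\s+\S+\s+"
    r"lsphp5(?::(?P<script>\S+))?\s*$"
)


def summarize_idle_workers(ps_text):
    """Count sleeping, zero-CPU lsphp5 processes per script path."""
    counts = Counter()
    for line in ps_text.splitlines():
        m = PS_LINE.match(line)
        if not m:
            continue  # not an lsphp5 line, or an unexpected layout
        if m.group("stat").startswith("S") and float(m.group("cpu")) == 0.0:
            # A process without a script suffix is treated as the group parent.
            counts[m.group("script") or "(parent)"] += 1
    return counts


# A few lines copied from the listing above.
sample = """\
undercur  3401  0.0  0.9 353600 20484 ?        S    00:53   0:00 lsphp5
undercur  3542  0.0  0.3 354612  7620 ?        S    00:54   0:00 lsphp5:/home/undercur/public_html/index.php
undercur  3543  0.0  0.3 354612  7624 ?        S    00:54   0:00 lsphp5:/home/undercur/public_html/index.php
undercur  3955  0.0  0.3 354656  7668 ?        S    00:56   0:00 lsphp5:/home/undercur/public_html/apc-ucn.php
"""

for script, n in summarize_idle_workers(sample).most_common():
    print(n, script)
```

Feeding it the output of `ps aux` captured while the 500s are happening would show which scripts the idle workers are sitting on.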
  2. NiteWave

    NiteWave Administrator

    please try to disable APC or use xcache.

    APC latest stable version may not work well with php 5.4

    is it lsapi 6.6(latest) ?
    /usr/local/lsws/fcgi-bin/lsphp5 -i |head
    can tell the version
  3. alex.b

    alex.b New Member

    Hi,

    Yes, LSAPI is 6.6 and APC is 3.1.13.

    OK, I'll give XCache a shot. Thanks :)


    Take care,
    Alex
  4. alex.b

    alex.b New Member

    Hi,

    Switching to XCache did not help - true, it was serving requests faster than APC, but again there were stuck processes and 500s.

    I've now disabled caching altogether and it seems to have done the trick - requests are being served a bit slower, more CPU is being used, but there are no more jammed requests (for now at least).

    Is there no way to use caching? From your last reply I take it that it probably has nothing to do with LiteSpeed, or am I wrong? I'd appreciate any info/suggestions...

    Thanks :)


    Take care,
    Alex
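To compare configurations (APC, XCache, no cache) on more than a "seems to have done the trick" basis, one option is to tally the "Abort request processing" lines in error_log per hour. This is a rough sketch that assumes the log line format quoted in the first post; `ABORT_LINE` and `aborts_per_hour` are hypothetical names introduced here for illustration.

```python
import re
from collections import Counter

# Captures the date and hour of each "Abort request processing" entry.
# The timestamp and message layout are assumed from the error_log excerpt.
ABORT_LINE = re.compile(
    r"^\s*(?P<hour>\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}\.\d+ \[INFO\] .*"
    r"Abort request processing by PID:\s*(?P<pid>\d+)"
)


def aborts_per_hour(log_text):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of aborted requests."""
    counts = Counter()
    for line in log_text.splitlines():
        m = ABORT_LINE.match(line)
        if m:
            counts[m.group("hour")] += 1
    return counts


# Lines copied from the error_log excerpt in the first post.
sample_log = """\
2013-12-31 00:53:59.136 [INFO] [220.255.1.62:43447-0#APVH_undercurrentnews.com] Abort request processing by PID:3406, kill: 1, begin time: 2, sent time: 2, req processed: 0
2013-12-31 00:54:00.000 [INFO] [CLEANUP] Send signal: 10 to process: 3406
2013-12-31 00:54:21.792 [INFO] [60.249.204.55:4208-0#APVH_undercurrentnews.com] Abort request processing by PID:3424, kill: 1, begin time: 17, sent time: 17, req processed: 0
"""

for hour, n in sorted(aborts_per_hour(sample_log).items()):
    print(hour, n)
```

Running this over the log for a day with caching enabled and a day without would give a like-for-like count of aborted requests.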
  5. NiteWave

    NiteWave Administrator

  6. alex.b

    alex.b New Member

    Hi,

    Thanks for the suggestion - I'll try it out.


    Take care,
    Alex
