Processes hanging

Discussion in 'General' started by GOT, Mar 12, 2014.

  1. GOT

    GOT Member

    We have a very busy WordPress site that runs on five application servers running LiteSpeed Web Server (lsws). Several times a day, we reach a point where all the PHP processes are locked and the server stops responding to new requests. For example:

    17885 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/frame_contentad.php
    17886 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17891 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17895 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17897 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17900 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17902 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17906 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17913 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/frame_contentad.php
    17915 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/frame_contentad.php
    17916 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17926 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/tlv_300x250.php
    17927 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17939 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17940 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_728x90.php
    17942 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17943 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17950 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17953 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17961 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17964 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17966 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_160x600.php
    17968 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_160x600.php
    17969 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_160x600.php
    17970 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17971 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/frame_contentad.php
    17972 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    17973 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17988 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    17994 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/tlv_300x250.php
    17995 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
    18005 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_160x600.php
    18008 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    18010 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    18011 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php

    There are hundreds of these, though. The end of the list looks like this:

    18353 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
    18354 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/tlv_300x250.php
    10342 ? S 0:00 \_ lsphp5
    13336 ? S 0:00 \_ lsphp5
    13645 ? S 0:00 \_ lsphp5
    13648 ? S 0:00 \_ lsphp5
    13649 ? S 0:00 \_ lsphp5
    13654 ? S 0:00 \_ lsphp5
    13656 ? S 0:00 \_ lsphp5
    13706 ? S 0:00 \_ lsphp5
    17836 ? S 0:00 \_ lsphp5
    17933 ? S 0:00 \_ lsphp5

    I can issue a graceful restart, which corrects the problem, but the PHP processes hang around for a while under the old lsws processes. They do go away eventually.

    Where can I look to find out why these processes are hanging, and what can I do to prevent this from happening?
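
One way to get an overview of a listing like the one above is to tally the stuck lsphp5 processes per script. The following is a minimal Python sketch, not part of the original post: the helper name `count_scripts` is hypothetical, and it assumes each process title has the form `lsphp5:<script path>` as shown in the listing. The sample lines are copied from the output above.

```python
import re
from collections import Counter

# Matches the script path that follows "lsphp5:" in the process title,
# e.g. "lsphp5:/home/mobile/public_html/index.php".
SCRIPT_RE = re.compile(r"lsphp5:(\S+)")

# A few lines copied from the process listing in the post.
SAMPLE = r"""
17885 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/frame_contentad.php
17886 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/openx_300x250.php
17891 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
17895 ? S 0:00 | \_ lsphp5:/home/mobile/public_html/index.php
10342 ? S 0:00 \_ lsphp5
"""


def count_scripts(ps_text):
    """Count lsphp5 processes per script path in process-listing text.

    Lines without a "lsphp5:<path>" title (such as the bare "lsphp5"
    parent processes) are ignored.
    """
    counts = Counter()
    for line in ps_text.splitlines():
        match = SCRIPT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Print the scripts with the most processes first.
for script, total in count_scripts(SAMPLE).most_common():
    print(total, script)
```

If one script dominates the tally, that script is a reasonable first place to look; an even spread suggests the cause is shared by all requests rather than tied to one page.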
  2. NiteWave

    NiteWave Administrator

    You can strace one of the lsphp5 processes while it is hanging to check what it is doing.
  3. GOT

    GOT Member

    This is what strace says:

    [root@app04 ~]# strace -p 24267
    Process 24267 attached - interrupt to quit
    futex(0x7f9d7b6d5088, FUTEX_WAIT, 2, NULL

    Not sure what that means though...
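
For reference, when strace attaches to a process that is already blocked inside a system call, it prints the call without a closing parenthesis or return value until the call returns. A `FUTEX_WAIT` with a `NULL` timeout sleeps until another thread or process wakes it, so a line like the one above means the process is waiting on a lock. A minimal Python sketch of that reading follows; the helper name `blocked_futex_wait` is hypothetical, and it only handles plain strace lines of the shape shown in this thread.

```python
import re

# Start of an strace line for a futex call: the futex address, then the operation.
FUTEX_RE = re.compile(r"^futex\((0x[0-9a-f]+),\s*(\w+)")

# A completed call is printed as "...) = <result>".
RESULT_RE = re.compile(r"\)\s*=\s*")


def blocked_futex_wait(strace_line):
    """Return the futex address if the line is an unfinished FUTEX_WAIT, else None."""
    line = strace_line.strip()
    match = FUTEX_RE.match(line)
    if not match:
        return None  # not a futex call at all
    address, operation = match.groups()
    finished = RESULT_RE.search(line) is not None
    # Covers FUTEX_WAIT and variants such as FUTEX_WAIT_PRIVATE.
    if operation.startswith("FUTEX_WAIT") and not finished:
        return address
    return None


# The line captured in the post: no ")" and no "= result", so still blocked.
print(blocked_futex_wait("futex(0x7f9d7b6d5088, FUTEX_WAIT, 2, NULL"))
```

Running strace against several hung PIDs and comparing the results can show whether every process is stuck in the same kind of wait, although this alone does not identify which component holds the lock.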
  4. NiteWave

    NiteWave Administrator

    A deadlock has likely occurred in the PHP process.
    Here is a link for reference:
    https://bugs.php.net/bug.php?id=65435

    What is the PHP version? Are you using APC? APC is known not to work well with PHP 5.4 and above.
    Also, please check the mysqld status when the deadlock happens:
    # mysqladmin processlist
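
To make the `mysqladmin processlist` check easier to repeat during a lockup, its output can be filtered for long-running or lock-related queries. The sketch below is illustrative only: it assumes the usual ASCII table layout with `Command`, `Time`, and `State` columns, and the function names and the 10-second threshold are hypothetical choices, not something from this thread.

```python
def parse_processlist(text):
    """Parse the ASCII table printed by `mysqladmin processlist` into dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        if header is None:
            # The first "|" line holds the column names.
            header = [cell.strip() for cell in line.strip("|").split("|")]
            continue
        # Limit the split so a "|" inside the last column stays intact.
        cells = line.strip("|").split("|", len(header) - 1)
        rows.append(dict(zip(header, (cell.strip() for cell in cells))))
    return rows


def slow_or_locked(rows, min_seconds=10):
    """Return active rows that ran for at least `min_seconds` or mention a lock."""
    flagged = []
    for row in rows:
        if row.get("Command") == "Sleep":
            continue  # idle connections are not running anything
        time_text = row.get("Time", "")
        seconds = int(time_text) if time_text.isdigit() else 0
        if seconds >= min_seconds or "lock" in row.get("State", "").lower():
            flagged.append(row)
    return flagged
```

Feeding the captured command output to `parse_processlist` and then to `slow_or_locked` gives a short list of queries worth inspecting; an empty list during a lockup would be consistent with the database not being the cause.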
  5. GOT

    GOT Member

    We are using PHP 5.3.27 and, yes, we are using APC.

    We had some database issues, but we got those cleared up about 72 hours ago. These lockups now happen on only one app server at a time, at random times. There is no evidence at this time that this is related to the database.
  6. NiteWave

    NiteWave Administrator

    Going from five problem servers to only one is already big progress. Can you try disabling APC on the problem app server?
  7. GOT

    GOT Member

    Well, the issue is not limited to any one app server in particular. Any of the five app servers might get hung this way.

    I have disabled APC before, and the problem is that the load goes through the roof when we do that.

    Should I consider a different cache, like XCache?
  8. NiteWave

    NiteWave Administrator

    Or OPcache:
    php.net/opcache
  9. GOT

    GOT Member

    OK, I have installed OPcache per your suggestion. Hopefully that takes care of it.
