LiteSpeed Support Forums


Grzegorz Derebecki 02-03-2010 12:30 PM

[RESOLVED] Restart problem (old server isn't killed)
 
After switching to LiteSpeed 4.0.12 Enterprise (I used the Standard edition before), I have a problem with restarting.

LiteSpeed version: FreeBSD 64-bit Enterprise 4.0.12

Before restarting:
Code:

root@core2:~# ps aux |grep lshttp
www    33858  0.2  0.1 11572  6504  ??  D    8:40PM  0:00.09 ./lshttpd (lshttpd.4.0.12)
root  33854  0.0  0.1 10412  5248  ??  S    8:40PM  0:00.01 ./lshttpd (lshttpd.4.0.12)

After pressing Restart in the admin panel:
Code:

root@core2:~# ps aux |grep lshttp
www    33858  4.6  0.1 13036  7536  ??  SN    8:40PM  0:04.40 ./lshttpd (lshttpd.4.0.12)
root  33854  0.0  0.1 10412  5256  ??  S    8:40PM  0:00.03 ./lshttpd (lshttpd.4.0.12)
root  34344  0.0  0.1 10412  5252  ??  S    8:41PM  0:00.01 lshttpd (lshttpd.4.0.12)
www    34346  0.0  0.1 10492  5668  ??  S    8:41PM  0:00.03 lshttpd (lshttpd.4.0.12)

And after a while (when the old server should be ready to exit):
Code:

root@core2:~# ps aux |grep lshttp
www    33858  0.0  0.1 12572  7076  ??  SN    8:40PM  0:04.42 ./lshttpd (lshttpd.4.0.12)
root  34344  0.0  0.1 10412  5252  ??  S    8:41PM  0:00.02 lshttpd (lshttpd.4.0.12)
www    34346  0.0  0.1 11636  6680  ??  S    8:41PM  0:00.48 lshttpd (lshttpd.4.0.12)

As we can see, PID 33858 is still running, and even if I send kill 33858 it does nothing. All LSAPI processes (PHP, Ruby) are also still running.

I tested this on 4.1RC2 and the problem does not occur there.

mistwang 02-03-2010 12:37 PM

Due to a cPanel 11.25 graceful restart issue, we changed the graceful restart in 4.0.12 to finish all pending requests completely. My guess is that there are still requests in progress, so the lingering process takes longer to quit.

Please run strace (truss on FreeBSD) and lsof against the running process. If there are active ESTABLISHED connections to port 80, it is still serving requests.
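
The lsof check suggested above can be scripted. The following is a minimal sketch, assuming the usual lsof row layout in which a TCP connection shows up as `TCP local:port->remote:port (STATE)`. The function name, the regular expression, and the first two sample rows are made up for illustration and are not taken from this thread.

```python
import re

# Matches the tail of an lsof TCP row, e.g.
# "TCP 192.0.2.10:80->198.51.100.7:51234 (ESTABLISHED)"
TCP_ROW = re.compile(r"\bTCP\s+(\S+):(\d+)->(\S+):(\d+)\s+\((\w+)\)")


def established_on_port(lsof_text, port=80):
    """Return (local, remote) pairs for ESTABLISHED TCP rows on a local port."""
    hits = []
    for line in lsof_text.splitlines():
        m = TCP_ROW.search(line)
        if not m:
            continue  # files, unix sockets, kqueue, UDP, listening sockets
        local_host, local_port, remote_host, remote_port, state = m.groups()
        if state == "ESTABLISHED" and int(local_port) == port:
            hits.append((f"{local_host}:{local_port}",
                         f"{remote_host}:{remote_port}"))
    return hits


# Hypothetical sample: the two TCP rows are invented; the UDP and unix rows
# only illustrate lines that the function is expected to skip.
SAMPLE = """\
lshttpd.4 84359 www  9u IPv4 0xffffff0006afd000 0t0 TCP 192.0.2.10:80->198.51.100.7:51234 (ESTABLISHED)
lshttpd.4 84359 www 10u IPv4 0xffffff0006afd111 0t0 TCP *:80 (LISTEN)
lshttpd.4 84359 www 11u IPv4 0xffffff0006afd440 0t0 UDP *:*
lshttpd.4 84359 www 13u unix 0xffffff0106aa42d0 0t0 /base/opt/lsws/admin/cgid/cgid.sock
"""

if __name__ == "__main__":
    for local, remote in established_on_port(SAMPLE):
        print(local, "<-", remote)
```

An empty result for the old process would suggest it is not holding any client connections on that port, so pending requests are probably not what keeps it alive.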

Grzegorz Derebecki 02-03-2010 04:11 PM

truss:

Code:

kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242110.972211 },0x0)                = 0 (0x0)
gettimeofday({1265242110.972268 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.071037 },0x0)                = 0 (0x0)
stat("/base/opt/lsws/logs/access.log",{ mode=-rw-r--r-- ,inode=2922266,size=11951,blksize=4096 }) = 0 (0x0)
stat("/base/opt/lsws/logs/error.log",{ mode=-rw-r--r-- ,inode=2923218,size=4336,blksize=4096 }) = 0 (0x0)
stat("/base/opt/lsws/logs/stderr.log",{ mode=-rw-r--r-- ,inode=2923034,size=3977,blksize=4096 }) = 0 (0x0)
gettimeofday({1265242111.071355 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.169718 },0x0)                = 0 (0x0)
gettimeofday({1265242111.169785 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.268581 },0x0)                = 0 (0x0)
gettimeofday({1265242111.268650 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.367619 },0x0)                = 0 (0x0)
gettimeofday({1265242111.367685 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.467696 },0x0)                = 0 (0x0)
gettimeofday({1265242111.467748 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.567390 },0x0)                = 0 (0x0)
gettimeofday({1265242111.567450 },0x0)                = 0 (0x0)
kevent(7,{},0,{},16,{0.100000000 })                = 0 (0x0)
gettimeofday({1265242111.667418 },0x0)                = 0 (0x0)
gettimeofday({1265242111.667470 },0x0)                = 0 (0x0)


lsof:

Code:

root@core2:/opt/nginx# lsof -n |grep 84359
lshttpd.4 84359    www  cwd    VDIR              0,96              3584  4616504 /tmp/lshttpd
lshttpd.4 84359    www  rtd    VDIR              0,96                512        2 /
lshttpd.4 84359    www  txt    VREG              0,96            3452719  2922218 /base/opt/lsws/bin/lshttpd.4.0.12
lshttpd.4 84359    www  txt    VREG              0,96            197944  1158833 /libexec/ld-elf.so.1
lshttpd.4 84359    www  txt    VREG              0,96              32512  2239860 /base/usr/local/lib/compat/libcrypt.so.3
lshttpd.4 84359    www  txt    VREG              0,96            122032  2244221 /base/usr/local/lib/compat/libm.so.4
lshttpd.4 84359    www  txt    VREG              0,96            1102888  2239855 /base/usr/local/lib/compat/libc.so.6
lshttpd.4 84359    www    0u    VCHR              0,29                0t0      29 /dev/null
lshttpd.4 84359    www    1u    VCHR              0,29                0t0      29 /dev/null
lshttpd.4 84359    www    2u    unix 0xffffff009fa952d0                0t0          ->0xffffff0100a32870
lshttpd.4 84359    www    3w    VREG              0,96              5004  2923218 /base/opt/lsws/logs/error.log
lshttpd.4 84359    www    4r    VCHR              0,10            0t28928      10 /dev/random
lshttpd.4 84359    www    5u    unix 0xffffff00560fb870                0t0          /tmp/lshttpd/fdb.pl_rails.sock.912
lshttpd.4 84359    www    7u  KQUEUE 0xffffff0206f4fb00                            count=0, state=0x2
lshttpd.4 84359    www  11u    IPv4 0xffffff0006afd440                0t0      UDP *:*
lshttpd.4 84359    www  12u    unix 0xffffff0100a32870                0t0          ->0xffffff009fa952d0
lshttpd.4 84359    www  13u    unix 0xffffff0106aa42d0                0t0          /base/opt/lsws/admin/cgid/cgid.sock
lshttpd.4 84359    www  15u    unix 0xffffff0056e8b000                0t0          ->0xffffff0056dfa2d0
lshttpd.4 84359    www  18u    unix 0xffffff00896495a0                0t0          ->0xffffff0213f1eb40
lshttpd.4 84359    www  20u    unix 0xffffff001692f000                0t0          ->0xffffff020bec52d0
lshttpd.4 84359    www  23u    unix 0xffffff0089836b40                0t0          /tmp/lshttpd/gfx1.fdb.pl_lsphp.sock.022
lshttpd.4 84359    www  24u    unix 0xffffff01ff15a5a0                0t0          /tmp/lshttpd/fdb.pl_lsphp.sock.583
lshttpd.4 84359    www  28u    unix 0xffffff01c786f000                0t0          ->0xffffff01530122d0
lshttpd.4 84359    www  29u    unix 0xffffff0029780000                0t0          /tmp/lshttpd/app.eurobattle.net_rails_old.sock.513
lshttpd.4 84359    www  31u    unix 0xffffff01a22b5870                0t0          ->0xffffff008c0e5000
lshttpd.4 84359    www  33w    VREG              0,96              12054  2922266 /base/opt/lsws/logs/access.log
lshttpd.4 84359    www  34w    VREG              0,96              3977  2923034 /base/opt/lsws/logs/stderr.log
lshttpd.4 84359    www  35u    unix 0xffffff0089bc2b40                0t0          /tmp/lshttpd/admin_php.sock.222
lshttpd.4 84359    www  36u    unix 0xffffff01a2a5f2d0                0t0          ->0xffffff01ff0de000
lshttpd.4 84359    www  41u    unix 0xffffff0117b822d0                0t0          ->0xffffff0106ae1870
lshttpd.4 84359    www  43u    unix 0xffffff02293e5000                0t0          ->0xffffff0117aee870
lshttpd.4 84359    www  44u    unix 0xffffff0089d75870                0t0          ->0xffffff008e8e52d0
lshttpd.4 84359    www  54u    unix 0xffffff01a2b145a0                0t0          ->0xffffff022774a5a0


Grzegorz Derebecki 02-03-2010 05:44 PM

After a few hours the old lshttpd is still running.

mistwang 02-03-2010 10:28 PM

Please try a force reinstall from the web console to apply the latest build of 4.0.12; we added some code to deal with it.

Grzegorz Derebecki 02-04-2010 02:32 AM

Quote:

Originally Posted by mistwang (Post 18410)
Please try a force reinstall from the web console to apply the latest build of 4.0.12; we added some code to deal with it.

Reinstalled (manually, because the force reinstall doesn't work).

The web server now quits, but the root process quits first; after a while the non-root server process quits too and leaves all the LSAPI processes behind.

Code:

root@core2:~# ps aux |grep RAILS|grep in4max
in4max 86743 13.5  2.2 757572 180600  ??  R    11:25AM  0:25.26 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
in4max 86746 10.9  2.2 757572 180840  ??  S    11:25AM  0:28.11 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
in4max 86745  7.1  1.8 728800 152428  ??  S    11:25AM  0:30.65 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
in4max 86744  4.6  2.2 758596 182116  ??  S    11:25AM  0:26.07 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
in4max 84884  0.0  1.8 721632 145796  ??  Ss  11:16AM  0:04.57 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 84898  0.0  2.3 767812 191056  ??  S    11:16AM  0:19.42 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 84899  0.0  2.0 738332 162800  ??  S    11:16AM  0:25.92 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 84900  0.0  2.3 762096 186568  ??  S    11:16AM  0:21.38 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 84901  0.0  1.8 727776 150796  ??  S    11:16AM  0:27.83 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 85694  0.0  1.8 721632 145796  ??  Ss  11:20AM  0:04.45 ruby: RAILS: fdb.pl (production ID:85694) (ruby)
in4max 85703  0.0  2.2 757572 180596  ??  S    11:20AM  0:23.22 ruby: RAILS: fdb.pl (production ID:85694) (ruby)
in4max 85704  0.0  2.2 757572 181160  ??  S    11:20AM  0:31.89 ruby: RAILS: fdb.pl (production ID:85694) (ruby)
in4max 85705  0.0  2.2 757572 181136  ??  S    11:20AM  0:27.93 ruby: RAILS: fdb.pl (production ID:85694) (ruby)
in4max 85706  0.0  2.2 757572 180572  ??  S    11:20AM  0:28.23 ruby: RAILS: fdb.pl (production ID:85694) (ruby)
in4max 86734  0.0  1.8 721632 145796  ??  Ss  11:25AM  0:04.42 ruby: RAILS: fdb.pl (production ID:86734) (ruby)

(Each ID belongs to one LiteSpeed server instance.)

This is how it looks after a few restarts. PHP isn't stopped either.

By the way, switching versions doesn't update the symlink to lscgid :) so afterwards we have the following (after this test I switched to RC2 for another test):
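
To see how many worker groups have piled up, the ps listing above can be tallied by its `production ID` field. This is only a sketch: it assumes, as the note above suggests, that each distinct ID corresponds to one server start, and the function name is hypothetical. The sample rows are a subset copied from the listing.

```python
import re
from collections import Counter

# The Rails workers advertise their group in the process title,
# e.g. "ruby: RAILS: fdb.pl (production ID:86734) (ruby)".
GROUP_ID = re.compile(r"\(production ID:(\d+)\)")


def workers_per_group(ps_text):
    """Count ps rows per 'production ID' value."""
    counts = Counter()
    for line in ps_text.splitlines():
        m = GROUP_ID.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return dict(counts)


# Subset of the rows posted above.
SAMPLE = """\
in4max 86743 13.5  2.2 757572 180600  ??  R    11:25AM  0:25.26 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
in4max 86746 10.9  2.2 757572 180840  ??  S    11:25AM  0:28.11 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
in4max 84884  0.0  1.8 721632 145796  ??  Ss  11:16AM  0:04.57 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 84898  0.0  2.3 767812 191056  ??  S    11:16AM  0:19.42 ruby: RAILS: fdb.pl (production ID:84884) (ruby)
in4max 85694  0.0  1.8 721632 145796  ??  Ss  11:20AM  0:04.45 ruby: RAILS: fdb.pl (production ID:85694) (ruby)
in4max 86734  0.0  1.8 721632 145796  ??  Ss  11:25AM  0:04.42 ruby: RAILS: fdb.pl (production ID:86734) (ruby)
"""

if __name__ == "__main__":
    for group, count in sorted(workers_per_group(SAMPLE).items()):
        print(f"group {group}: {count} process(es)")
```

More than one group for the same application would indicate that workers from earlier server starts were never stopped.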

Code:

www    88224  2.9  0.1 19208  9548  ??  D    11:32AM  0:02.18 lshttpd (lshttpd.4.1RC2)
root  88222  0.0  0.0 11972  4012  ??  S    11:32AM  0:00.03 lshttpd (lshttpd.4.1RC2)
root  88223  0.0  0.0  2568  716  ??  S    11:32AM  0:00.00 httpd (lscgid.4.0.12)


mistwang 02-04-2010 09:39 AM

Please check those PHP and Rails processes that refuse to quit with ktrace. LSAPI should check the ppid every second; if the ppid is -1, it means the parent process is dead and the LSAPI app should quit. I wonder if those LSAPI apps are stuck somewhere.

I will check the lscgid issue.
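
The once-per-second parent check described above might look roughly like the sketch below. It is not LSAPI's code: how LSAPI tracks the parent internally is not shown in this thread. The sketch assumes the common Unix behaviour that an orphaned process is re-parented (usually to init), so it compares `getppid()` with the value recorded at startup. All function and parameter names are hypothetical.

```python
import os
import time


def parent_changed(original_ppid, current_ppid):
    """True when the process has been re-parented, i.e. the original parent exited."""
    return current_ppid != original_ppid


def watch_parent(poll_seconds=1.0, max_polls=None,
                 getppid=os.getppid, sleep=time.sleep):
    """Poll the parent PID; return True once it changes.

    Returns False if max_polls checks pass without a change.
    getppid and sleep are injectable so the loop can be exercised in tests.
    """
    original = getppid()
    polls = 0
    while max_polls is None or polls < max_polls:
        if parent_changed(original, getppid()):
            return True
        sleep(poll_seconds)
        polls += 1
    return False


if __name__ == "__main__":
    # A worker would normally run this in the background and exit on True.
    if watch_parent(poll_seconds=1.0, max_polls=3):
        print("parent is gone, shutting down")
    else:
        print("parent still alive after 3 checks")
```

A worker that skips this check has no way to notice that the server which started it has exited.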

Grzegorz Derebecki 02-04-2010 10:22 AM

Quote:

Originally Posted by mistwang (Post 18417)
Please check those PHP and Rails processes that refuse to quit with ktrace. LSAPI should check the ppid every second; if the ppid is -1, it means the parent process is dead and the LSAPI app should quit. I wonder if those LSAPI apps are stuck somewhere.

I will check the lscgid issue.

I'm using LSAPI_PPID_NO_CHECK because of my settings in FreeBSD:
security.bsd.see_other_uids=0
security.bsd.see_other_gids=0

Here is my LSAPI config:

LSAPI_CHILDREN=4
LSAPI_AVOID_FORK=1
LSAPI_MAX_IDLE_CHILDREN=2
LSAPI_MAX_IDLE=600
LSAPI_PPID_NO_CHECK=1
LSAPI_EXTRA_CHILDREN=0

Also note that no LSAPI process is stopped (PHP or Ruby).

In 4.1RC2 everything works perfectly.

mistwang 02-04-2010 10:53 AM

Quote:

I'm using LSAPI_PPID_NO_CHECK because of my settings in FreeBSD:
security.bsd.see_other_uids=0
security.bsd.see_other_gids=0
That's why those processes do not quit when the parent process quits. We need to change how graceful shutdown works.

mistwang 02-04-2010 11:02 AM

Maybe you can try removing LSAPI_PPID_NO_CHECK. The latest LSAPI uses getppid() instead of kill( ppid, 0 ) to test whether the parent process is alive, so while the FreeBSD security setting does affect kill(), it may not affect getppid() at all.
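
The difference between the two probes can be illustrated with a short sketch. It assumes that under the restricted visibility settings kill() fails for a parent owned by another user; which error FreeBSD actually returns in that case is not stated in this thread, so the failure is simulated with a stand-in function. All names below are hypothetical and this is not LSAPI's code.

```python
import os


def alive_by_signal(pid, kill=os.kill):
    """Probe with signal 0: nothing is delivered, only the lookup and permission check run."""
    try:
        kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: the process does not exist, or is not visible to the caller.
        return False
    except PermissionError:
        # EPERM: the process exists but the caller may not signal it.
        return True
    return True


def alive_by_getppid(expected_ppid, getppid=os.getppid):
    """Probe by comparing the current parent PID with the one seen at startup."""
    return getppid() == expected_ppid


def hidden_parent_kill(pid, sig):
    """Stand-in for kill() when the parent is invisible to this user (assumption)."""
    raise ProcessLookupError("simulated: parent not visible")


if __name__ == "__main__":
    ppid = os.getppid()
    print("signal probe, normal system:   ", alive_by_signal(ppid))
    print("signal probe, parent hidden:   ", alive_by_signal(ppid, kill=hidden_parent_kill))
    print("getppid probe, same situation: ", alive_by_getppid(ppid))
```

If the signal probe wrongly reports a live parent as dead, workers would exit early, which is the usual reason to disable the check; the getppid() probe asks only about the caller's own parent and does not need to look up another user's process.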


All times are GMT -7.