
  #1  
Old 11-20-2006, 03:32 PM
fantasydreaming fantasydreaming is offline
Senior Member
 
Join Date: Sep 2006
Posts: 76
lsapi processes not being used, build up to > max_connections

Even more alarmingly, the stats below show all 10 of my lsapi processes as 'in use' (on a fairly busy server), but when I strace them, they're all in select(11...)

Scope  Type   Name        Max CONN  Eff Max  Pool  In Use  Idle  WaitQ  Req/Sec
ap     LSAPI  Rails:ap:/  10        10       10    10      0     8      14

Scope  Type   Name        Max CONN  Eff Max  Pool  In Use  Idle  WaitQ  Req/Sec
ap     LSAPI  Rails:ap:/  10        10       10    10      0     27     0

From what I can tell with strace, only one of the lsapi processes appears to be doing anything.

Additionally, maxconns is 10, but there are 22 running:
[root@rhyme data]# ps auxww | grep RailsRunner | wc -l
22

Any ideas? It's generally very fast, but it can slow down as requests build up during busy times. I recently switched from lighttpd to LiteSpeed and am very happy overall, but I'm worried about this.

MySQL shows no slow queries, and server load is only 0.4.

Thank you,
Kevin
  #2  
Old 11-20-2006, 03:46 PM
fantasydreaming fantasydreaming is offline
Senior Member
 
Join Date: Sep 2006
Posts: 76
I should add that one of them seems stuck in nanosleep() rather than the expected select():

nanosleep({0, 10000000}, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
  #3  
Old 11-20-2006, 04:10 PM
fantasydreaming fantasydreaming is offline
Senior Member
 
Join Date: Sep 2006
Posts: 76
Two other kinds of sleep:

Fresh processes (these still look right):
select(1, [0], NULL, NULL, {0, 433000}) = 0 (Timeout)
kill(27359, SIG_0) = 0
select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
kill(27359, SIG_0) = 0
select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
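
A note on the kill(27359, SIG_0) lines in the trace above: signal 0 delivers nothing and only asks the kernel whether the target process exists, so these calls are presumably the worker polling whether another process (most likely its parent) is still running. A minimal Ruby sketch of the same check follows; the helper name process_alive? is hypothetical and not part of LSAPI.

```ruby
# Hypothetical helper mirroring the kill(pid, SIG_0) calls seen in strace.
# Signal 0 performs the existence and permission checks without actually
# delivering a signal to the target process.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  # No process with this pid exists.
  false
rescue Errno::EPERM
  # The process exists but belongs to another user.
  true
end

puts process_alive?(Process.pid)
```

A worker that performs this check periodically can exit on its own once the process it watches has gone away, which would fit the observation that the workers still issuing kill() are the ones that restart cleanly.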



Another one. I'm not sure what this is, but perhaps it's just something inside my application. There's more gettimeofday() going on:

gettimeofday({1164067094, 613573}, NULL) = 0
gettimeofday({1164067094, 613617}, NULL) = 0
select(8, [3 7], [], [], {0, 999956}) = 0 (Timeout)
gettimeofday({1164067095, 612726}, NULL) = 0
select(8, [3 7], [], [], {0, 847}) = 0 (Timeout)
gettimeofday({1164067095, 613680}, NULL) = 0
select(8, [3 7], [], [], {0, 0}) = 0 (Timeout)
kill(27360, SIG_0) = 0
gettimeofday({1164067095, 613777}, NULL) = 0
gettimeofday({1164067095, 613805}, NULL) = 0
select(8, [3 7], [], [], {0, 999971} <unfinished ...>

When I ran lswsctrl restart earlier, it didn't kill off the lsapi processes that were stuck in select() without the periodic kill() call. They must be more or less crashed, but the 'max workers' check isn't picking up on it, and I have to kill -5 them to make them go away.

It doesn't seem to happen for a while after starting the server. Restarting the server fairly early in its life results in all the processes being killed and re-created as expected.
  #4  
Old 11-22-2006, 07:22 PM
mistwang mistwang is offline
LiteSpeed Staff
 
Join Date: May 2003
Location: New Jersey
Posts: 7,590
Is the server doing OK now?
  #5  
Old 11-23-2006, 06:36 AM
fantasydreaming fantasydreaming is offline
Senior Member
 
Join Date: Sep 2006
Posts: 76
Nope

They apparently all crashed this morning resulting in some short downtime.

My server error log has things like this in it, though I think they're normal:

2006-11-22 17:58:35.754 [INFO] [218.185.94.226:16092-0#ap] Connection idle time: 16 while in state: 5 watching for event: 25,close!
2006-11-22 17:58:35.754 [INFO] [218.185.94.226:16092-0#ap] Content len: 0, Request line:
GET /poem/add HTTP/1.1
2006-11-22 17:58:35.754 [INFO] [218.185.94.226:16092-0#ap] Redirect: #1, URL: /dispatch.cgi
2006-11-22 17:58:35.754 [INFO] [218.185.94.226:16092-0#ap] HttpExtConnector state: 8, request body sent: 0, response body size: 0, response body sent:0, left in buffer: 0, attempts: 0.
2006-11-22 17:58:45.954 [INFO] [72.75.105.178:60212-0#ap] Connection idle time: 16 while in state: 5 watching for event: 25,close!
2006-11-22 17:58:45.954 [INFO] [72.75.105.178:60212-0#ap] Content len: 1555097, Request line:
POST /user/face HTTP/1.1
2006-11-22 17:58:45.954 [INFO] [72.75.105.178:60212-0#ap] Redirect: #1, URL: /dispatch.cgi
2006-11-22 17:58:45.954 [INFO] [72.75.105.178:60212-0#ap] HttpExtConnector state: 10, request body sent: 131072, response body size: 0, response body sent:0, left in buffer: 0, attempts: 0.

Also, a 'graceful restart' sends the iowait on my server through the roof and pushes load up to 30. I'm not sure whether I have a bad SCSI cable or something, but it may have to do with the Rails processes stuck in the 'bad' select(). If I killall -5 ruby and lshttpd, then start the server, it's quick and painless. Likewise, a graceful restart fairly soon after a fresh start (i.e. before any 'bad' select() calls appear) seems to work fine.

Note: I'm running this as a user other than 'nobody'. I'm not sure whether that could have any impact.

Please let me know if there's any debug output or log details I can give you that would help!
  #6  
Old 11-23-2006, 12:10 PM
mistwang mistwang is offline
LiteSpeed Staff
 
Join Date: May 2003
Location: New Jersey
Posts: 7,590
You need to find out exactly what causes the bad select(). I think it is in Ruby or your Rails app, not in the LSAPI code.

You can try "lsof", or start "strace" at the beginning of a request.

Ruby always resumes a function call if it is interrupted (EINTR) by a signal, so sometimes it becomes very difficult to kill a Ruby process in the normal way.
  #7  
Old 12-01-2006, 08:43 AM
fantasydreaming fantasydreaming is offline
Senior Member
 
Join Date: Sep 2006
Posts: 76
This was caused by a search using the ruby-google module. Apparently http-access and http-access2 can both get stuck waiting 'forever' for a response from Google (or likely from any host).

The solution for me was to wrap the call in a Timeout::timeout(5) do ... end block.
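
For anyone who finds this thread later, here is a minimal sketch of that kind of wrapper. It uses only Ruby's standard timeout library; the helper name with_deadline and the placeholder body are hypothetical, since the actual ruby-google call is not shown in this thread.

```ruby
require 'timeout'

# Hypothetical helper: run a block that may hang (for example an outbound
# HTTP search request) and return `fallback` if it takes longer than
# `seconds` to finish.
def with_deadline(seconds, fallback = nil)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  fallback
end

results = with_deadline(5, []) do
  # The real outbound search call would go here; this placeholder just
  # simulates a fast response.
  sleep 0.1
  ["hit"]
end
puts results.inspect
```

Returning a fallback value lets the request finish with empty results instead of holding an lsapi worker indefinitely. Whether an empty result is acceptable depends on the application.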

I was able to debug it very nicely with gdb by following the instructions here: http://eigenclass.org/hiki.rb?ruby+l...+introspection
  #8  
Old 12-01-2006, 08:54 AM
mistwang mistwang is offline
LiteSpeed Staff
 
Join Date: May 2003
Location: New Jersey
Posts: 7,590
Cool! I will add that to our Wiki. Thanks!
  #9  
Old 12-01-2006, 08:59 AM
fantasydreaming fantasydreaming is offline
Senior Member
 
Join Date: Sep 2006
Posts: 76
Glad I could help

One other place where I had problems with hanging was wherever I resolved IP addresses to DNS names. Maybe it was just being glacial, but I'd sometimes get an 'execution timeout expired' error there as well.
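
The same idea can be applied to reverse DNS lookups. The sketch below is an assumption about how one might do it, not what the application in this thread actually runs: it uses Resolv.getname from Ruby's standard library, and the helper name hostname_for and its injectable resolver argument are hypothetical (the latter exists only so the helper can be exercised without a network).

```ruby
require 'resolv'
require 'timeout'

# Hypothetical helper: reverse-resolve an IP address, returning nil when the
# lookup fails or takes longer than `seconds`. `resolver` defaults to
# Resolv.getname and can be replaced with any callable for testing.
def hostname_for(ip, seconds = 2, resolver = Resolv.method(:getname))
  Timeout.timeout(seconds) { resolver.call(ip) }
rescue Timeout::Error, Resolv::ResolvError
  nil
end

# Example with a stub resolver, so no real DNS query is made here.
puts hostname_for("192.0.2.1", 1, ->(_ip) { "host.example" })
```

Treating a slow lookup the same as a failed one (nil) keeps the request moving; the caller can fall back to displaying the raw IP address.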
All times are GMT -7.