FastCGI spawning more processes

#1
Hello,

We are running LiteSpeed 2.x (and have tested with 3.x) with FastCGI. Our application uses one FastCGI script that serves AJAX responses (it checks the IMAP server, which can be slow from time to time). The problem is that we have set the soft and hard process limits like this:

Code:
    <extProcessor>
      <type>fcgi</type>
      <name>checkmail</name>
      <address>UDS://tmp/lshttpd/fcgi/checkmail</address>
      <maxConns>30</maxConns>
      <initTimeout>30</initTimeout>
      <retryTimeout>0</retryTimeout>
      <persistConn>0</persistConn>
      <pcKeepAliveTimeout></pcKeepAliveTimeout>
      <respBuffer>1</respBuffer>
      <autoStart>1</autoStart>
      <path>/www/checkmail.fcgi</path>
      <backlog>10</backlog>
      <instances>3</instances>
      <runOnStartUp>0</runOnStartUp>
      <extMaxIdleTime>120</extMaxIdleTime>
      <priority></priority>
      <memSoftLimit>1024000000</memSoftLimit>
      <memHardLimit>2048000000</memHardLimit>
      <procSoftLimit>20</procSoftLimit>
      <procHardLimit>40</procHardLimit>
    </extProcessor>
The problem is that neither the soft nor the hard process limit is enforced at peak time. We don't need this script to be ultra-fast, but since it consumes a lot of memory (it's a Perl script), running 160 of these processes is quite a big problem.

How do I tell lsws to never ever run more than procHardLimit processes?

I currently do this by setting max-lwp for the lshttpd project, so once more than 130 processes are running under LiteSpeed, additional forks fail with "Resource temporarily unavailable". But since this LiteSpeed server serves other scripts too, this takes the whole site down at peak time (as opposed to taking the whole machine down with swapping and I/O, so it's an improvement).

Please let me know if you need additional information.
 

mistwang

LiteSpeed Staff
#2
Can checkmail.fcgi fork itself to handle multiple requests? If not, you need to set "maxConns" to match "instances".

procSoftLimit and procHardLimit should not be used for that purpose; you should use "maxConns" and "instances" to control how many fcgi processes are used. When you set the process limit too low, LSWS has to start the fcgi again and again because of failures caused by the process limit.

If you are trying the Enterprise edition, it is a 2-CPU license, so the maximum number of processes that can be started is twice the value of "instances".
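
A minimal sketch of that suggestion, assuming the checkmail extProcessor from post #1 with only these two values changed. The value 3 is illustrative (it is simply the existing "instances" value); the other settings are left as they were.

```xml
<extProcessor>
  <type>fcgi</type>
  <name>checkmail</name>
  <!-- other settings unchanged from post #1 -->
  <maxConns>3</maxConns>    <!-- matches <instances>, since the script does not handle multiple requests itself -->
  <instances>3</instances>  <!-- with a 2-CPU license, up to 2 x 3 = 6 processes may be started -->
</extProcessor>
```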
 
#3
did not help

It's the Enterprise version, so I set instances and maxConns to 10 (which should be less than the process limit). Within a few seconds, I got this:

# pgrep checkmail|wc -l
119

For now I have worked around the problem so that things keep running while I look for a solution: I installed Sun Web server, which starts only as many processes as are required and permitted, and I am using lsws as a reverse proxy in front of it. But I would like to solve this properly rather than work around it.

I cannot change the checkmail.fcgi script, as I am only the system administrator, but I have suggested this to the development team before (I even think it would be better to make it a standalone HTTP server using Perl's Net:HTTP::Daemon and use lsws as a reverse proxy for each script).
 

mistwang

LiteSpeed Staff
#4
All processes running as that user are counted against the process limit, so "soft limit"=20 is too low and will definitely cause trouble. You can check lsws/logs/error.log and stderr.log.
 
#5
That did not help either. I set the process soft limit to 250 and the process hard limit to 300. I have 53 running processes started from lshttpd (not counting checkmail.fcgi, but counting lshttpd itself) and instances set to 10, yet it still forks more processes. stderr is full of "fork: Resource temporarily unavailable", since it reaches the process limit (project.max-lwp=180 for that project). It should fork 20 processes (instances and maxConns = 10, doubled because it's the Enterprise version), but it only stops at 180 processes (normally it should be around 50 + 20, so well under 100).
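
The expected ceiling can be checked with a little shell arithmetic. This is only a sketch using the figures quoted in this thread: the variable names are made up for illustration, the factor of 2 is the 2-CPU license multiplier mentioned in post #2, and it assumes each process counts as a single LWP against max-lwp.

```shell
# Figures quoted in this thread; variable names are illustrative only.
other_procs=53    # processes started from lshttpd, excluding checkmail.fcgi
instances=10      # "instances" (and maxConns) for checkmail
license_cpus=2    # 2-CPU Enterprise license: up to twice "instances"
max_lwp=180       # project.max-lwp for the lshttpd project

# Assumes one LWP per process, which may not hold for threaded programs.
expected=$(( other_procs + license_cpus * instances ))
headroom=$(( max_lwp - expected ))
echo "expected total: $expected, headroom under max-lwp: $headroom"
```

With these numbers the expected total is 73, so hitting 180 means roughly a hundred more processes exist than the configuration should allow.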
 

mistwang

LiteSpeed Staff
#6
Does checkmail.fcgi itself fork, or are all of them forked by LSWS? You can check the parent PID of checkmail.fcgi. If a request takes longer than 30 seconds to process, you need to increase the initial timeout.
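
One way to check the parent PID is to group the matching processes by PPID. The helper below is a sketch and not part of LSWS: count_by_parent is a made-up name, and it assumes a ps that accepts "-e -o pid,ppid,comm" and shows the script name in the command column.

```shell
# Group processes by parent PID and count them.
# Reads "PID PPID COMMAND" lines (with a header line) on stdin, for example:
#   ps -e -o pid,ppid,comm | count_by_parent checkmail
# Prints one "PPID COUNT" line per parent, sorted by PPID.
count_by_parent() {
    awk -v pat="$1" '
        NR > 1 && $3 ~ pat { n[$2]++ }           # skip header, match command name
        END { for (p in n) print p, n[p] }
    ' | sort -n
}
```

If every checkmail.fcgi was forked by LSWS, all of the counts should appear under lshttpd PIDs; a checkmail.fcgi PID showing up as a parent would mean the script forks itself.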
 
#7
checkmail.fcgi does not fork; all of them are forked by lsws. Requests do not take longer than 30 seconds (lsws starts forking after a few seconds).
 

mistwang

LiteSpeed Staff
#8
Have you ever run strace on those checkmail.fcgi processes? LSWS should not use more than 20 of them, so the remaining instances are probably in a malfunctioning state, perhaps deadlocked.

Do the fcgi processes have to be killed with "-9", or will a normal "-TERM" do?
Do checkmail.fcgi processes get killed when LSWS restarts?

Under the current architecture, LSWS cannot tell which process is bad and kill it. We will make some architecture changes to make that possible, but it will take a little while.
 