Rails MySQL disconnects

#1
I have a strange problem that I hope you can help solve.

I have 18 web servers running Litespeed 3.1.1 standard 32bit. All web servers are started and stopped at the same time via a capistrano recipe.

I have my rails application email me any uncaught exceptions. About every 20 to 30 minutes I get a mass of exceptions reporting "Mysql::Error: Lost connection to MySQL server during query", or something very similar.

On the mysql servers I get this:

Aborted connection 11221 to db: 'xxxxx' user: 'xxxxx' host: 'xxx.xxxxxxx.internal' (Got timeout reading communication packets)

I was getting these messages constantly before I upped the following values to this:

LSAPI_MAX_REQS=5000
LSAPI_MAX_IDLE=180

I now just get them every 20 to 30 minutes or so.

The really strange thing is that all the disconnects happen at pretty much the same time on all of the servers, and I think I am getting 1 error for every fcgi process, though I am not sure how to confirm this.
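One way to check the "1 error for every fcgi process" guess is to count exception reports per host per minute and compare each burst with the number of worker processes on that host. The following is only a rough sketch: the host names, timestamps, and the `expected_workers` figures are made up for illustration, and the reports would in practice have to be parsed from the exception emails or logs.

```ruby
require "time"

# Hypothetical exception reports as [host, timestamp] pairs; in practice
# these would be parsed from the exception emails or the application logs.
SAMPLE_REPORTS = [
  ["web01", "2008-01-15 10:20:03"],
  ["web01", "2008-01-15 10:20:07"],
  ["web01", "2008-01-15 10:20:41"],
  ["web02", "2008-01-15 10:20:05"],
  ["web02", "2008-01-15 10:20:12"],
  ["web01", "2008-01-15 10:45:30"]
].freeze

# Count reports per [host, minute] bucket.
def errors_per_host_per_minute(reports)
  counts = Hash.new(0)
  reports.each do |host, stamp|
    minute = Time.parse(stamp).strftime("%Y-%m-%d %H:%M")
    counts[[host, minute]] += 1
  end
  counts
end

# Keep only the buckets whose count equals the expected number of worker
# processes on that host (expected_workers is an assumed per-host figure).
def bursts_matching_workers(counts, expected_workers)
  counts.select { |(host, _minute), n| n == expected_workers[host] }
end

counts = errors_per_host_per_minute(SAMPLE_REPORTS)
counts.sort.each { |(host, minute), n| puts "#{host} #{minute} #{n}" }
```

If most bursts line up with the per-host process count, that supports the idea that every process loses its connection at once.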

I am guessing that LiteSpeed is reaping the fcgi processes at some interval, and that they are not closing their MySQL connections properly. That would explain why it happens on all the servers at the same time: all servers are restarted together by a script when we push code.

So, what happens is we get lots of reports from users and customers that they see our default error page.

I have done a lot of testing, first assuming this was a database issue. Finally, yesterday I ran some long-running tests on one server, targeting one database. I then stopped LiteSpeed, fired up 4 mongrels and repeated the tests (about 2 hours, hitting each mongrel once per second). With the mongrels I received no errors. So at this point I am presuming it has something to do with LiteSpeed and the fcgi processes.

Any information or help is much appreciated.

Tim
 

mistwang

LiteSpeed Staff
#3
ruby-lsapi has a built-in process manager that dynamically starts and stops child ruby processes, and the lost DB connection exception might be thrown when a child process exits. However, that only happens when the ruby process is idle and waiting for the next request, so there should not be any active SQL query in progress unless a background ruby thread is running one. This should not cause any trouble.

Mongrel does not do dynamic spawning, so this issue does not happen there.

To avoid this exception, you can try letting the ruby-lsapi child processes live as long as possible by increasing the values of the two environment variables you mentioned, and by setting "LSAPI_MAX_IDLE_CHILDREN" to the same value as "Max Connection" in the Rails application configuration.
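As a rough illustration of the last point, a small check like the one below could flag a mismatch between "LSAPI_MAX_IDLE_CHILDREN" and the "Max Connection" value. This is only a sketch: the helper name and the example values are hypothetical, and the "Max Connection" number has to be supplied by hand from the LiteSpeed configuration.

```ruby
# Compare LSAPI_MAX_IDLE_CHILDREN with the "Max Connection" value
# configured for the Rails application in LiteSpeed.
# env is any Hash-like object (ENV works); max_connections is an Integer
# copied by hand from the LiteSpeed configuration.
def lsapi_idle_children_warnings(env, max_connections)
  raw = env["LSAPI_MAX_IDLE_CHILDREN"]
  return ["LSAPI_MAX_IDLE_CHILDREN is not set"] if raw.nil?
  return [] if raw.to_i == max_connections

  ["LSAPI_MAX_IDLE_CHILDREN=#{raw} differs from Max Connection=#{max_connections}"]
end

# Hypothetical environment for illustration; the first two values are the
# ones quoted earlier in the thread, the third is made up.
example_env = {
  "LSAPI_MAX_REQS" => "5000",
  "LSAPI_MAX_IDLE" => "180",
  "LSAPI_MAX_IDLE_CHILDREN" => "5"
}
puts lsapi_idle_children_warnings(example_env, 10)
```

Passing `ENV` instead of `example_env` would check the environment of the current process.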
 

nathanc

Well-Known Member
#4
I've had a similar problem. My sites were getting MySQL connection issues, and when I restarted the mongrels, the MySQL connection was fine once again. It turns out the MySQL driver doesn’t properly time out connections. You could do this:
ActiveRecord::Base.verification_timeout = 14500

Or set it to any other value that is lower than the MySQL server’s interactive_timeout setting.
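A minimal sketch of picking such a value is shown below. It assumes the server's timeouts have already been looked up (for example with `SHOW VARIABLES LIKE '%timeout'`) and typed into a hash. The helper name and the 60-second margin are made up for illustration. Besides interactive_timeout, MySQL also has wait_timeout, which as far as I know is the one applied to non-interactive clients such as an application server, so the sketch takes the smaller of the two.

```ruby
# Pick a verification interval (in seconds) that sits safely below the
# smallest idle timeout configured on the MySQL server.
# server_timeouts maps variable names to seconds; margin is an assumed
# safety gap, not a recommended value.
def safe_verification_timeout(server_timeouts, margin = 60)
  limit = server_timeouts.values.min
  raise ArgumentError, "margin must be smaller than #{limit}" if margin >= limit

  limit - margin
end

# 28800 seconds is the MySQL default for both variables; replace these
# with the values reported by your own server.
timeouts = { "wait_timeout" => 28_800, "interactive_timeout" => 28_800 }
puts safe_verification_timeout(timeouts)

# In the Rails app, the result would then be assigned as in the line above:
# ActiveRecord::Base.verification_timeout = safe_verification_timeout(timeouts)
```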

Alternatively, you can use the native MySQL driver instead of the pure-Ruby one.
I did the following (as root):
# up2date mysql mysql-devel
# gem install -r mysql -- --with-mysql-config=/usr/bin/mysql_config

Cheers
 