Premature end of response header, and 503 service errors (maybe related, maybe not)

Discussion in 'Bug Reports' started by fantasydreaming, Dec 8, 2006.

  1. fantasydreaming

    fantasydreaming New Member

    I'm getting tons of these, and I can't figure out why. Several per second when it's happening, which isn't always.

    2006-12-07 20:19:27.370 [NOTICE] [136.182.158.153:61844-0#ap] Premature end of response header.
    2006-12-07 20:19:27.709 [NOTICE] [24.171.94.109:2024-0#ap] Premature end of response header.
    2006-12-07 20:19:27.822 [NOTICE] [69.246.14.128:2349-0#ap] Premature end of response header.

    Also, when my rails application attempts to send a 404 header, litespeed sends a 503 header and doesn't log anything about why to it's stderr or error logfiles.

    The logfile shows that a 200 response was sent - wierd!
    67.171.73.40 - - [08/Dec/2006:11:07:26 -0600] "GET /asdfasdf2 HTTP/1.1" 200 - "-" "Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.0.4) Gecko/20060613 Camino/1.0.2"

    Curl shows the 503 header
    everybean:~/ap inspire$ curl --head http://allpoetry.com/asdfasdf2
    HTTP/1.1 503 Service Unavailable
    Server: LiteSpeed
    Date: Fri, 08 Dec 2006 17:08:06 GMT
    Connection: close

    Lightppd at the localhost (sorry, rails' default), shows the proper header
    everybean:~/ap inspire$ curl --head http://localhost:3000/asdfasdf2
    HTTP/1.1 404 Not Found
    Connection: close
    Date: Fri, 08 Dec 2006 17:07:36 GMT
    Content-Type: text/html; charset=iso-8859-1
    Set-Cookie: _session_id=blah; path=/; expires=Fri, 05 Jan 2007 16:29:51 GMT
    Cache-Control: no-cache
    Expires: -1
    Server: lighttpd/1.3.13

    No core files, running 2.2.5 std, though the same happens on my other server running 2.2.6

    Thanks for any help!
  2. fantasydreaming

    fantasydreaming New Member

    While testing the 404 -> 503 problems, no "premature response" messages were generated in the log, so likely they're different problems, but you never know.
  3. mistwang

    mistwang LiteSpeed Staff

    Premature end of response header probably is caused by crashed ruby process, which did not send back the response header or just half of it.
    For 404 -> 503 problem, I don't know why it happens, maybe something wrong with that the way a routing error was handled in your application. Create a test Rails application and give it try.
  4. fantasydreaming

    fantasydreaming New Member

    1) Shouldn't crashed ruby processes show up in the error log?

    2) The 404 is returned fine under lighttpd and webrick... it's only when litespeed is in the steps that it's a problem.

    Exact same application, webrick @ port 3000, litespeed at port 80
    everybean:~/storywrite inspire$ curl --head http://allpoetry.com:3000/testas3
    HTTP/1.1 404 Not Found
    Cache-Control: no-cache
    Connection: Keep-Alive
    Expires: -1
    Date: Fri, 08 Dec 2006 22:24:25 GMT
    Content-Type: text/html; charset=iso-8859-1
    Server: WEBrick/1.3.1 (Ruby/1.8.5/2006-08-25)
    Content-Length: 0
    Set-Cookie: _session_id=bb2ed7dcf338e4dd1ccb730eb570b012; path=/; expires=Fri, 05 Jan 2007 22:24:05 GMT

    everybean:~/storywrite inspire$ curl --head http://allpoetry.com/testas3
    HTTP/1.1 503 Service Unavailable
    Server: LiteSpeed
    Date: Fri, 08 Dec 2006 22:24:34 GMT
    Accept-Ranges: bytes
    Connection: close
    ETag: "dad-45799445-4ed154"
    Last-Modified: Fri, 08 Dec 2006 16:35:17 GMT
    Content-Type: text/html
    Content-Length: 3501
  5. mistwang

    mistwang LiteSpeed Staff

    We need more information to figure out what exactly causing the problem.

    Please try turn on LSWS debug logging by changing debug level to "high", post the section of error.log for the failed request.

    Strace the ruby process. First send a good request to let the ruby process start, then do "strace -p <pid_of_ruby>", trace the process with larger PID. send a request result in 503 error, post the strace result.

    check rails' log files.
  6. mistwang

    mistwang LiteSpeed Staff

    Are you using a 404 error page? Is it a static file or handled by rails as well. That could be the cause of 503 problem.
  7. fantasydreaming

    fantasydreaming New Member

    Rails renders the static 404 page with :status => 404... litespeed can handle you setting status codes, right? I don't see why not.

    I'll check the high log level setting tomorrow morning, thanks
  8. fantasydreaming

    fantasydreaming New Member

    Short answer: The rails log shows it successfully processed. The strace of the Railsrunner shows it trying and closing the socket 3 times, and the web server agrees. I think perhaps it's something wrong with the LSAPI hating it when yous end a Status: header with the result?

    LITESPEED LOG

    2006-12-09 09:50:06.521 [DEBUG] [127.0.0.1:64798-0#storywrite] New request:
    Method=[HEAD], URI=[/testas4],
    QueryString=[]
    Content Length=0
    REWRITE] Rule: Match '/testas4' with pattern '^.*$', result: 1
    REWRITE] stat( /Users/inspire/Projects/storywrite/public/../../down.html ) failed
    REWRITE] Rule: Match '/testas4' with pattern '^$', result: -1
    REWRITE] Rule: Match '/testas4' with pattern '^([^.]+)$', result: 2
    REWRITE] Source URI: '/testas4' => Result URI: '/testas4.html'
    REWRITE] Rule: Match '/testas4.html' with pattern '^(.*)$', result: 2
    REWRITE] stat( /Users/inspire/Projects/storywrite/public/testas4.html ) failed
    REWRITE] Source URI: '/testas4.html' => Result URI: 'dispatch.cgi'
    REWRITE] Last Rule, stop!
    Find context with URI: [/dispatch.cgi], location: []
    processContextPath() return 0
    run lsapi processor.
    request [127.0.0.1:64798-0#storywrite:lsapi] is assigned with connection!
    [ExtConn] reconnect()
    [ExtConn] connecting to [uds://tmp/lshttpd/storywrite:_.sock]...
    ExtConn::eek:nWrite()
    request header is done
    ExtConn::continueRead()
    Request body done!
    ExtConn::suspendWrite()
    processNewReq() return 0.
    ExtConn::eek:nRead()
    LsapiConn::doRead()
    process packet header 8 bytes
    process packet header 0 bytes
    ExtConn::eek:nRead()
    LsapiConn::doRead()
    process packet header -1 bytes
    connection to [uds://tmp/lshttpd/storywrite:_.sock] on request #0, error: Connection reset by peer!
    [ExtConn] close()
    No Request has been processed successfully through this connection, the maximum connections allowed will be reduced!
    HttpExtConnector::tryRecover()...
    trying to recover from connection problem, attempt: #1!
    request [127.0.0.1:64798-0#storywrite:lsapi] is assigned with connection!
    [ExtConn] reconnect()
    [ExtConn] connecting to [uds://tmp/lshttpd/storywrite:_.sock]...
    ExtConn::eek:nWrite()
    request header is done
    ExtConn::continueRead()
    Request body done!
    ExtConn::suspendWrite()
    ExtConn::eek:nRead()
    LsapiConn::doRead()
    process packet header 8 bytes
    process packet header 0 bytes
    ExtConn::eek:nRead()
    LsapiConn::doRead()
    process packet header -1 bytes
    connection to [uds://tmp/lshttpd/storywrite:_.sock] on request #0, error: Connection reset by peer!
    [ExtConn] close()
    No Request has been processed successfully through this connection, the maximum connections allowed will be reduced!
    HttpExtConnector::tryRecover()...
    trying to recover from connection problem, attempt: #2!
    request [127.0.0.1:64798-0#storywrite:lsapi] is assigned with connection!
    [ExtConn] reconnect()
    [ExtConn] connecting to [uds://tmp/lshttpd/storywrite:_.sock]...
    ExtConn::eek:nWrite()
    request header is done
    ExtConn::continueRead()
    Request body done!
    ExtConn::suspendWrite()
    ExtConn::eek:nRead()
    LsapiConn::doRead()
    process packet header 8 bytes
    process packet header 0 bytes
    o such file or directory
    o such file or directory
    ExtConn::eek:nRead()
    LsapiConn::doRead()
    process packet header -1 bytes
    connection to [uds://tmp/lshttpd/storywrite:_.sock] on request #0, error: Connection reset by peer!
    [ExtConn] close()
    No Request has been processed successfully through this connection, the maximum connections allowed will be reduced!
    HttpExtConnector::tryRecover()...
    HttpIOLink::continueWrite()...
    write resumed!
    HttpIOLink::handleEvents() events=4!
    HttpIOLink::suspendWrite()...
    HttpConnection::sendHttpError(),code=503 Service Unavailable
    HttpConnection::flush()!
    Written to client: 111
    HttpConnection::nextRequest()!
    HttpExtConnector::cleanUp() ...
    abort request...
    release ExtProcessor!
    Non-KeepAlive, CLOSING!
    Shutting down out-bound socket ...
    HttpIOLink::handleEvents() events=17!
    Close socket ...

    ******* STRACE OF RAILSRUNNER

    select(1, [0], NULL, NULL, {1, 0}) = 1 (in [0], left {0, 525000})
    --- SIGCHLD (Child exited) @ 0 (0) ---
    wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], WNOHANG|WSTOPPED, NULL) = 8839
    wait4(-1, 0x7fbfffd454, WNOHANG|WSTOPPED, NULL) = 0
    rt_sigreturn(0xffffffffffffffff) = 1
    nanosleep({0, 10000}, NULL) = 0
    select(1, [0], NULL, NULL, {0, 0}) = 1 (in [0], left {0, 0})
    accept(0, {sa_family=AF_FILE, path="??<"}, [18051271735726047234]) = 3
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2a955619d0) = 9153
    close(3) = 0
    select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
    kill(8492, SIG_0) = 0
    select(1, [0], NULL, NULL, {1, 0}) = ? ERESTARTNOHAND (To be restarted)
    --- SIGCHLD (Child exited) @ 0 (0) ---
    wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], WNOHANG|WSTOPPED, NULL) = 9153
    wait4(-1, 0x7fbfffd454, WNOHANG|WSTOPPED, NULL) = 0
    rt_sigreturn(0xffffffffffffffff) = -1 EINTR (Interrupted system call)
    select(1, [0], NULL, NULL, {0, 953000}) = 1 (in [0], left {0, 575000})
    nanosleep({0, 10000}, 0) = ? ERESTART_RESTARTBLOCK (To be restarted)
    --- SIGCHLD (Child exited) @ 0 (0) ---
    wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], WNOHANG|WSTOPPED, NULL) = 8502
    wait4(-1, 0x7fbfffd474, WNOHANG|WSTOPPED, NULL) = -1 ECHILD (No child processes)
    rt_sigreturn(0xffffffffffffffff) = -1 EINTR (Interrupted system call)
    select(1, [0], NULL, NULL, {0, 0}) = 1 (in [0], left {0, 0})
    accept(0, {sa_family=AF_FILE, path="??<"}, [18051271735726047234]) = 3
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2a955619d0) = 9154
    close(3) = 0
    select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
    kill(8492, SIG_0) = 0
    select(1, [0], NULL, NULL, {1, 0}) = ? ERESTARTNOHAND (To be restarted)
    --- SIGCHLD (Child exited) @ 0 (0) ---
    wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], WNOHANG|WSTOPPED, NULL) = 9154
    wait4(-1, 0x7fbfffd454, WNOHANG|WSTOPPED, NULL) = -1 ECHILD (No child processes)
    rt_sigreturn(0xffffffffffffffff) = -1 EINTR (Interrupted system call)
    select(1, [0], NULL, NULL, {0, 934000}) = 0 (Timeout)

    ****** RAILS LOG ********


    Processing UserController#show (for 67.171.73.40 at 2006-12-09 09:17:35) [HEAD]
    Session ID: 0ca5e149498d890db933851b193e6421
    Parameters: {"action"=>"show", "id"=>"testas6", "controller"=>"user"}
    *application.rb:124 reloading vars, last was 2 hours ago
    MemCache Get online
    MemCache Get user/show/testas6
    User Load (0.000910) SELECT * FROM users WHERE (name='testas6') LIMIT 1
    Namechange Load (0.000369) SELECT * FROM ap.namechanges WHERE (start='testas6') ORDER BY created_at desc LIMIT 1
    Rendering public/404.html (404)
    MemCache Put session:0ca5e149498d890db933851b193e6421 expires 0
    UserController: missing default helper path user_helper
    MemCache Get session:bda40d81bf8501de7187f7beb6b71e3f
  9. mistwang

    mistwang LiteSpeed Staff

    The lsws log shows that, ruby just simply closed the connection, not sending anything back.

    The ruby process you straced is the parent ruby process which does nothing but accept a connection and fork a child ruby process to handle the request. I really interested is what happened in the child process. I think it crashed, probably.

    Is the rails log from litespeed or from other setup?

    Are you setup your Rails app with the easy configuration? or step by step manually?
    You are using rewrite rules to dispatch the request, have you tried using 404 handler?

    Let me know your exact rails setup, I will try to duplicate it in our lab. :)
  10. fantasydreaming

    fantasydreaming New Member

    I'm using the "easy" setup. I could try switching to the manual one... So it's 404 handler is automatically handling the ruby process? Maybe it's angry that I'm returning 404 again, so it tries to send it back to the 404 handler, which sends it there again ... etc.

    Rails "suexec virtual host" template included, I turned off "restricted, changed the user/group to a new user, and added the rewrite rules:
    RewriteRule ^$ index.html [QSA]
    RewriteRule ^([^.]+)$ $1.html [QSA]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ dispatch.cgi [QSA,L]

    How would I get the PID of the actual ruby process? Does it have the same name as "ruby /opt/lsws/fcgi-bin/RailsRunner.rb"? Each "process" really starts two of these? It thought it was the web server that handled the forking/etc...?
  11. mistwang

    mistwang LiteSpeed Staff

    I tried your configuration, proper 404 page was returned.

    LSWS only start one ruby process, this process will fork children process to handle all requests. This way, the startup cost of RoR framework is the lowest.

    To get PID of the children process, send a normal request without resulting in 503 first, so the child process will be forked already, you should see two "ruby ...RailsRunner.rb" process at this moment, the child process should have a higher PID number. strace that process.

    And check if the child process disappeared after processing the 404 request, it should not. if it does, means it crashed for some reason.
  12. fantasydreaming

    fantasydreaming New Member

    Solved!

    It appears that the problem was that I was being stupid (doh!) and using a relative url to render the 404 file, "../../public/404.html". Switching it to "#{RAILS_ROOT}/public/404.html" solved the problem.

    The parent-fork thing must change the working directory, which caused the discrepancy between litespeed and other servers. It might be better to reset the cmd if possible, just to keep it consistent.

    Stracing the proper pid let me see that it was getting the error, though i'm still not sure why 1) rails crashes like that, or 2) litespeed shows it as a 503 rather than 500. It's not a timeout, it happens right away...

    write(1, "Rendering public/404.html (404)\n", 32) = 32
    open("public/404.html", O_RDONLY) = -1 ENOENT (No such file or directory)
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    getcwd("/usr/local/bin", 200) = 15
    getcwd("/usr/local/bin", 200) = 15
    getcwd("/usr/local/bin", 200) = 15
    open("public/404.html", O_RDONLY) = -1 ENOENT (No such file or directory)
    write(4, "set aph:session:f802551cd64f1e21"..., 961) = 961
    read(4, "STORED\r\n", 1024) = 8
    gettimeofday({1165712051, 488544}, NULL) = 0
    fstat(1, {st_mode=S_IFREG|0644, st_size=5343777, ...}) = 0
    write(1, "MemCache Put session:f802551cd64"..., 64) = 64
    rt_sigaction(SIGINT, {SIG_DFL}, {0x46b990, [INT], SA_RESTORER|SA_RESTART, 0x3cfa62e410}, 8) = 0
    close(4) = 0
    munmap(0x2a97b36000, 4096) = 0
    close(4) = -1 EBADF (Bad file descriptor)
    munmap(0x2a97b35000, 4096) = 0
    poll([{fd=5, events=POLLIN|POLLPRI}], 1, 0) = 0
    setsockopt(5, SOL_SOCKET, SO_SNDTIMEO, "\2003\341\1\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
    write(5, "\1\0\0\0\1", 5) = 5
    shutdown(5, 2 /* send and receive */) = 0
    close(5) = 0
    exit_group(1) = ?
    Process 13473 detached

    Thanks again! You guys rock! Litespeed is the bestest project ever....
  13. mistwang

    mistwang LiteSpeed Staff

    Glad you find the problem, we will address the compatibility issue in next ruby-lsapi release for sure. :)

    What exactly is the result of getcwd() under other environment? #{RAILS_ROOT}/public?
  14. mistwang

    mistwang LiteSpeed Staff

Share This Page