kernel: Bad page state in process 'lshttpd'

  lopis

    lopis New Member

    Message from syslogd@ at Mon Jul 13 07:49:20 2009 ...
    host kernel: Bad page state in process 'lshttpd'
    Message from syslogd@ at Mon Jul 13 07:49:20 2009 ...
    host kernel: page:c33fb540 flags:0xc0080010 mapping:00000000 mapcount:0 count:0 (Not tainted)
    Message from syslogd@ at Mon Jul 13 07:49:20 2009 ...
    host kernel: Trying to fix it up, but a reboot is needed
    I can't explain why I've been plagued by this over the past few months. ;/
  mistwang

    mistwang LiteSpeed Staff

    Sorry about the problem you experienced.
    Looks like lshttpd triggered a kernel bug.

    Which version of linux and kernel are you using?
    If you search for the error message "Bad page state in process", you will find many discussions about it on the kernel mailing list.
    I think only upgrading or downgrading to a kernel without this bug can fix it.
  mistwang

    mistwang LiteSpeed Staff

    I went over your thread at webhostingtalk, and it looks like you have been trying different kernels. Many people run the RHEL5 stable kernel on their servers, so if this were a bug in the stable kernel, a lot of users would be reporting the same problem.

    Have you swapped the hardware? Memory? CPU? Motherboard?
  lopis

    lopis New Member

    I recently took a trip and was gone for about a week, which led to people leaving my host. However, the server stayed stable and was up for 5 days. It failed again tonight, about 24 hours after I told everyone I was back, and it looks like some clients had come back by then.

    I've run all the kernels and I'm now back on the latest stable RHEL5 one.

    The hardware was also changed, since I changed datacenters.
  mistwang

    mistwang LiteSpeed Staff

    I think it is a kernel bug, find this

    If you can, you should provide the kernel backtrace to help identify the problem.
  lopis

    lopis New Member

    How do I get a backtrace? And which kernel do you recommend I upgrade to?
  yuho

    yuho New Member

    Set up an hourly cron job to do the following:

    killall -9 lshttpd
    /sbin/service lsws start
    sync; echo 3 > /proc/sys/vm/drop_caches
    That will do a hard reset and will probably fix your problem.
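
    A minimal sketch of how those three commands could be wrapped in a script for cron. The DRY_RUN switch and the run/restart_lsws helper names are made up for this example and are not part of LSWS; by default the script only prints what it would do.

    ```shell
    #!/bin/sh
    # Hourly LSWS restart, wrapping the three commands from the post above.
    # DRY_RUN is a hypothetical safety switch added for this sketch:
    # unless it is set to 0, the commands are printed instead of executed.
    DRY_RUN="${DRY_RUN:-1}"

    run() {
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: $*"
        else
            "$@"
        fi
    }

    restart_lsws() {
        # Kill the web server hard, then start it again via its init script.
        run killall -9 lshttpd
        run /sbin/service lsws start
        # Flush dirty pages to disk, then drop page cache, dentries and inodes.
        run sync
        run sh -c 'echo 3 > /proc/sys/vm/drop_caches'
    }

    restart_lsws
    ```

    Assuming the script is saved at a path of your choosing (for example /usr/local/sbin/restart-lsws.sh, a made-up location), a root crontab line such as "0 * * * * DRY_RUN=0 /usr/local/sbin/restart-lsws.sh" would run it at the top of every hour.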


    Could this be an LSAPI issue? You said it ran stable when you had fewer sites.
  mistwang

    mistwang LiteSpeed Staff

    I have never debugged a kernel bug, so I am not sure either, but I think there should be a kernel configuration option that lets the kernel dump a backtrace like the one I posted.
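
    One thing that may be worth trying first: as far as I know, the kernel usually writes a call trace to its log right after the "Bad page state" lines, at a lower priority than the broadcast messages. A rough sketch for pulling it out follows; the log path and the 30-line window are assumptions, and show_bad_page_reports is just a helper name for this example.

    ```shell
    #!/bin/sh
    # Print each "Bad page state" report plus the lines that follow it,
    # which is where a call trace would be expected if one was logged.
    # LOG and CONTEXT are assumptions for this sketch; adjust as needed.
    LOG="${LOG:-/var/log/messages}"
    CONTEXT="${CONTEXT:-30}"

    show_bad_page_reports() {
        # $1: log file to search, $2: number of trailing lines to show
        grep -A "$2" 'Bad page state' "$1"
    }

    if [ -r "$LOG" ]; then
        show_bad_page_reports "$LOG" "$CONTEXT"
    else
        echo "cannot read $LOG; try: dmesg | grep -A $CONTEXT 'Bad page state'" >&2
    fi
    ```

    If nothing useful follows the report in the log, the trace was probably not recorded, and a different logging setup would be needed to capture it.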

    The RHEL5 kernel should be stable, otherwise, we will be flooded with bug reports like yours.

    Have any kernel parameters or tuning been applied on top of the default setup?

    Another suggestion is to install 64-bit Linux if your server has more than 4GB of memory; otherwise, get rid of the PAE kernel.
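
    A quick way to see which case applies, as a sketch only. It assumes the PAE kernel has "PAE" in its release string (as RHEL5 PAE kernels do) and treats any machine type containing "64" as 64-bit; advise is a helper name made up for this example.

    ```shell
    #!/bin/sh
    # Report whether a 64-bit install or a non-PAE kernel fits this machine.
    # The 4GB threshold comes from the suggestion above.

    advise() {
        # $1: machine type (uname -m), $2: kernel release (uname -r),
        # $3: MemTotal in kB as reported by /proc/meminfo
        four_gb_kb=4194304
        case "$1" in
            *64*) echo "already 64-bit"; return ;;
        esac
        if [ "$3" -gt "$four_gb_kb" ]; then
            echo "more than 4GB on 32-bit: consider 64-bit Linux"
        else
            case "$2" in
                *PAE*) echo "4GB or less with a PAE kernel: consider the non-PAE kernel" ;;
                *)     echo "32-bit non-PAE kernel: nothing to change" ;;
            esac
        fi
    }

    # Read total memory in kB; falls back to 0 if /proc/meminfo is unavailable.
    mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)"
    advise "$(uname -m)" "$(uname -r)" "${mem_kb:-0}"
    ```

    Note that MemTotal is what the running kernel can see, so a 32-bit non-PAE kernel will report less than what is physically installed.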

    You can try restarting LSWS regularly as suggested as well.
  lopis

    lopis New Member

    The script above works... I just went 1 day with uptime, so I'll let it go for another day and see how it goes. If it's still stable, I'm going to try a few things to narrow down the cause. I'll try removing drop_caches first and see what happens.

    So far my ideas are:
    - a bad sector in memory?
    - LiteSpeed caching in some way that causes corruption?

    What I can't figure out is what I'm doing so differently from all the other cPanel setups that run LS (default config) with probably more active sites than me. It could be a kernel issue, as you pointed out... so I'll try upgrading to the latest and see what happens later this week.

