1 VHost Goes down?!

cmanns

Well-Known Member
#1
One of our sites has its users thinking our hosting is always down. This started happening recently.

It ran for over a month on Enterprise with no problems. We then switched to a new VM system with a 1-CPU license instead of 2-CPU, which is the only difference in config.

Randomly the site doesn't load or takes 2 minutes, while PHP uses about 2.6% CPU at most.

http://pastebin.com/7LrJ9WLg

Code:
    stat("/home/floppop/xxxforum/sources/lib/func_topic_threaded.php", {st_mode=S_IFREG|0644, st_size=17217, ...}) = 0
    fcntl(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=1}) = 0
    fcntl(5, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=1}) = 0
    close(13)                               = 0
    poll([{fd=12, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    write(12, "e\0\0\0\3SELECT pid, post_parent FRO"..., 105) = 105
    read(12, "\1\0\0\1\2;\0\0\2\3def\rfloppop_forum\tibf_"..., 16384) = 217
    poll([{fd=12, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    write(12, "\246\0\0\0\3SELECT pid, post, author_id"..., 170) = 170
    read(12, "\1\0\0\1\t;\0\0\2\3def\rfloppop_forum\tibf_"..., 16384) = 1643
    poll([{fd=12, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    write(12, "\311\2\0\0\3SELECT p.*, pp.*,\n\t\t\t\tm.id,"..., 717) = 717
    read(12, "\1\0\0\1O3\0\0\2\3def\rfloppop_forum\1p\tib"..., 16384) = 7701
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    open("/home/xxx/public_html/forum/sources/classes/class_custom_fields.php", O_RDONLY) = 13
    fstat(13, {st_mode=S_IFREG|0644, st_size=14922, ...}) = 0
    stat("/home/xxxpublic_html/forum/sources/classes/class_custom_fields.php", {st_mode=S_IFREG|0644, st_size=14922, ...}) = 0
    fcntl(4, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=1}) = 0
    fcntl(4, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=1}) = 0
    close(13)                               = 0
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
    time(NULL)                              = 1279801627
It's totally random. Sometimes repairing MySQL helps, yet MySQL reports no issues. It first happened after 2 weeks of solid uptime on the new system and appears to happen more and more often, even though the site doesn't get much traffic.

Restarting the server, etc. won't fix it. Static files don't seem to load fast either, so I'm quite confused. The site normally loads in just about under a second uncached and executes in sub .03ms PHP-wise, running Invision Power Board.
 

mistwang

LiteSpeed Staff
#2
I think it is an issue with the disk system of the host server: the disk I/O wait is too high on that server. You've got bad neighbors that keep the disk too busy.
If static files won't load fast, that should be the case. When you run strace, use the "-T -tt" options to print timestamps.
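Once a trace has been captured with those options (for example `strace -T -tt -p <pid> -o trace.txt`), each line starts with a wall-clock timestamp and ends with the time spent in the call, like `<0.000123>`. A minimal sketch for picking out the slow calls follows. The sample lines and the 0.1 second threshold are made up for illustration and are not taken from the trace in this thread.

```python
import re

# One strace line as produced with "-T -tt": an optional pid prefix (added by -f),
# a HH:MM:SS.micro timestamp, the syscall name, and a trailing "<seconds>" duration.
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"
    r"(?P<clock>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<name>\w+)\(.*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_calls(lines, threshold=0.1):
    """Return (seconds, timestamp, syscall) tuples at or above threshold, slowest first."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # unfinished/resumed calls, signals, exit lines, etc.
        secs = float(match.group("secs"))
        if secs >= threshold:
            found.append((secs, match.group("clock"), match.group("name")))
    return sorted(found, key=lambda item: item[0], reverse=True)


# Hypothetical sample lines, only to show the expected input format.
SAMPLE = [
    '10:27:07.101000 poll([{fd=12, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000011>',
    '10:27:07.101200 write(12, "SELECT"..., 105) = 105 <0.000034>',
    '10:27:07.101400 read(12, "def"..., 16384) = 217 <2.513220>',
    '10:27:09.615000 open("/path/to/file.php", O_RDONLY) = 13 <0.412004>',
]

for secs, clock, name in slow_calls(SAMPLE):
    print(f"{secs:10.6f}s  {clock}  {name}")
```

A slow `read` on the MySQL socket points at the database, while slow `open`/`stat` calls on files point more toward disk I/O.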
 

cmanns

Well-Known Member
#3
Thanks for the timestamp option.

I've spent the past 4 hours stracing and such. I changed all tables to MyISAM (posts/topics/etc. were InnoDB, and I saw an InnoDB log file error, POSSIBLE CORRUPTION***!!!, in the MySQL log; that's all).

So after forcing a clean backup, I removed the db, put in a copy of the db from an older db dir backup, dropped the db (to clean InnoDB out), and reimported.

That still didn't work, though there were no more InnoDB errors. I then set everything to MyISAM. Some of the tables I noticed "timing out" in the strace, yet I could select from those tables in phpMyAdmin and the mysql command line. Anyway, about an hour ago I set the site to disabled in LiteSpeed (since cPanel will restart MySQL), restarted MySQL, and flushed tables to clear the database out of memory.

I then went to the db dir and ran myisamchk. It didn't work the first time, and it also filled tmp on the 800k-row table, lol.

myisamchk --force --medium-check --tmpdir=YourDir --recover *.MYI

Restarted MySQL and LiteSpeed to clear the domain/subdomain block, and the site loaded instantly. After a refresh, PHP execution is now nearing less than .02 :)

Apache also had the same issue. What was odd is that the site would suddenly work, then go slow again. The forum's visitors started saying it's our fault, but the client's not mad, hehe. Not our issue!
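For anyone wanting to script the same repair, here is a rough sketch that builds the myisamchk command line from above for every `.MYI` file in a database directory. The function name and the example paths are hypothetical. It only assembles the arguments, it does not run anything, and it assumes MySQL is stopped (or the tables are flushed and not in use) and that the tmp dir has enough free space for the largest table.

```python
from pathlib import Path


def build_myisamchk_cmd(db_dir, tmp_dir):
    """Build the myisamchk argument list for all MyISAM index files in db_dir."""
    # Expand *.MYI ourselves so the result can be passed to subprocess without a shell.
    index_files = sorted(str(path) for path in Path(db_dir).glob("*.MYI"))
    if not index_files:
        raise ValueError(f"no .MYI files found in {db_dir}")
    return [
        "myisamchk",
        "--force",
        "--medium-check",
        f"--tmpdir={tmp_dir}",  # point this at a partition with room to spare
        "--recover",
        *index_files,
    ]


# Example (hypothetical paths); pass the list to subprocess.run(cmd, check=True) to execute:
# cmd = build_myisamchk_cmd("/var/lib/mysql/some_db", "/home/bigtmp")
```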
 

mistwang

LiteSpeed Staff
#4
So it is indeed a MySQL DB problem. Does it require the "--force --medium-check" options to really fix it? I'm not familiar with MySQL commands.
 