CacheWarmer unable to crawl full website

#1
Trying to configure LiteMage CacheWarmer.

Cron set to run every 15 min
Cron timeout set to: 13 min

Flushed Magento Cache, flushed all LiteMage cache

After waiting several hours, the current position counter climbs to around 315 and then resets to zero.

Current Position | List Size
153 | 2898

I am not sure why CacheWarmer is unable to fully crawl the website.

Any help please?
 

Lauren

LiteSpeed Staff
Staff member
#5
Whenever your cache warmup settings change, or you flush the cache, the crawler restarts from the beginning.
"Cron set to run every 15 min": how did you do that? The default config.xml is set to run every 10 minutes, so unless you modified config.xml, it runs every 10 minutes. You had better set the crawler to run for 8 minutes. If you want to make it faster, set the thread count higher. The next round will continue from where the previous one left off.
You can turn on the debug log, or just watch the bottom section of the cache management page; it shows the progress and tells you why the crawler stopped.
Another possibility is that line 315 contains a bad URL and the crawler always fails at that URL.
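
To illustrate the resume-and-reset behaviour described above, here is a minimal Python sketch. It is not LiteMage's code: `run_round`, the fixed 3 seconds per URL, and the single-threaded loop are all assumptions made for illustration. Only the list size (2898) and the 8-minute run time come from this thread.

```python
def run_round(list_size, position, run_seconds, seconds_per_url):
    """Advance from `position` until the list ends or the time budget is spent."""
    elapsed = 0
    while position < list_size and elapsed + seconds_per_url <= run_seconds:
        elapsed += seconds_per_url  # stand-in for fetching one URL
        position += 1
    return position


LIST_SIZE = 2898       # list size reported in the first post
RUN_SECONDS = 8 * 60   # the suggested 8-minute run time
SECONDS_PER_URL = 3    # assumed average fetch time, purely illustrative

position = 0
for round_number in range(1, 4):
    # Each round picks up at the position where the previous round stopped.
    position = run_round(LIST_SIZE, position, RUN_SECONDS, SECONDS_PER_URL)
    print(f"round {round_number}: stopped at position {position}")

# Flushing the cache or changing the warmup settings sends the crawler
# back to the start of the list.
position = 0
```

Under these assumed numbers each round covers 160 URLs, so a counter that keeps dropping back to zero points to a reset (or a repeated failure) rather than to normal round-to-round progress.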
 
#6
I set the following cron job to run at a 15-minute interval:
/bin/sh /home/user/public_html/cron.sh cron.php

Isn't this the cron that handles warmup?

The bottom of the cache management page shows a single message about reaching the max timeout limit and stopping there.
 

Lauren

LiteSpeed Staff
Staff member
#7
No, that cron is the cron scheduler, not an individual module's cron task. A module's cron is defined in the module's config.xml:
<crontab><jobs>...<schedule> You can google more on how Magento cron works.
So set the crawler's run time to 8 minutes.
After it reaches the max timeout (say it runs for 8 minutes and then stops), a new round starts 2 minutes later and continues the crawl. If it does not start within 2 minutes, then something is wrong with the cron scheduler. Sometimes it will run and exit, saying the server load is too high.

We designed it this way to give users flexibility in setting up the crawler and to keep it from occupying all server resources while it runs. You can control the interval, thread count, and load limit to schedule the crawler.
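
As a rough way to reason about those knobs, the sketch below estimates how many rounds a full crawl takes. The helper `rounds_to_finish`, the 3 seconds per URL, and the assumption that throughput scales linearly with threads are all hypothetical; real numbers depend on the server and on the load limit.

```python
import math


def rounds_to_finish(list_size, run_seconds, seconds_per_url, threads):
    """Estimate the rounds needed, assuming each thread fetches URLs back to back."""
    urls_per_round = threads * (run_seconds // seconds_per_url)
    return math.ceil(list_size / urls_per_round)


LIST_SIZE = 2898        # list size reported in the first post
RUN_SECONDS = 8 * 60    # 8-minute run time
INTERVAL_MINUTES = 10   # default cron interval mentioned in this thread
SECONDS_PER_URL = 3     # assumed average fetch time, purely illustrative

for threads in (1, 3):
    rounds = rounds_to_finish(LIST_SIZE, RUN_SECONDS, SECONDS_PER_URL, threads)
    print(f"{threads} thread(s): {rounds} rounds, "
          f"roughly {rounds * INTERVAL_MINUTES} minutes of wall-clock time")
```

With these assumed figures, one thread needs 19 rounds and three threads need 7, which is why raising the thread count is the first thing to try if the server can take the extra load.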
 