Differences
This shows you the differences between two versions of the page.
litespeed_wiki:cache:lscps:crawler [2018/07/24 15:25] Eric Leu [How to Use Crawl script] |
litespeed_wiki:cache:lscps:crawler [2019/10/23 17:41] Eric Leu [More Options] |
||
---|---|---|---|
Line 8: | Line 8: | ||
- SiteMap: Prepare your site's sitemap, e.g. ''<nowiki>http://prestashop-123/456_sitemap.xml</nowiki>'' | - SiteMap: Prepare your site's sitemap, e.g. ''<nowiki>http://prestashop-123/456_sitemap.xml</nowiki>'' | ||
- | ===== How to Use Crawl script===== | + | ===== How to Use the Crawler Script===== |
- | [[https://www.litespeedtech.com/packages/prestashop/cachecrawler.sh | DownLoad from here]] | + | [[https://www.litespeedtech.com/packages/prestashop/cachecrawler.sh | Download from here]] |
- | ''chmod +x cachecrawler.sh'' | + | |
- | ==== Crawl Desktop View==== | + | Change the permissions so that the file is executable: ''chmod +x cachecrawler.sh'' |
- | ''sh cachecrawler.sh SITE-MAP-URL'' | + | |
- | ==== Crawl Desktop and Mobile Views ==== | + | Crawl when desktop & mobile share the same theme: ''bash cachecrawler.sh SITE-MAP-URL'' |
- | ''sh cachecrawler.sh SITE-MAP-URL -m '' | + | |
+ | Crawl when desktop & mobile have different themes: ''bash cachecrawler.sh SITE-MAP-URL -m '' | ||
By default, Mobile View is disabled in the PrestaShop cache plugin. To enable it, navigate to **PrestaShop Admin -> LiteSpeed Cache -> Configuration** and set **Separate Mobile View** to ''Yes'' | By default, Mobile View is disabled in the PrestaShop cache plugin. To enable it, navigate to **PrestaShop Admin -> LiteSpeed Cache -> Configuration** and set **Separate Mobile View** to ''Yes'' | ||
Line 22: | Line 21: | ||
==== More Options==== | ==== More Options==== | ||
- | * To get help: ''sh cachecrawler.sh -h'' | + | * ''-h, --help'': Show this message and exit. |
- | * To change default interval request from 0.1s to custom NUM value: ''sh cachecrawler.sh SITE-MAP-URL -i NUM'' | + | * ''-m, --with-mobile'': Crawl mobile view in addition to default view. |
+ | * ''-c, --with-cookie'': Crawl with site's cookies. | ||
+ | * ''-b, --black-list'': Add a page to the blacklist if it returns an HTTP status error and is not cached. The next run will bypass that page. | ||
+ | * ''-g, --general-ua'': Use general user-agent instead of lscache_runner for desktop view. | ||
+ | * ''-i, --interval'': Change request interval. ''-i 0.2'' changes from default 0.1 second to 0.2 seconds. | ||
+ | * ''-v, --verbose'': Write the complete response headers to ''/tmp/crawler.log''. | ||
+ | * ''-d, --debug-url'': Test one URL directly, as in ''bash cachecrawler.sh -v -d http://example.com/test.html''. | ||
+ | * ''-qs, --crawl-qs'': Crawl the sitemap, including URLs with query strings. | ||
+ | * ''-r, --report'': Display total count of crawl result. | ||
+ | Example commands: | ||
+ | * To get help: ''bash cachecrawler.sh -h'' | ||
+ | * To change the request interval from the default 0.1s to a custom NUM value: ''bash cachecrawler.sh SITE-MAP-URL -i NUM'' | ||
+ | * To crawl with cookie set: ''bash cachecrawler.sh -c SITE-MAP-URL'' | ||
+ | * To store log in ''/tmp/crawler.log'': ''bash cachecrawler.sh -v SITE-MAP-URL'' | ||
+ | * To debug one URL and output on screen: ''bash cachecrawler.sh -d SITE-URL'' | ||
+ | * To display total count of crawl result: ''bash cachecrawler.sh -r SITE-MAP-URL'' | ||
+ | |||
+ | NOTE: Multiple options can be combined in a single command. | ||
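Because several options may be passed together, a small wrapper can assemble the command line before running it. The following is only a sketch: ''build_crawl_cmd'' is a hypothetical helper name, the sitemap URL is a placeholder, and the function prints the command instead of executing it so you can check it first.

```shell
# Hypothetical helper (not part of cachecrawler.sh): prints a crawler command
# that combines -m, -i and -r, using only the options documented above.
# $1 = sitemap URL, $2 = request interval in seconds (may be empty),
# $3 = "yes" to also crawl the mobile view
build_crawl_cmd() {
    cmd="bash cachecrawler.sh $1"
    if [ "$3" = "yes" ]; then
        cmd="$cmd -m"            # crawl mobile view in addition to default view
    fi
    if [ -n "$2" ]; then
        cmd="$cmd -i $2"         # override the default 0.1 second interval
    fi
    cmd="$cmd -r"                # display the total count at the end
    printf '%s\n' "$cmd"
}

# Placeholder URL; substitute your own SITE-MAP-URL.
build_crawl_cmd "http://example.com/1_index_sitemap.xml" "0.2" "yes"
```

Printing the command first is a deliberate dry run; once it looks right, it can be pasted into a terminal or a cron entry.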
===== How to Generate a Sitemap===== | ===== How to Generate a Sitemap===== | ||
The Google Sitemap module is quite popular for generating a sitemap in Prestashop, and it's much faster than online generation. | The Google Sitemap module is quite popular for generating a sitemap in Prestashop, and it's much faster than online generation. | ||
Line 34: | Line 50: | ||
Download [[https://github.com/PrestaShop/gsitemap/archive/master.zip | gsitemap]]; then change the file name to ''gsitemap.zip''. | Download [[https://github.com/PrestaShop/gsitemap/archive/master.zip | gsitemap]]; then change the file name to ''gsitemap.zip''. | ||
- | Click the **Configure** button, then click ''xxx.sitemap.xml''(This is your SITE-MAP-URL). | + | Click the **Configure** button; you will see a link such as ''xxx/1_index_sitemap.xml'' (this is your main SITE-MAP-URL). |
- | {{:litespeed_wiki:cache:lscps:prestashop-9.png?600|}} | + | {{:litespeed_wiki:cache:lscps:ps-10.png?600|}} |
==== SiteMap Online Generator ==== | ==== SiteMap Online Generator ==== | ||
Line 42: | Line 58: | ||
{{:litespeed_wiki:cache:lscps:prestashop-6.png?600|}} | {{:litespeed_wiki:cache:lscps:prestashop-6.png?600|}} | ||
+ | |||
+ | ===== Crawl Interval ===== | ||
+ | How often should the crawl be re-run? That depends on how long it takes to crawl your site and on the Public Cache TTL you have set. \\ | ||
+ | The default TTL is one day (24 hours), so consider running the script from a cron job every 12 hours.\\ | ||
+ | For example, this runs twice a day, at 3:30 and 15:30: ''30 3,15 * * * path_to_script/cachecrawler.sh SITE-MAP-URL -m -i 0.2'' | ||
+ | |||
+ | Note: You can also use an [[https://crontab.guru/|online crontab tool]] to help you verify the time settings. | ||
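The reasoning above (re-crawl at half the Public Cache TTL) can be turned into a crontab line with a little arithmetic. This is a rough sketch under stated assumptions: ''crawl_cron_line'' is a hypothetical helper, the TTL is given in whole hours, and ''path_to_script'' and ''SITE-MAP-URL'' remain placeholders. Cron expects a comma-separated list in the hour field for several fixed hours, so the helper emits ''3,15'' for a 24-hour TTL starting at 3:30.

```shell
# Hypothetical helper: prints a crontab line that re-runs the crawler every
# (TTL / 2) hours, starting at the given hour and minute.
# $1 = Public Cache TTL in hours, $2 = first hour (0-23), $3 = minute (0-59)
crawl_cron_line() {
    period=$(( $1 / 2 ))
    if [ "$period" -lt 1 ]; then
        period=1                 # never crawl more often than hourly in this sketch
    fi
    hours=""
    h=$2
    while [ "$h" -lt 24 ]; do
        hours="${hours:+$hours,}$h"   # build a comma-separated hour list
        h=$(( h + period ))
    done
    printf '%s %s * * * path_to_script/cachecrawler.sh SITE-MAP-URL -m -i 0.2\n' "$3" "$hours"
}

# Default TTL of 24 hours, first run at 3:30.
crawl_cron_line 24 3 30
```

The printed line can be added with ''crontab -e'' after replacing the placeholders with your real script path and sitemap URL.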
===== How to Verify ===== | ===== How to Verify ===== |