Differences
This shows you the differences between two versions of the page.
litespeed_wiki:cache:litemage2:crawler [2018/07/27 15:37] Eric Leu created |
litespeed_wiki:cache:litemage2:crawler [2019/10/23 16:02] Eric Leu [More Options] |
||
---|---|---|---|
Line 8: | Line 8: | ||
- SiteMap: Prepare your site's sitemap, e.g. ''<nowiki>http://magento2.com/sitemap.xml</nowiki>'' | - SiteMap: Prepare your site's sitemap, e.g. ''<nowiki>http://magento2.com/sitemap.xml</nowiki>'' | ||
- | ===== How to Use Crawl script===== | + | ===== How to Use the Crawler Script===== |
- | [[ | Download from here]] | + | - [[https://www.litespeedtech.com/packages/litemage2.0/M2-crawler.sh | Download from here]] |
- | + | - Change the permissions so that the file is executable: ''chmod +x M2-crawler.sh'' | |
- | Change the permissions so that the file is executable: | + | - Run the script: ''bash M2-crawler.sh SITE-MAP-URL'' |
- | ''chmod +x cachecrawler.sh'' | + | |
- | + | ||
- | ==== Crawl Desktop&mobile share same theme==== | + | |
- | ''sh M2-crawler.sh SITE-MAP-URL'' | + | |
==== More Options==== | ==== More Options==== | ||
- | * To get help: ''sh M2-crawler.sh -h'' | + | * ''-h, --help'' Show this message and exit |
- | * To change default interval request from 0.1s to custom NUM value: ''sh M2-crawler.sh SITE-MAP-URL -i NUM'' | + | * ''-m, --with-mobile'' Crawl mobile view in addition to default view |
+ | * ''-c, --with-cookie'' Crawl with site's cookies | ||
+ | * ''-b, --black-list'' Page will be added to the black list if it returns an HTTP status error and has no cache. The next run will bypass the page | ||
+ | * ''-g, --general-ua'' Use general user-agent instead of lscache_runner for desktop view | ||
+ | * ''-i, --interval'' Change request interval. "-i 0.2" changes from default 0.1s to 0.2s | ||
+ | * ''-v, --verbose'' Show complete response header under /tmp/crawler.log | ||
+ | * ''-d, --debug-url'' Test one URL directly. "sh M2-crawler.sh -v -d http://example.com/test.html" | ||
+ | * ''-qs,--crawl-qs'' Crawl sitemap, including URLs with query strings | ||
+ | * ''-r, --report'' Display total count of crawl result | ||
+ | Example commands: | ||
+ | * To get help: ''bash M2-crawler.sh -h'' | ||
+ | * To change default interval request from 0.1s to custom NUM value: ''bash M2-crawler.sh SITE-MAP-URL -i NUM'' | ||
+ | * To crawl with cookie set: ''bash M2-crawler.sh -c SITE-MAP-URL'' | ||
+ | * To store log in ''/tmp/M2-crawler.log'': ''bash M2-crawler.sh -v SITE-MAP-URL'' | ||
+ | * To debug one URL and output on screen: ''bash M2-crawler.sh -d SITE-URL'' | ||
+ | * To display total count of crawl result: ''bash M2-crawler.sh -r SITE-MAP-URL'' | ||
+ | * Multiple parameters can be used at the same time | ||
===== How to Generate a Sitemap===== | ===== How to Generate a Sitemap===== | ||
- | The Sitemap module is build-in for generating a sitemap in Magento 2, and it's fast. | + | Magento 2 has a built-in module for generating a sitemap, and it's fast. |
==== Enable sitemap ==== | ==== Enable sitemap ==== | ||
- | Navigate to Magento admin page -> Stores -> Settings -> Configuration -> Catalog -> XML Sitemap \\ | + | Navigate to **Magento Admin > Stores > Settings > Configuration > Catalog > XML Sitemap** |
- | {{:litespeed_wiki:cache:litemage2:m2-4.png?600|}} \\ | + | {{:litespeed_wiki:cache:litemage2:m2-4.png?600|}} |
- | Set Generation Settings Enabled to ''Yes'' \\ | + | Set **Generation Settings > Enabled** to ''Yes'' |
{{:litespeed_wiki:cache:litemage2:m2-5.png?600|}} | {{:litespeed_wiki:cache:litemage2:m2-5.png?600|}} | ||
- | ==== Configuring a single sitemap for all storefronts ==== | + | ==== Configuring a Single Sitemap for All Storefronts ==== |
- | Navigate to Magento admin page -> Marketing -> Seo & Search -> Sitemap | + | Navigate to **Magento Admin > Marketing > SEO & Search > Sitemap** |
- | - Click **Add Sitemap** button | + | - Click the **Add Sitemap** button |
- | - Enter value | + | - Enter values |
- | * Filename: ''sitemap.xml'' | + | * **Filename**: ''sitemap.xml'' |
- | * Path: ''/'' | + | * **Path**: ''/'' |
- | - Click **Save & Generate** button | + | - Click the **Save & Generate** button |
{{:litespeed_wiki:cache:litemage2:m2-2.png?600|}} \\ | {{:litespeed_wiki:cache:litemage2:m2-2.png?600|}} \\ | ||
- | If all went well, a sitemap.xml file will generated in your magento 2 document root. | + | If all went well, a ''sitemap.xml'' file will have been generated in your Magento 2 document root. |
+ | |||
+ | ===== Crawl Interval ===== | ||
+ | How often do you want to re-initiate the crawling process? This depends on how long it takes to crawl your site and what you set for Public Cache TTL. | ||
+ | |||
+ | The default TTL is one day (24 hours). You might, for example, run the script from a cron job every 12 hours instead. | ||
+ | |||
+ | For example, this will run twice a day, at 03:30 and 15:30: ''30 3,15 * * * path_to_script/M2-crawler.sh SITE-MAP-URL -m -i 0.2'' | ||
+ | |||
+ | Note: You can also use an [[https://crontab.guru/|online crontab tool]] to help you verify the time settings. | ||
- | ===== How to Verify ===== | + | ===== How to Verify the Crawler is Working ===== |
- | By using [[https://developers.google.com/web/tools/chrome-devtools/ | the browser developer tool]], you should see ''X-LiteSpeed-Cache: hit,litemage'' at the first view \\ | + | When using [[https://developers.google.com/web/tools/chrome-devtools/|the browser developer tool]], load a previously uncached page. You should see ''X-LiteSpeed-Cache: hit,litemage'' on the first view. |
{{:litespeed_wiki:cache:litemage2:m2-3.png?600|}} | {{:litespeed_wiki:cache:litemage2:m2-3.png?600|}} |
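
The same header check can also be done from the command line instead of the browser. The following is only a minimal sketch, not part of the crawler script: the function name ''is_cache_hit'' and the page URL in the usage comment are hypothetical, and it assumes ''curl'' is installed for the live check.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with M2-crawler.sh).
# Reads raw HTTP response headers on stdin and succeeds (exit 0) only if
# an X-LiteSpeed-Cache header reports a hit, e.g. "hit,litemage".
is_cache_hit() {
    # tr strips the carriage returns that end HTTP header lines.
    # grep -i: header names are case-insensitive; -q: exit status only.
    tr -d '\r' | grep -qiE '^x-litespeed-cache:[[:space:]]*hit'
}

# Usage against a live page (placeholder URL; requires curl and network access):
#   curl -s -o /dev/null -D - "http://magento2.com/some-page.html" | is_cache_hit \
#       && echo "served from cache" || echo "not cached"
```

Run the check on a page that nobody has visited since the crawler finished; a hit on that first request suggests the crawler warmed the cache for it.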