Magic curl option for fast recache and small lscache size


Well-Known Member
Hey, come on! :rolleyes: This is neither magic nor special, it is just the default! If you were ready to learn and listen, you would have real magic stuff. Your "magic" crawler script is a turtle!


Well-Known Member
After OpenCart brought me to the brink of despair, here comes a quick-and-dirty but very fast method to warm up your cache, and not only for OpenCart. This method is only for use in the CLI. I have a PHP version, but that version is not for free. To make the warmup really fast, the default way of making requests is too slow: it works serially, which means one request after the other. curl, however, also supports parallel requests and can run 10,000 requests and more, all at the same time. To warm up the cache we don't need such a high number, because too many requests cause too much load; 3 to 5 is a good number. For your information: the LScache plugin for WordPress also works this way, but has a bad configuration that makes the warmup slow again....
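To illustrate the serial vs. parallel difference described above (the example.com URLs are placeholders I made up, not from the post): a serial loop makes one request at a time, while curl's parallel mode runs several transfers at once:

```shell
# Serial: each request must finish before the next one starts
for u in https://example.com/page-1 https://example.com/page-2; do
    curl -s -o /dev/null "$u"
done

# Parallel (curl 7.66.0+): up to 3 transfers run at the same time
curl --parallel --parallel-immediate --parallel-max 3 -s -o /dev/null \
    https://example.com/page-1 https://example.com/page-2 https://example.com/page-3
```

With hundreds or thousands of URLs, the parallel variant finishes roughly `--parallel-max` times faster, which is the whole point of this method.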

The problem with this parallel method in the CLI is that curl 7.66.0 or higher is needed for it, and many hostings don't have such a version installed. The alternative is to run curl from your local computer: curl offers a Windows build that can easily be installed. This Windows version works like the server version, runs in the Windows command prompt, and uses the same commands.

To get this method to work we need a list of URLs in a specific format. To get this list we use the sitemap.xml function in OpenCart, but this needs an extension that generates sitemaps.
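The "specific format" is curl's --config file syntax: one `url =` line per page. A sketch of what the generated file should look like (placeholder URLs):

```
url = https://example.com/
url = https://example.com/category-a
url = https://example.com/product-b
```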

How to Do:

1.) Create a directory of your choice anywhere on your server. It does not have to be inside the OC directory, but it must be reachable with a browser.
2.) Create a blank PHP file, place it in this directory, and copy the code below into it.

<?php
header("Content-Type: text/plain");

// Set this to your sitemap index URL
$sitemap = '';

$content = file_get_contents($sitemap);
$xml = simplexml_load_string($content);

// The index sitemap lists the child sitemaps; fetch each of them
foreach ($xml->sitemap as $urlsElement) {

    $urls = $urlsElement->loc;
    $sitemaps = file_get_contents($urls);
    $xmls = simplexml_load_string($sitemaps);

    // Write every page URL in curl's --config format: url = <URL>
    foreach ($xmls->url as $urlElement) {
        $url = $urlElement->loc;
        file_put_contents('sitemap.txt', 'url = ' . $url . "\n", FILE_APPEND);
    }
}
3.) Run this file in your browser
4.) The script above generates a formatted .txt file with all URLs from the "Index" sitemap.
5.) Download this file
6.) Run the curl command below in the directory where this .txt file is located. I've named the file sitemap.txt

curl --parallel --parallel-immediate --parallel-max 3 --connect-timeout 5 --http1.1 -k -I -s -X GET -H "User-Agent:Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0" -H "Accept-Encoding: gzip, deflate, br" --config sitemap.txt
--parallel-max 3 // 3 parallel requests; do not set a number higher than 5, to prevent too high load!!!!
Set the User-Agent to your choice
Add further custom headers with "-H" in front
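Steps 3 to 6 can be run as one local session (the domain, the directory name warmup, and the script name generate.php are placeholders of my own, not from the post):

```shell
# Step 3: run the PHP generator on the server (it writes sitemap.txt there)
curl -s https://example.com/warmup/generate.php

# Step 5: download the generated URL list
curl -s -o sitemap.txt https://example.com/warmup/sitemap.txt

# Step 6: warm the cache with 3 parallel requests
curl --parallel --parallel-immediate --parallel-max 3 --connect-timeout 5 \
     --http1.1 -k -I -s -X GET \
     -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0" \
     -H "Accept-Encoding: gzip, deflate, br" \
     --config sitemap.txt
```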

That's it! Enjoy

Additional information:

curl for Windows can be downloaded here:

To check your curl version, run this command in the CLI:

curl -V
It must be version 7.66.0 or higher for parallel support. Otherwise use the version for Windows.
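A small POSIX-sh helper (my own sketch, not from the post) that checks whether a curl version string is new enough for --parallel, which appeared in curl 7.66.0:

```shell
# supports_parallel VERSION
# succeeds (exit 0) if VERSION is 7.66.0 or newer
supports_parallel() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -ge 66 ]; }
}

# Feed it the version reported by `curl -V` (second word of the first line)
version=$(curl -V 2>/dev/null | head -n 1 | awk '{print $2}')
if supports_parallel "$version"; then
    echo "curl $version supports --parallel"
else
    echo "curl $version is too old, use the Windows build instead"
fi
```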


Well-Known Member
Why didn't you add this parameter?

curl_setopt($ch, CURLOPT_NOBODY, true);

question to @serpent_driver

to exclude the body from the output. Request method is then set to HEAD. Changing this to false does not change it to GET.

why do you provide method HEAD (via -I), but then set the method to GET?
curl --parallel --parallel-immediate --parallel-max 3 --connect-timeout 5 --http1.1 -k -I -s -X GET


Well-Known Member
This is not a HEAD request. HEAD means only the HTTP header and no message body, but the message body is not the HTML body; that is a big difference. A HEAD-only request prevents a page from being cached, which is why -X GET overrides the method back to GET, while -I only tells curl to display the headers.
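The difference is easy to see on the command line (example.com is a placeholder): -I alone sends a HEAD request, while adding -X GET sends a real GET, so the server renders the full page, and curl still prints only the headers:

```shell
# HEAD request: the server returns headers only, the page is not rendered
curl -sI https://example.com/

# GET request with headers-only output: the server renders the page
# (populating the cache); curl just shows the response headers
curl -sI -X GET https://example.com/
```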


Well-Known Member
Again, the request method has nothing to do with "CURLOPT_NOBODY". This curl option affects whether the message body is returned: depending on its value, the output of the request will be returned or not. For caching a page you don't need the returned output, and that's why it is better to set CURLOPT_NOBODY to true.

If you still want answers regarding the request method and this curl option, please read the curl documentation.