Content-Aware Caching

J.T.

Well-Known Member
#1
Hi,

I'm looking at options to serve cached versions of otherwise dynamically created pages, but only for those requests where it makes sense.

It's for an online shop so once a product is added to the basket for example, we need to stop caching. We also show "recently viewed products" in a sidebar, so once stuff like that is populated, we need to switch back to dynamic.

In other words, for search engine crawlers and first time visitors, I want blazing fast load times. Subsequent page loads can be dynamic.

LSWS's caching seems limited to headers and not aware of the contents of the request. It's nice to be able to say "don't cache if a cookie is involved", but it would be better to say "don't cache if the cookie contains cache=0". Like Varnish, for example, it would be great if it could look inside the content and be content-aware.

Same with query string matching. We can say don't cache if there's a query string. Better would be if we can say don't cache if there's a query string saying cache=0. It would make LSWS caching a lot more flexible and powerful.

The e-commerce engine creates a cookie on the first load and almost all pages use query strings. I don't have enough clever data to match caching policies against.

Session based policies would be great too.

Is this on the roadmap already?
 

mistwang

LiteSpeed Staff
#2
You can make caching more flexible, as you described, with rewrite rules by setting the environment variable "Cache-Control", like

RewriteCond %{QUERY_STRING} cache=0
RewriteRule .* - [E=Cache-Control:no-cache]

You can use regular Cache-Control directives.
 

J.T.

Well-Known Member
#3
But that controls the browser cache, doesn't it?

I mean serving a static HTML copy of otherwise dynamic content, so we save opcoding and DB querying those pages before sending the HTML to the browser. Like Squid/Varnish or reverse proxies etc. I believe Litespeed Load Balancer offers it too if I read the feature list correctly.

For a first time visitor, even if I set the cache-control headers, then it still only comes into effect on the second page load. I want it so that Litespeed serves a static copy to those.

Basically, I want this but without Varnish, and with LSWS:

http://www.kalenyuk.com.ua/magento-performance-optimization-with-varnish-cache-47.html

Based on the user's history/actions, it knows when to set cookies/headers/query strings telling Varnish to show from cache or not to show from cache.
 

mistwang

LiteSpeed Staff
#5
"But that controls the browser cache, doesn't it?"
No, it controls the server's cache behavior.

You can configure LSWS to use cached page by default, and turn off cache with rewrite rule when a page should not be cached.

Maybe it is not exactly what you want. I think we probably need to add a special "Cache-Control" directive to tell LSWS when to update a cached page with new content.
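
As a rough sketch of that approach, assuming caching is already enabled by default for the vhost: the rules below turn the cache off whenever the query string contains cache=0 as its own parameter, or the request is for a basket or checkout page. The /basket and /checkout paths are hypothetical placeholders, not something from this thread; substitute whatever URLs your shop uses.

```apache
RewriteEngine On

# Skip the cache when the query string carries cache=0 as its own parameter
RewriteCond %{QUERY_STRING} (^|&)cache=0(&|$) [OR]
# Hypothetical shop paths that must always be served dynamically
RewriteCond %{REQUEST_URI} ^/(basket|checkout)
RewriteRule .* - [E=Cache-Control:no-cache]
```

Anchoring the parameter with (^|&) and (&|$) keeps a value such as nocache=01 from matching by accident.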
 

PSS

Well-Known Member
#7
Where is the manual/FAQ for the cache? Specifically: what exactly does it do for dynamic (PHP) documents, how does it help, and how do you tune it based on what you serve?
 

J.T.

Well-Known Member
#8
I will be revisiting our options in this department again and like PSS, also wonder where best to go for documentation. As we have both a VPS and an Enterprise license, it would be nice to have access to detailed documentation on exactly how to accomplish this.
 

J.T.

Well-Known Member
#10
Thanks. So it seems that for this purpose it's effectively limited to the query string. It can't look in cookies or sessions, so the e-commerce system needs to add a URL parameter if it doesn't want LSWS to serve a cached page to a specific visitor.

That's a start but doesn't give many options. I can make it work for a very small subset of pages.

Ideally, like Zend Server, we can add session parameters as part of the condition, see condition two in this example:

http://files.zend.com/help/Zend-Server/working_with_page_caching.htm

That way, we can show a dynamic page to logged in users, to those who have added stuff to their cart etc. And show a static page to those who haven't done anything yet that personalises the page. I put that on my X-Mas wish list... For 2010 ;) Please!
 

NiteWave

Administrator
#11
"Can't look in cookies"
Yes, LSWS can. For example:
Code:
RewriteCond %{HTTP_COOKIE} username=guest
RewriteRule /index.php - [E=Cache-Control:max-age=600]
If username=guest exists in the cookies, LSWS caches the /index.php page for 600 seconds (10 minutes).
For subsequent requests: if there is a username=guest cookie in the request header, LSWS returns the cached index.php; if not, LSWS passes the request to lsphp5 and returns the output of lsphp5.
 

J.T.

Well-Known Member
#12
I stand corrected, that's nice to know. Makes it a lot more flexible than QUERY only.

SESSION would still be ultimate but at least now we can simply drop a cookie parameter that tells us whether or not LSWS should show a cached page or a live one.
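
For illustration, a minimal sketch of that idea. It assumes the shop sets a hypothetical cookie cache=0 as soon as the basket or the "recently viewed" list is populated, and that the cache settings in the admin allow requests carrying other cookies to be cached. The cookie name and the 600-second lifetime are only examples.

```apache
RewriteEngine On

# Cache index.php for 10 minutes unless the hypothetical opt-out
# cookie cache=0 is present in the Cookie request header
RewriteCond %{HTTP_COOKIE} !(^|;\s*)cache=0(;|$)
RewriteRule index\.php$ - [E=Cache-Control:max-age=600]
```

Visitors without the cookie get the cached copy; once the shop drops cache=0, the condition fails and the request goes to PHP as usual.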
 

J.T.

Well-Known Member
#13
Do these rules only work after we configure Server > Cache or VHost > [VHOST] > Cache in the admin GUI?

And we simply put them in htaccess file just like other Mod_Rewrite rules, right?

And how can we verify whether it's working? Does it output certain headers we can look for that indicate a page comes from LSWS's cache?

Lastly, can we do something like this, to only trigger if there isn't a Cookie with any value?

Code:
RewriteCond !%{HTTP_COOKIE} username=*
RewriteRule /index.php - [E=Cache-Control:max-age=600]
I really do suck at mod_rewrite... Thanks for your help.
 
#14
Right, it works in .htaccess, just like other rewrite rules.

Currently, there is no special response header to indicate that a page came from the cache.

However, it's easy to verify whether it came from the cache or not. For example:
PHP:
<?php
header('CurrentTime: '.gmdate('D, d M Y H:i:s', time()).' GMT',true);

echo "time()=" . time() . "<br>";
echo "date()=" . date(DATE_RFC822) ."<br>";
?>
Without caching, the output of this page changes on every visit.
With caching enabled, the page does not change until the cache expires 10 minutes later.

The rewrite rule should be:
Code:
RewriteCond %{HTTP_COOKIE} !username=
RewriteRule /index.php - [E=Cache-Control:max-age=600]
 

eva2000

Well-Known Member
#15
the rewrite rule should be:
Code:
RewriteCond %{HTTP_COOKIE} !username=
RewriteRule /index.php - [E=Cache-Control:max-age=600]
Adding my questions here too

1. How would you do this at the server level, for all virtual hosts on a WHM/cPanel server with Apache-configured virtual hosts running LiteSpeed? For just specific PHP pages, and/or for all PHP pages on all virtual hosts?

2. Would that rewrite rule for index.php not cache index.php?page=variable? Or, with friendly URLs, index.php/variable?

3. In the above example, where index.php is cached if the cookie has guest in it, would you need to enable these two options for it to work: Cache Request with Cookie and Cache Response with Cookie?

4. Is it still true that this caching doesn't work with vBulletin, as per the statement at http://www.litespeedtech.com/support/forum/showpost.php?p=13751&postcount=6 or is that outdated info?

I read somewhere i'd need to compile apache with mod_cache, in pre main include add something like

CacheRoot /home/lswscache/
CacheEnable disk /

but how do you only cache guest/visitors without cookie then ?

thanks
 
#16
1. It's possible to cache dynamic pages (for example PHP) at the server level and at each vhost level without using a RewriteRule. This thread is about caching with a RewriteRule, which is a LiteSpeed-specific feature and very useful in practice.

2. It caches the original URI, i.e. the URL in the browser's address bar.
If "Cache Request with Query String"=Yes, then the page "index.php?page=variable" will be cached; otherwise it won't.

The RewriteRule
Code:
RewriteRule /index.php - [E=Cache-Control:max-age=600]
won't cache "index.php/variable", since it's a different URI.
However, if the rewrite rule is
Code:
RewriteRule /index.php(/.*)? - [E=Cache-Control:max-age=600]
then both /index.php and /index.php/variable will be cached, since both match the cache condition.

3. Right, you need to set "Cache Request with Cookie"=Yes.

4. "how do you only cache guest/visitors without cookie"
Code:
RewriteCond %{HTTP_COOKIE} ^$
RewriteRule /index.php - [E=Cache-Control:max-age=600]
 

eva2000

Well-Known Member
#17
4. "how do you only cache guest/visitors without cookie"
Code:
RewriteCond %{HTTP_COOKIE} ^$
RewriteRule /index.php - [E=Cache-Control:max-age=600]
thanks mate good info - wish documentation would explain and outline this stuff :)

So for vbulletin usage, if i compile apache with mod_cache support, add into pre main include

CacheRoot /home/lswscache/
CacheEnable disk /

litespeed would cache all files in all virtualhosts without cookies (the default settings) ? is that the same (for index.php) as specifying in htaccess as below

Code:
RewriteCond %{HTTP_COOKIE} ^$
RewriteRule /index.php - [E=Cache-Control:max-age=600]
?

So index.php without cookies would be cached for guests/visitors without this rewrite anyway? And the rest of the vB PHP files would be cached for guests/visitors even with this rewrite rule, since the default is to cache without cookies?
 
#18
For vBulletin, unfortunately, cookies are set for every guest visitor, so it won't work as you expected. Caching vBulletin for guest users is a bit more complicated, but it's possible under LiteSpeed. In fact, we've tried caching this vBulletin (the LiteSpeed support forum) for guest visitors for a few weeks and it looks to be working well. We can open a new thread to discuss our implementation for caching vBulletin under LiteSpeed.
 

J.T.

Well-Known Member
#19
Thanks for the detailed examples, that's really helpful. Last questions, I think!

1. If we put those rules in htaccess, do we also have to turn caching on in the admin?

2. In the admin > Server > Cache I have Storage Path /home/lswscache - which permissions do you recommend? The VH I'm testing caching with seems to run as lsadm

I tried yesterday, before reading your post, like this:

Code:
RewriteRule /index.php - [E=Cache-Control:max-age=120]
So no conditions, just all-out caching. It didn't seem to work and I think it's to do with URL rewriting.

3. Would a rule for www.domain.org/index.php also cover www.domain.org/ ? Probably not.

4. And if I have a URL like www.domain.org/product1.html which is in fact a rewritten URL for www.domain.org/products/1 do I need to make the caching rewrite URL to match the original, or the rewritten URL?

5. Last question: if we have an index.php page which outputs "Hello NiteWave, welcome to our site" because it recognises you from last time, and LSWS then caches that page with that text, will it show exactly that to other users, with your name? I guess it will, and that's why we need to look for cookie values, for example.

After piecing together the bits of documentation and with your help here, I finally get to grips with this, thank you very much. I also happen to have another vBulletin site on a VPS with Litespeed, but it seems the license for that server doesn't include these caching options.
 
#20
1. Yes. That's where you define your cache policy. Here's an example for a LiteSpeed natively configured vhost:
Cache Request with Query String: Yes
Cache Request with Cookie: Yes
Cache Response with Cookie: Yes
Ignore Request Cache-Control: Yes
Ignore Response Cache-Control: Yes

2. The user running the LiteSpeed process, for example "nobody", creates and fetches cache files in the cache directory. Setting the cache folder owner to nobody and the permissions to 700 should be OK.

3. Tested; the answer is that "/" is covered, on the condition that "index.php" is the vhost's index page. The reverse is also true: if you cache "^/$" only and index.php is the index page, accessing / results in access to /index.php, so caching "/" covers /index.php.

4. The original URL. Note: LSWS only caches dynamic pages (PHP etc.). If product1.html is a pure static HTML file, it won't be cached.

5. Yes, it will.
 