Feature Request: Memcached context

Marcus

Well-Known Member
#1
How about having a context that uses Memcached? Requests could be forwarded to a Memcached cluster, in a similar way to the memcached module in Nginx / Apache.

Apart from caching whole pages, once SSI is fully implemented it would allow pages to be pieced together quickly from parts stored in Memcached as well as on disk, without the need for PHP/Ruby etc.

I'm currently doing this with Nginx, but since LiteSpeed's handling of dynamic content is quicker, I'd rather use just one webserver if possible (and would prefer it to be LiteSpeed). ;)
 

mistwang

LiteSpeed Staff
#2
We will use our internal cache implemented in 4.0. It will be as fast as memcached, especially when used with SSI.

What can Nginx do with memcached? Use it as page cache storage, or as an object/page-parts cache?
 

Marcus

Well-Known Member
#3
With Nginx you can fetch pages, and if you use SSI, objects/parts. You can say that under /defined/uri requests are forwarded to a memcached server or cluster (so you can have more than one server holding the data in case one goes down). There's a failsafe too: if there is no object in the cache, you can fall back to dynamic generation, fetching from disk, proxying, etc. It also allows you to set the memcached key based on various variables, including the URI. It's not used to populate the cache, though; you can only get(), not set() (at this point in time, though there are plans to extend it).

The thinking behind using Memcached was not so much to use it as the basis for your caching engine, which would involve set()s. I was thinking of it purely in a get() scenario, in the same way that some requests might be proxied or sent to an LSAPI backend.
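To make the get()-only pattern concrete, here is roughly what I mean in plain PHP (just a sketch: I'm assuming the pecl memcached client, and render_dynamically() is a made-up stand-in for whatever builds the page on a miss):
Code:
$mc = new Memcached();
$mc->addServer( '127.0.0.1', 11211 );

$key  = $_SERVER['REQUEST_URI'];      // key derived from the request URI
$page = $mc->get( $key );

if ( $page === false )
{
    // Cache miss: fall back to dynamic generation / disk / proxying.
    $page = render_dynamically( $key );
}
echo $page;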

There are times when it might be most appropriate to control the cached content outside of the webserver (or at least have the option to do so). There are many applications that integrate with Memcached, including database functions and obviously scripting engines.

For example, you could have a database trigger that updates a Memcached entry whenever there's an update, putting the relevant cached data directly into Memcached from the database. Depending on how things are defined, you wouldn't then need to call PHP/Ruby etc. to update the cache.
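As a rough illustration of that "push" idea from the application side (again just a sketch: the '/fragment/...' key scheme and publish_fragment() are made up, and I'm assuming the pecl memcached client):
Code:
$mc = new Memcached();
$mc->addServer( '127.0.0.1', 11211 );

// Called whenever the underlying data changes (e.g. right after writing
// to the database), so the read path never needs to run PHP at all.
function publish_fragment( Memcached $mc, $name, $html, $ttl = 300 )
{
    $mc->set( '/fragment/' . $name, $html, $ttl );
}

publish_fragment( $mc, 'latest-news', '<ul><li>Example item</li></ul>' );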

Using memcached (possibly combined with SSI/ESI), it would be possible to serve data cached in memory across many servers, rather than just one, in a portable way, so that the data is the same across all servers. Sometimes it might be important that a particular object on a page appears the same across all webservers, while other parts are fine with a few minutes' delay between caches.
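On the client side that just means pointing every webserver at the same pool, something like this (sketch only: the IPs are placeholders, and I'm assuming the pecl memcached client with consistent hashing turned on):
Code:
$mc = new Memcached();
$mc->setOption( Memcached::OPT_LIBKETAMA_COMPATIBLE, true );  // consistent hashing
$mc->addServers( array(
    array( '10.0.0.1', 11211 ),
    array( '10.0.0.2', 11211 ),
) );

// Every front end that now does $mc->get( '/fragment/latest-news' )
// sees the same single copy, instead of per-server caches drifting apart.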

Also, if you have very large volumes of data (multiple gigabytes, for example) that you want to serve from memory, and you don't want to put a URI-based load-balancing solution in front of the webserver, spreading the cached data across a memcached cluster is another option (which, as I understand it, was basically the motivation behind developing memcached in the first place, though I think it had something like PHP in front of it).

Of course there are many ways to do these things that don't involve memcached, but considering the number of applications that integrate with it, I feel it's worth considering. For some applications it's a useful bit of flexibility.

I'm currently developing a framework that uses both Nginx and LiteSpeed, because Nginx has all the memcached/SSI features as well as very flexible use of variables in the configuration, while LiteSpeed is quicker (in my tests) with PHP. I'd prefer to use just one webserver, though, and am looking forward to SSI being implemented in LiteSpeed.

Here's a link to the memcached module in Nginx: http://wiki.codemongers.com/NginxHttpMemcachedModule
 

mistwang

LiteSpeed Staff
#4
So nginx cannot populate memcached, it can only get data from it?
If it is used as a page cache, it won't be any better than the web server's internal page cache; in fact it should be less efficient, since the web server needs to fetch the page from memcached instead of serving it directly. And memcached needs to be populated by another application.

Memcached cannot store content in persistent storage.

Given the amount of memory that can be installed in one server, I do not see much advantage in using a memcached cluster if one server can handle it all. If the web server can handle the cache internally, it won't need to do the extra (remote) IPC with memcached.

We will add a memcached-like interface to our cache implementation in 4.0, so it can be used as a disk-backed object cache as well.

We also plan to extend our PHP LSAPI implementation by adding ESI-like functionality to PHP. LSWS can cache page parts generated by PHP: when a PHP script wants to use a page part, it just tells LSWS the part ID. If LSWS has it, it serves the part directly; if not, it tells PHP to generate the part again and cache it. The extra IPC between PHP and memcached can be avoided.
 

Marcus

Well-Known Member
#5
That's correct, it can't (yet) populate the memcached.

I definitely agree that a (well designed) internal cache will be quicker than storing things in memcached.

If one server can handle all the cache, again I agree that there is no advantage in using memcached, at least in the case of a 'pull' cache (by which I mean that the caching is controlled by the web server). However, there could be an advantage if one server can't cope, or if you want a 'push' cache (where the cache is only changed when the data changes).

It's easy enough to develop a proxy/load-balancing solution which could use single-server caches on each server, but some people may prefer to do this with memcached.

I like the idea of extending the PHP LSAPI. Do you have any idea of when that might be available for testing? Will it potentially involve many PHP calls for one request, or would one PHP call be used per request (perhaps with data being sent back and forth between the webserver and PHP)? One problem I've been thinking about is that a page may be made up of parts where multiple parts need to be regenerated. Using SSI with <!--#include virtual="..." --> would ordinarily involve many PHP calls per request, which reduces the benefit of using SSI.

Will this API include PHP functions, or just the generation of SSI-like code?
 

mistwang

LiteSpeed Staff
#6
Our SSI implementation is very simple; just like the others, it won't be a full-blown scripting language. We are doing it mainly to close the gap on Apache compatibility.

Using SSI + cache with another scripting language may work well for some applications; like you said, it depends on cache efficiency. The good news is that we will continue to improve our PHP SAPI to reduce the overhead of calling into the PHP engine. The PHP engine does too many useless things right now.

Our PHP ESI will be implemented as PHP functions. PHP just tells the web server the object ID and content for the object; it really only needs to insert a few cache-control commands into the original data flow generated by PHP.
Code:
// Proposed API sketch: $part_id and $ttl stand in for the part ID and TTL.
$ls_esi = ls_esi_begin_object( $part_id, $ttl );
if ( !$ls_esi )
{
    // Cache miss: original PHP code to generate the part goes here.
    // ...

    ls_esi_end_object( $part_id );
}
ls_esi_begin_object() will return true when LSWS serves the object from cache.
We will start work on this after our 4.0 release. Maybe two months later.
 

Marcus

Well-Known Member
#7
That seems interesting.

When will the object be served: at the point when it is called? That would mean all output would need to be printed directly if the object is to be placed in the right position, no? What happens if you are saving all the text of a page into a variable to print/echo at the end, e.g. if you're using some kind of templating system? Is there any way to automatically insert the cached object into a string, e.g.

----------------------
function generate_page ( $code )
{
    global $header, $footer;   // page template pieces, kept global for brevity
    print $header;
    print $code;
    print $footer;
}

// ...

$ls_esi = ls_esi_begin_object( $part_id, $ttl );
if ( !$ls_esi )
{
    // original PHP code to generate the part
    // ...

    ls_esi_end_object( $part_id );
}
// ls_object() is a made-up function that would return the cached part
// as a string instead of sending it straight to the client.
$code .= ls_object( $part_id );

generate_page( $code );
--------------------

Or is it going to be restricted to code that is generated/printed in the order that it's served?


A separate question: is there going to be any way to cache (in compiled form) pages that are made up of SSI (/ESI) code? It would be great if pages didn't need to be re-parsed for SSI commands each time when they haven't changed; something similar to the compiled-code caches (like eAccelerator) for PHP and other languages, and I'm thinking very much along the lines of the Cache Meta Language module (more recently replaced by power magnet) for Lighttpd.
 

mistwang

LiteSpeed Staff
#8
The idea of ESI is not to send the cached object back to PHP but to include it in the response body directly, so it requires the PHP code to generate the content sequentially and flush the output buffer before requesting the cached object.
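As a sketch of that flow (using the proposed ls_esi_* functions from above, which are not released yet; the flush placement is the point, and render_part() just stands for the real generation code):
Code:
print $header;
flush();                              // send what has been generated so far

if ( !ls_esi_begin_object( $part_id, $ttl ) )
{
    render_part( $part_id );          // regenerate only on a cache miss
    ls_esi_end_object( $part_id );
}

print $footer;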

In our SSI implementation, SSI will be compiled into an execution unit and cached by the server.
 

Marcus

Well-Known Member
#9
The idea of ESI is not to send the cached object back to PHP but to include it in the response body directly, so it requires the PHP code to generate the content sequentially and flush the output buffer before requesting the cached object.
That's what I guessed - and makes the most sense.

In our SSI implementation, SSI will be compiled into an execution unit and cached by the server.
Great. Knowing your high standards, I guessed it would. I'm looking forward to testing it out.

Will it be a purely in-memory execution unit, or will it be possible to cache a pre-parsed SSI script on disk too?
 

muiruri

Well-Known Member
#12
Hi George,

When we upgrade our LSWS from 3.x to the new 4.x on our cPanel servers shortly, is there any advantage in having both memcached enabled (or compiled into Apache with EasyApache) plus your new internal cache feature built into LSWS 4.x? For example, is it correct to assume we'll get double the performance, especially for database-read-intensive websites?

Regards...
 

mistwang

LiteSpeed Staff
#13
The 4.x cache is an HTML page cache for dynamically generated pages. Some pages are not cacheable, such as pages with a Cache-Control header.
Memcached is more flexible; it can be used to cache DB query results, or the page itself.
If you use memcached to cache the same content, then you only need one cache.
 