Posts Tagged ‘cache_lite’

PEAR Cache_Lite – efficient group cleaning

September 27th, 2009 Nick No comments

After using PEAR Cache_Lite for a while, we began to notice that as traffic increased the web servers spent more and more time thrashing their discs. On closer inspection, we found that the servers were almost constantly scanning the entire cache directory structure.

Whenever you call Cache_Lite::clean() to remove a group of cached elements, it parses the entire cache directory structure looking for cache files which have the correct group hash in the filename. This was problematic for us because we stored a lot of data in groups, for example with messages – each page of messages for each user was stored in one group. Whenever someone sends a message, the system then deletes the cache group containing the recipient’s messages. As the cache directory structure increased in size, it took longer and longer to parse, and with increasing traffic the web servers were soon doing nothing but parsing the cache directory.

The solution I came up with was to prepend a number to the name of each group, with that number itself stored in the cache. So when a request arrives for a cached item in group “messages”, the cache system looks up the cached group identifier number and prepends it to the group name, resulting in an internal group name like “1234_messages”.

The overhead is an extra cache “get”, but the advantage is that to expire a whole group you just have to increment the identifier number by one (get, increment, save). When the group is next accessed, the internal group name becomes “1235_messages”, under which nothing has been cached yet, and so the application regenerates the cache.
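The idea can be sketched as a small wrapper. This is only an illustration, not the code we ran: `ArrayCache` is a hypothetical in-memory stand-in for the real cache backend, and `VersionedGroupCache`, `internalGroup()`, `expireGroup()` and the `group_versions` group name are all made up for the example. Starting the identifier at 1 is also an assumption.

```php
<?php
// Hypothetical stand-in for the cache backend: an in-memory store keyed by
// (id, group), returning false on a miss.
class ArrayCache
{
    private $store = array();

    public function get($id, $group = 'default')
    {
        $key = $group . '|' . $id;
        return isset($this->store[$key]) ? $this->store[$key] : false;
    }

    public function save($data, $id, $group = 'default')
    {
        $this->store[$group . '|' . $id] = $data;
        return true;
    }
}

// Hypothetical wrapper that prefixes every group name with a cached number.
class VersionedGroupCache
{
    private $cache;

    public function __construct($cache)
    {
        $this->cache = $cache;
    }

    // The extra "get": look up the current identifier number for a group.
    private function version($group)
    {
        $version = $this->cache->get($group, 'group_versions');
        return $version === false ? 1 : (int) $version;
    }

    // Build the internal group name, e.g. "1_messages".
    public function internalGroup($group)
    {
        return $this->version($group) . '_' . $group;
    }

    public function get($id, $group)
    {
        return $this->cache->get($id, $this->internalGroup($group));
    }

    public function save($data, $id, $group)
    {
        return $this->cache->save($data, $id, $this->internalGroup($group));
    }

    // Expire a whole group: get, increment, save. Nothing is scanned or deleted.
    public function expireGroup($group)
    {
        return $this->cache->save($this->version($group) + 1, $group, 'group_versions');
    }
}
```

After `expireGroup('messages')`, lookups go to the next internal group name, so every item saved under the previous name simply stops being found. The old entries are not deleted by this call.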

In my opinion this additional “get” is a price worth paying, especially as it’s a very quick operation, and expiring a group this way is many times faster than scanning the whole cache directory.

Finally

You might be thinking to yourself, “what about all those expired cache files just left on the disc?” Well, we set a cron job to run every day and delete all files older than three days. As none of the caches lasted longer than three days, this was a safe duration.
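A clean-up script along those lines might look like the sketch below. It is not the job we actually ran; the function name `purgeOldCacheFiles()` and its parameters are invented for the example, and it assumes that "older" means the file's modification time.

```php
<?php
// Delete regular files under $dir whose modification time is more than
// $maxAgeSeconds in the past. Returns the number of files removed.
function purgeOldCacheFiles($dir, $maxAgeSeconds, $now = null)
{
    $now = ($now === null) ? time() : $now;

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );

    // Collect the paths first so nothing is deleted while iterating.
    $expired = array();
    foreach ($iterator as $file) {
        if ($file->isFile() && $file->getMTime() < $now - $maxAgeSeconds) {
            $expired[] = $file->getPathname();
        }
    }

    $removed = 0;
    foreach ($expired as $path) {
        if (@unlink($path)) {
            $removed++;
        }
    }

    return $removed;
}
```

Called once a day with a `$maxAgeSeconds` of `3 * 86400`, this would remove anything untouched for three days, which is only safe as long as no cache lifetime exceeds that.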

In fact, we don’t use disc caching anywhere near as much as we did. We now use Memcached for most things, but for small and often-used caches (such as IDs) we still use the disc cache as it’s by far the fastest.

PEAR Cache_Lite – preventing stampeding

September 3rd, 2009 Nick 3 comments

The PEAR Cache_Lite package is an excellent caching system, lightweight and fast. However, when we put it into use on a high-traffic website, a few issues came to light. The first problem we hit was stampeding.

What’s stampeding?

Stampeding is the situation when a request, let’s say from User1, arrives for a cached item that has expired. The cache system returns boolean false and the process of rebuilding that cached data begins, calling the database, formatting the data and so on.

If, during this process of rebuilding the cached data, another request arrives for the same cached item, let’s say from User2, another process of rebuilding the cached data begins. This is because the process started by User1 has not yet finished, and so the cache system still returns boolean false when asked for the cached item.

So now we have two processes running, regenerating the same cache item. The situation can get out of hand if more and more requests for the same cache item arrive – causing the load on the web server or database to increase, and everything to potentially grind to a halt.

The solution

What’s required is for the cache system to know that a particular cache is being regenerated and therefore return the old cache until the new data has been regenerated. Thankfully this can be achieved very simply with the addition of just one extra line of code into the Cache_Lite class, Lite.php.

The trick is to touch() the cache file immediately after realising it has expired in the Cache_Lite::get() function. After touching the file, the get function will return false and the calling code will regenerate the cache data.

@touch($this->_file);

By touching the file, the modification time of the cache file is set to the current time and therefore all subsequent requests will think the cache is still valid and return the old data. Once the first process has regenerated the data, it saves it and the cache file once again contains up-to-date data.
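To see the effect in isolation, here is a simplified stand-alone version of the idea. It is not the Cache_Lite source: `cacheGet()` is a made-up function, and it assumes a cache file counts as valid while its modification time is newer than the current time minus the lifetime.

```php
<?php
// Simplified file-cache read, assuming a file is fresh while its
// modification time is newer than (now - lifetime).
function cacheGet($file, $lifeTime)
{
    clearstatcache(); // make sure we see the current mtime, not a cached one

    if (!file_exists($file)) {
        return false; // never cached: the caller must build the data
    }

    if (filemtime($file) > time() - $lifeTime) {
        return file_get_contents($file); // still fresh (or recently touched)
    }

    // Expired: bump the mtime so later requests treat the old copy as valid,
    // then report a miss so that this caller regenerates the data.
    @touch($file);
    return false;
}
```

With an expired file, the first call returns false and every call after it returns the old contents until the file is rewritten. The check and the touch are not atomic, so two requests arriving at almost exactly the same moment could both see the miss; the touch narrows that window rather than closing it.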

Some finishing touches

By touching the cache, processes immediately following the one which is regenerating the fresh data will return out-of-date data, albeit by a matter of seconds – which in most cases really won’t matter nor be noticed.

If, however, something were to happen to the process regenerating the data (an uncaught exception, a database timeout, etc.) so that it fails and does not save the cache, then the old cache will remain valid until it expires again, so it will effectively have been valid for twice its intended lifetime.

We can limit this by setting the modification time in the touch command to be the current time, minus the cache lifetime, plus 60 seconds. If the regenerating process were then to fail, the old cache would only be valid for another 60 seconds.

@touch($this->_file, time() - abs($this->_lifeTime) + 60);
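The arithmetic can be checked with a few lines. This sketch again assumes that a file is valid while its modification time is newer than the current time minus the lifetime; `secondsStillFresh()` and the example numbers are invented for the illustration.

```php
<?php
// Seconds for which a file with modification time $mtime is still treated
// as fresh, assuming the rule "fresh while mtime > now - lifetime".
function secondsStillFresh($mtime, $lifeTime, $now)
{
    return max(0, $mtime + $lifeTime - $now);
}

$now = 1000000;   // arbitrary example timestamp
$lifeTime = 3600; // example lifetime of one hour

// Plain touch(): mtime becomes $now, so the old copy lives a full lifetime.
$plain = secondsStillFresh($now, $lifeTime, $now);

// touch() with the adjusted timestamp: now - lifetime + 60.
$adjusted = secondsStillFresh($now - abs($lifeTime) + 60, $lifeTime, $now);

echo "plain touch: {$plain}s, adjusted touch: {$adjusted}s\n";
```

With a one-hour lifetime the plain touch keeps the old data for another 3600 seconds if regeneration fails, while the adjusted touch keeps it for 60. The 60 seconds needs to be comfortably longer than the time it takes to regenerate the data, otherwise requests will start regenerating again before the first one has finished.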