There is often a need when building a website to automate background processes, for example:
- Sending emails: many sites don’t connect to an SMTP server the moment a user triggers an email. Instead they store the email in the database, where a background process picks it up later, thus decoupling the frontend from the SMTP server. If the SMTP server goes down, your users can still register.
- Offline processing: let’s say your site has a social element, and users have real-time news feeds that get updated whenever their friends do things on your site. By updating these feeds offline, a user posting a simple comment doesn’t have to wait until the feeds of all 1,000,000 of their friends have been updated before the request completes.
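The email case above could be sketched roughly as follows. This is only an illustration: the `email_queue` table, its columns and the function names are all hypothetical, the SQL assumes SQLite via PDO, and the `$send` callable stands in for whatever mail library your application really uses.

```php
<?php
// Hypothetical schema: table and column names are illustrative only (SQLite syntax).
function createQueueTable(PDO $db): void
{
    $db->exec(
        'CREATE TABLE IF NOT EXISTS email_queue (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            recipient TEXT NOT NULL,
            subject TEXT NOT NULL,
            body TEXT NOT NULL,
            sent_at TEXT DEFAULT NULL
        )'
    );
}

// Called during the HTTP request: a quick INSERT, no SMTP traffic at all.
function queueEmail(PDO $db, string $recipient, string $subject, string $body): void
{
    $stmt = $db->prepare(
        'INSERT INTO email_queue (recipient, subject, body) VALUES (?, ?, ?)'
    );
    $stmt->execute([$recipient, $subject, $body]);
}

// Called by the background process. $send delivers one message and
// returns true on success; failed messages stay queued for the next run.
function processEmailQueue(PDO $db, callable $send): int
{
    $pending = $db->query(
        'SELECT id, recipient, subject, body FROM email_queue
         WHERE sent_at IS NULL ORDER BY id'
    )->fetchAll(PDO::FETCH_ASSOC);

    $markSent = $db->prepare('UPDATE email_queue SET sent_at = ? WHERE id = ?');
    $sentCount = 0;
    foreach ($pending as $row) {
        if ($send($row['recipient'], $row['subject'], $row['body'])) {
            $markSent->execute([date('c'), $row['id']]);
            $sentCount++;
        }
    }
    return $sentCount;
}
```

The registration page only ever calls `queueEmail()`, so it keeps working while the SMTP server is down; anything that could not be sent simply waits in the table until a later run of `processEmailQueue()`. The sketch assumes the PDO connection is set to throw exceptions on errors.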
There are plenty more examples. In fact, as a rule of thumb, anything that doesn’t have to be done immediately during an HTTP request is best done later in the background. This also helps manage peak traffic: when your site is overloaded, just don’t run the background tasks until the peak has passed. OK, so news feeds aren’t bang up-to-date for a short period, but at least the site works. So how does one go about automating all these background tasks?
CRON is probably the first port of call: implement your background task in PHP, which lets you use the database abstraction layer and all the other libraries in your application, and set CRON to run the script every minute.
Great, everything works perfectly. That is, until the data in your database grows so big that the tasks start taking longer than one minute and you end up with lots of tasks all running at the same time, potentially all processing the same data! Oops, the boss just got his registration email ten times.
So you could implement some sort of locking mechanism to prevent multiple instances, but it’s all starting to get a bit fragile, with lots of interdependencies, when all you want to do is run a simple script!
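For what it’s worth, a minimal sketch of such a lock might look like the following. It assumes the script can write to a lock file somewhere on disk; the function names are made up for illustration, and it relies on PHP’s built-in `flock()` with the `LOCK_NB` flag so that a second instance gives up immediately instead of queueing behind the first.

```php
<?php
/**
 * Try to take an exclusive, non-blocking lock on a lock file.
 * Returns the open file handle on success, or null if another
 * instance already holds the lock (or the file cannot be opened).
 */
function acquireLock(string $lockFile)
{
    // 'c' opens for writing, creating the file if needed, without truncating it.
    $handle = fopen($lockFile, 'c');
    if ($handle === false) {
        return null;
    }
    // LOCK_NB makes flock() return straight away rather than wait for the lock.
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null;
    }
    return $handle;
}

/** Release the lock and close the lock file. */
function releaseLock($handle): void
{
    flock($handle, LOCK_UN);
    fclose($handle);
}
```

The cron script would call `acquireLock()` first and exit quietly if it returns null. The operating system drops the lock when the process ends, so a crashed run does not leave a stale lock behind, but this is still one more moving part to look after.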
Write a daemon in PHP
So it’s back to the drawing board. What is needed is a daemon that will sit in the background and run a script continually, restarting it immediately after it finishes. You could write the daemon in PHP, which would let it interface well with your existing PHP scripts, and there are even some good PHP packages that make writing a daemon in PHP easy.
So, full of enthusiasm, you launch into writing your own daemon in PHP. A week or so later, after reading all about process control (and probably pulling large chunks of hair out in the process as you realise it wasn’t as simple as you thought), you finally have something that works. You deploy it to your website, sit back and feel smug.
After a week your boss rages into your office demanding to know why nobody has registered in the past week. Red-faced, you go to check that your PHP daemon, the one that sends emails from the email queue, is still running. It isn’t. If you haven’t already been fired, your next step would probably be to find out what happened. Looking through your PHP error log you find a “FATAL ERROR” from your daemon script, dated exactly one week ago.
PHP can throw fatal errors for all sorts of reasons, and once one has been thrown the script cannot recover from it. Also, PHP is renowned for its rather flaky memory management. Whilst this is fine for web scripts that render a page and then end, it makes PHP highly unsuitable for writing long-running applications such as daemons.
Enter The Fat Controller
How can we solve the problem? We need to run PHP scripts as a daemon. I faced this problem one year ago and I solved it by writing The Fat Controller. It is a program written in C that runs as a daemon and can continually run PHP scripts. As it is written in C, it is highly stable and can run for months or years without problem. As the daemon runs separately from the PHP scripts, no matter what happens in a PHP script, it does not affect The Fat Controller.
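To make the idea of separation concrete, here is a rough sketch of a supervising loop. It is not The Fat Controller’s code (which is written in C), and it is written in PHP only so that it reads easily alongside the rest of this post, which means it inherits every PHP caveat mentioned above. The function name and parameters are hypothetical. The point is simply that each run happens in its own child process, so a fatal error there ends that run and nothing else.

```php
<?php
/**
 * Run $command over and over, each time in a fresh child process.
 * $maxRuns = null means "keep going forever"; a number is handy for testing.
 * Returns how many runs ended with a non-zero exit code.
 */
function superviseCommand(string $command, ?int $maxRuns = null, int $pauseSeconds = 0): int
{
    $failures = 0;
    for ($run = 1; $maxRuns === null || $run <= $maxRuns; $run++) {
        $output = [];
        $exitCode = 0;
        // exec() waits for the child to finish and reports its exit code.
        exec($command, $output, $exitCode);
        if ($exitCode !== 0) {
            // The child died, but this loop is unaffected and carries on.
            $failures++;
            error_log("Run $run of '$command' exited with code $exitCode, restarting");
        }
        if ($pauseSeconds > 0) {
            sleep($pauseSeconds);
        }
    }
    return $failures;
}
```

A call such as `superviseCommand('php send_emails.php')`, where `send_emails.php` is a made-up script name, would keep restarting the worker no matter how it exits. A real daemon also has to detach from the terminal, handle signals and survive for months, which is exactly the part that is better left to a dedicated program.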
The Fat Controller is also very flexible, easy to install and configure, and supports running multiple instances of a PHP script at the same time to achieve parallel processing. You can even configure it to dynamically control the number of parallel processes depending on how much work there is.
You can read more about The Fat Controller here: http://www.4pmp.com/fatcontroller/