The Blowout Preventer – increasing HTTP performance by adaptive concurrency control

What is it?

The Blowout Preventer is an HTTP proxy that sits in front of a website’s load-balancer and:

  1. Ensures the website can never be overloaded by requests
  2. Guarantees all users a maximum response time
  3. Dynamically controls concurrency for optimum throughput

The Blowout Preventer can work with any web cluster – Apache, Nginx, IIS, etc.

What does it do?

The Blowout Preventer is a proxy which accepts all incoming requests to a website, forwards those requests on to the website and relays the responses back to users. It dynamically controls the number of requests being concurrently processed by the website to achieve maximum request throughput. Just as ABS brakes on your car use feedback to limit the braking force, achieving maximum braking effect while preventing instability, the Blowout Preventer monitors the request completion rate to determine the optimum concurrency.

When limiting concurrency, arriving requests are put into a queue and forwarded on to the website once the concurrency drops below the limit. However, before a request is placed into the queue, the Blowout Preventer calculates how long that request would have to wait in the queue; if this is longer than a set maximum, a static response is immediately sent saying “Sorry, we’re busy, please try again in a few minutes.” This is much better than letting users wait and wait, or even worse, making them hit “refresh” and send another request.
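As a minimal sketch of that admission check (my illustration – the post doesn’t give the actual formula, so the names and the estimate below are assumptions):

// Decide whether to queue a request or reject it immediately. With
// 'limit' requests in flight and an average service time of
// avgServiceSeconds, roughly limit / avgServiceSeconds requests finish
// per second, and a new arrival waits behind queueLength of them.
boolean admit(int queueLength, int limit, double avgServiceSeconds, double maxWaitSeconds) {
    double estimatedWait = queueLength * avgServiceSeconds / limit;
    return estimatedWait <= maxWaitSeconds; // false means send the "we're busy" page
}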

As concurrency is limited to the optimum level, a website can never be knocked offline by being tipped into instability by a flood of requests – hence the name!

Who is it for?

We’ve all been there before: you build a website, users arrive, and it gets slower and slower as more people use it, until it becomes unusable and might even crash. So what do you do? Bigger hardware? Optimise the code? Usually a combination of both is required. Then you notice the traffic spikes – the ones which might happen once a month, or very occasionally when someone posts a link to your site on Reddit. Do you really want to invest in hardware and software for an event which might happen for just a few minutes, once in a blue moon? Do you really want to take the risk that your website could crash if your latest marketing campaign is more successful than you thought? What if you simply cannot scale the site fast enough to cope with the growing traffic? Even if you build a website which can scale to handle truly massive concurrency, you might well achieve better response times by limiting concurrency, as we’ll see later.

Preventing Blowouts

The root causes vary from site to site, but ultimately it boils down to the number of requests being processed at the same time – concurrency. As concurrency increases, the time needed to process each request remains constant up to a point, after which it begins to increase, ultimately leading to a decrease in the overall rate at which requests are processed. Imagine filling a bucket with a hole in the bottom. Provided the rate at which water pours out of the bottom is greater than or equal to the rate at which water is poured in, the level of water inside the bucket will not rise. The same is true for websites: provided requests can be served at a faster rate than they arrive, concurrency will not increase. However, if requests cannot be served as quickly as they arrive, then concurrency will increase, just like the level of water in the bucket. The increased concurrency exacerbates the problem by increasing the time needed to process each request, further reducing the rate at which the server can process requests. Ultimately this positive feedback leads to system instability and potential crashes – a blowout. This is what happens when your site gets swamped by users.
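Restating the bucket analogy in queueing terms (my formulation, not from the original post): if requests arrive at rate \(\lambda\) and the site completes them at rate \(\mu(c)\) when \(c\) requests are in flight, then the concurrency evolves as

    \frac{dc}{dt} = \lambda - \mu(c)

The system is stable only while \(\lambda \le \mu(c)\). Once arrivals outpace completions, \(c\) grows, and because \(\mu(c)\) itself falls at high concurrency, the deficit widens and \(c\) runs away – the blowout.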

Clearly there is a need to prevent concurrency exceeding this critical maximum. Even if your site has plenty of headroom, a sudden influx of requests could knock you offline instantly. Nor need it be a particularly large volume of users: imagine that a relatively small number of users each happen to initiate a highly intensive operation, such as a large database update. This would effectively render the database offline; other users would then wait for it to become free, requests would quickly clog the system, and it could be tipped into instability.

As the Blowout Preventer constantly monitors request throughput and concurrency, it can detect such changes immediately and take action automatically before problems occur.

This is the critical difference between the Blowout Preventer and passive monitoring solutions.

Increasing efficiency

Concurrency limiting not only prevents blowouts, but also ensures your site is operating at its maximum efficiency at any moment, based on the current demands of the users. I’ll try and illustrate this with an example. Below is a graph showing how the throughput (requests per second) of one particular page of a website varies with increasing concurrency:

[Graph: request completion rate against concurrency]

Apart from a couple of anomalies at concurrencies of 24 and 30, we can see that the throughput rises linearly with concurrency up until a concurrency of about 8, after which the throughput tends to level off. So we can assume that this particular page of this website scales well and we can quite happily allow the concurrency to go right up to 30 and maybe even beyond – right? Well, let’s see what happens now if we superimpose the average request times on the graph. The request time is the time taken for each request, i.e. the time users have to wait for the page.

[Graph: request rate and duration against concurrency]

Again ignoring those two anomalies, we can see that up to a concurrency of 5, the request times remain constant with increasing concurrency. After this point, request times increase linearly with concurrency. You’ll notice that there is a point after which request times increase with increasing concurrency yet there is no noticeable increase in throughput. Surely it would be better to limit the concurrency before this happens? Absolutely it would, and that’s exactly what the Blowout Preventer does.
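One way to see why this must happen (my gloss, via Little’s law, which the post doesn’t mention by name): with \(N\) requests in flight, throughput \(X\) and average request time \(R\) are tied together by

    N = X \cdot R \quad\Longrightarrow\quad R = \frac{N}{X}

While \(X\) rises in proportion to \(N\) (the linear region), \(R\) stays flat; once \(X\) plateaus, \(R\) has no choice but to grow linearly with \(N\) – exactly the shape of the second graph.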

Adaptive concurrency control

No doubt you will have noticed that the above examples are very simplistic. Clearly the optimum concurrency for requests purely for static content will be much greater than for requests instigating complex database write operations. So a one-size-fits-all approach to concurrency limiting isn’t going to work: we’d either have to set the limit too low and risk needlessly incapacitating our site some of the time, or set it too high and risk being pushed into instability should all the requests happen to want to do something terribly complicated. This is where the Blowout Preventer’s adaptive concurrency control algorithm comes in. If we model a website as a black box which processes HTTP requests, then for any given mix of requests there will be an optimum concurrency: if most requests are for static content it will be high; conversely, if most of the requests initiate complicated calculations it will be low. As the Blowout Preventer is a proxy in front of this black box, it keeps statistics on request times, concurrency and throughput, and constantly adjusts the concurrency limit to what it calculates to be the optimum.
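The post doesn’t publish the actual algorithm, so the following is only a minimal hill-climbing sketch of the idea in Java (all names and the step logic are my own illustration): each sampling window, nudge the limit and keep the direction that improves measured throughput.

// Sketch of an adaptive concurrency controller: probe up or down each
// sampling window, reversing direction whenever throughput worsens.
class AdaptiveLimiter {
    private int limit = 8;        // current concurrency limit
    private int step = 1;         // +1 = probing upwards, -1 = downwards
    private double lastThroughput = 0.0;

    // Called once per sampling window with the measured completions/second.
    int onSample(double throughput) {
        if (throughput < lastThroughput) {
            step = -step;         // the last move hurt: reverse direction
        }
        lastThroughput = throughput;
        limit = Math.max(1, limit + step);
        return limit;
    }
}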

Project status

I started this project about three years ago and spent the first two years working on the adaptive concurrency algorithm. Using mathematical models built in Octave, I was able to simulate almost real-life conditions to refine the algorithm. This year I began building the proxy and now I am almost ready for the first tests. The proxy itself is written in C and uses an event-driven architecture, so handling massive numbers of concurrent TCP connections is no problem. As well as this, it is multithreaded with one thread for each CPU core to ensure that the Blowout Preventer cannot become a bottleneck itself!

Conclusion

This article has described what the Blowout Preventer is: an HTTP proxy which sits in front of a website, dynamically limiting concurrency so that the system never becomes unstable, operates at optimum efficiency, and every user gets a response within a specified maximum waiting time.

By writing this article, I hope to gauge the level of interest in the project, to find anyone willing to test an early version, and to hear any comments on the idea of adaptive concurrency limiting.

You can keep up-to-date with news and progress on Twitter:
twitter.com/BlowoutP


Fat Controller v0.0.5 released!

After a hiatus of almost two years, here’s the latest instalment of the Fat Controller. No fancy new features, I’m afraid – just the odd bug fix and a tidy-up of some of the internals. All being well, this will make it to the v1.0.0 release.

Download: sourceforge.net/projects/fat-controller/files/

Project web: fat-controller.sourceforge.net

What’s next?
After the 1.0.0 release, I am thinking of creating a monitoring application with a GUI, so you can remotely communicate with and monitor a running instance. Another idea is to allow one instance of the Fat Controller to handle multiple configurations, avoiding the need to run several instances simultaneously.

I’m pretty short of spare time, so it depends on interest in the program. As always, any new ideas, comments and suggestions for future versions are all welcome!


Page onload Ajax request in Wicket

A new requirement arrived this week to display a load of statistics on the main page of the application. Simple enough, except that collecting these statistics from the database took an unacceptably long time. As this was the main page, every user had to wait for the statistics to load after logging in before they could navigate to the page they wanted. Unfortunately simple caching wasn’t an option, as the data must be up-to-date, and more elaborate caching techniques were off the cards due to time restrictions on the implementation effort.

The solution I came up with was to render the page without the statistics but instead showing a message saying “Statistics loading, please wait…”. Once the page loaded, an Ajax request was fired which generated the statistics and replaced the message. Now the page loaded immediately and users did not have to wait for the statistics to load before navigating to another page.

However, implementing this in Wicket was not so straightforward, so I thought I’d write up my solution.

The first step was to create an implementation of AjaxEventBehavior which created a new instance of the statistics panel to replace the “Loading…” message component. So far all pretty basic.

AjaxEventBehavior loader = new AjaxEventBehavior("onload") {
	@Override
	protected void onEvent(AjaxRequestTarget target) { /* swap in the statistics panel */ }
};

Getting this Ajax event to fire on page load was a bit trickier. The normal way is to attach a Javascript event handler to the <body> tag, or specify its onload attribute. Luckily, there is a way to do this in Wicket. All the pages in the application extend from a base class which itself extends Page. This base class provides standard layout so that all the pages in the application look similar, share the same HTML headers and so on. The associated HTML of this base class defines the <body> tag, so in order to be able to access it in Wicket I added a wicket:id, something like this:

<body wicket:id="body">

Great, so I can access the body tag – but now I’ve broken the hierarchy: all the components now need to be added to the body tag, not the page. Rather than rework the entire page class, it is possible to make the body element transparent in the hierarchy:

body = new WebMarkupContainer("body") {
    @Override
    public boolean isTransparentResolver() {
        return true;
    }
};

add(body);

All that remained was to create a protected getter method for the body element, so that the Ajax behaviour could be added to it from the main page.
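Putting it all together, the main page ends up looking something like this (a sketch assuming Wicket 1.4-era APIs; the stats component, StatisticsPanel and getBody() names are illustrative, based on the description above):

// In the main page's constructor: show a placeholder where the
// statistics will eventually go.
final Label stats = new Label("stats", "Statistics loading, please wait...");
stats.setOutputMarkupId(true); // so the Ajax response can locate it in the DOM
add(stats);

// The onload behaviour, added via the base class's body container:
getBody().add(new AjaxEventBehavior("onload") {
    @Override
    protected void onEvent(AjaxRequestTarget target) {
        // Build the expensive panel only now, after the page has rendered
        StatisticsPanel panel = new StatisticsPanel("stats");
        panel.setOutputMarkupId(true);
        stats.replaceWith(panel);
        target.addComponent(panel);
    }
});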


Service initialisation with automatic dependency based ordering

Situation

The Application Controller of our application initialises all of its services when it itself is initialised. Unsurprisingly, some services must be started before others; for example the Logging service must be started before the Configuration service, which must be started before the Persistence service, which must be started before the Authentication service and so on.

Currently, all this initialisation is done in one long initialiseServices() method, initialising each service in the correct order. If we create a new service, we just slot it into this method in the right place. Provided this order is not changed, everything is fine.

The problem

So what’s the problem? Well, apart from being a bit ugly, there is a practical problem that has arisen. Our client would like to be able to deploy parts of the application separately, which means we need to split it up into a core which rarely changes and separate modules which can be deployed independently. If one of these modules requires a new service, or one of its services becomes dependent on another, then the Application Controller will also need to be deployed: the client must deploy not only the new module but a whole new core, and must wait until both the module and core builds become stable.

The solution

What would be ideal is if each module could tell the Application Controller which services it needs when the application starts. The Application Controller would then determine the order in which to initialise the services based on their dependencies. Cyclic dependencies would be detected upon startup.

A prototype

In order to demonstrate this, I’ve made a little mock-up. This is probably best understood by looking at the code, but I’ll try and explain it as best I can in words.

The idea is that all services must extend an abstract base class, Service. The abstract Service class provides a public method, addDependency(Service service), which is used to specify the other services on which it depends. The sorting algorithm is not interested in the services on which each service depends, but rather in the services which depend on each service. This is because the graph of service dependencies is traversed depth-first, starting from the services which depend on nothing. Therefore, the addDependency(Service service) method actually calls a package-private addDependent(Service service) method on the passed service object.

Services are then added to a Service Manager. The services are initialised by calling initialiseServices() on the Service Manager, which creates an instance of ServiceReactor to do the sorting, returning an ordered list of services to be initialised sequentially.

The ServiceReactor uses topological sorting to arrange the services based on the graph of dependencies.
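To make this concrete, here is a minimal sketch of the two classes (my illustration of the mechanism described above, not the actual prototype code; the post says the real traversal is depth-first, whereas for brevity this sketch uses Kahn’s breadth-first algorithm, which yields an equally valid ordering):

import java.util.*;

abstract class Service {
    private final List<Service> dependents = new ArrayList<Service>();
    int dependencyCount = 0; // number of services this one depends on

    // Declare that this service depends on another.
    public void addDependency(Service service) {
        service.addDependent(this); // register the reverse edge
        dependencyCount++;
    }

    // Package-private, as described above.
    void addDependent(Service service) {
        dependents.add(service);
    }

    List<Service> getDependents() {
        return dependents;
    }
}

class ServiceReactor {
    // Returns the services in a valid initialisation order, or throws
    // if the dependency graph contains a cycle.
    static List<Service> sort(Collection<Service> services) {
        Map<Service, Integer> remaining = new HashMap<Service, Integer>();
        Queue<Service> ready = new LinkedList<Service>();
        for (Service s : services) {
            remaining.put(s, s.dependencyCount);
            if (s.dependencyCount == 0) ready.add(s); // depends on nothing
        }
        List<Service> ordered = new ArrayList<Service>();
        while (!ready.isEmpty()) {
            Service s = ready.remove();
            ordered.add(s);
            for (Service d : s.getDependents()) {
                int left = remaining.get(d) - 1;
                remaining.put(d, left);
                if (left == 0) ready.add(d); // all its dependencies are ordered
            }
        }
        if (ordered.size() != services.size())
            throw new IllegalStateException("Cyclic dependency detected"); // cf. DirectedCycleException
        return ordered;
    }
}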

The example I’ve provided sets up the simple graph as described here:
http://en.wikipedia.org/wiki/Topological_sorting

Note that I didn’t want to make a separate Service class for each node, so I made an IdentifiableService which takes a unique integer in its constructor to identify it as a separate service. Of course, in reality each service would be defined by its own, separate implementation of the base Service class.

Feel free to download it and have a play with it. For example, try creating a circular dependency and you should get a DirectedCycleException. I know, it really is that much fun!

Download example:
ServiceReactor


The Fat Controller v0.0.4 released!

I’m sorry it wasn’t in time for Christmas – I hope everyone managed to have at least a little bit of fun on Christmas Day without it – but finally it’s here: version 0.0.4! This version involved rewriting the entire logging system for sub-processes, so plenty of testing was needed, which (oddly enough) wasn’t always compatible with Christmas festivities.

Download: sourceforge.net/projects/fat-controller/files/

Project web: fat-controller.sourceforge.net

Continuous logging

The main aim of this release was to fully support the “daemonise anything” part of the Fat Controller’s raison d’être. Prior to this release, output from sub-processes was collected and then written to the log file only once the process had ended. This was bad for two reasons:

  • only output on STDOUT was logged; anything on STDERR was ignored
  • it was no good for long-running daemon processes, as nothing was logged until the process ended

The Fat Controller now continually monitors STDOUT and STDERR of all sub-processes and immediately writes anything to the log file.

The whole logging system can be re-initialised by sending SIGHUP to The Fat Controller. So if, for example, your log files get deleted due to log rotation, simply send SIGHUP and The Fat Controller will re-open file descriptors to the log files. This can be easily added to your log rotation mechanism.

So, what else is new in v0.0.4? Here’s a brief summary taken from the changelog:

ADDED --run-once
Also to support daemon processes, using the --run-once argument it is possible to tell The Fat Controller to (cunningly) run a process only once and then end.

ADDED --test-fire
If this argument is specified then The Fat Controller initialises but does not actually run, i.e. it does not daemonise (if it is configured to run in daemonise mode), nor does it run any processes. It is useful in combination with the --debug option when testing, to check that arguments have been correctly read and interpreted.

CHANGED Debug mode
Debug mode is turned on by using the --debug argument when running The Fat Controller. Previously this was enabled by the init script /etc/init.d/fatcontrollerd if the file fatcontroller.debug was found in the current directory; in addition to turning on debug mode, this also made it look for the configuration file in the current directory rather than in /etc/.

This has now been substantially simplified. To enable debug mode, instead of starting the Fat Controller with:

sudo /etc/init.d/fatcontrollerd start

Use:

sudo /etc/init.d/fatcontrollerd debug

My plan is that if no major bugs are found in this release, I will re-release it as v1.0.0 – it will finally be everything that I imagined when I first started this project over a year ago.

I’ve still got plenty more ideas for development and I’m eager to hear any other ideas people may have. Please let me know if you have a great idea or suggestion!


The Fat Controller v0.0.4 almost ready – testers wanted!

I’ve just about finished the next version of The Fat Controller, v0.0.4. I’ve completely refactored the way it writes output from sub-processes so it needs some careful testing. It would be great if other people could give it a try and do some testing as well, if you’re interested then just leave a comment and I’ll send you the source. Perhaps version 0.0.4 could be ready in time for Christmas – and what better Christmas present than a new version of The Fat Controller?!

Here are the main new features:

Continuous logging
One of the shortcomings of previous versions was that output from sub-processes was only written to the log file once the process had ended. This is of little importance for scripts which run quickly, but it is obviously no good for longer-running scripts, or when used to daemonise a program.

Now, output is written immediately to the log file, which means that The Fat Controller can easily be used to daemonise other programs. One of the issues I sometimes have with Java applications is that there’s no simple way to run them as a daemon. With The Fat Controller this is now possible, and it can also handle the case where the Java application terminates for whatever reason.

Logging STDERR
In previous versions, only STDOUT from sub-processes was logged. In v0.0.4 STDERR is also logged, and you can specify either separate log files for STDOUT and STDERR or simply log both to one file.

If you want to impress and amaze your friends and get your hands on the latest version before everyone else then just leave a comment below and I’ll send you the source and you can get testing!


The Fat Controller v0.0.3 released!

Finally, after almost five months, version 0.0.3 is ready!

https://sourceforge.net/projects/fat-controller/files/

There are plenty of changes, mostly bug fixes relating to startup options (not the actual running of The Fat Controller), plus a new thread model – fixed interval – which, like cron, starts processes at fixed, regular intervals, measured between each new process creation.

Here are some highlights from the changelog:

ADDED Fixed interval thread model
The ‘independent thread model’ and ‘dependent thread model’ start another instance of the target program a specified number of seconds after the previous instance ends. The ‘fixed interval thread model’ starts another instance a specified number of seconds after the previous instance starts, hence the interval between new instances is fixed. The interval is specified using the -s,--sleep argument, just as for the other thread models. Note that the interval is respected even if a process returns status 64. Only if a process returns a status of -1 will the interval specified by -e,--sleep-on-error be used.
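For Java programmers, the distinction between the thread models maps neatly onto ScheduledExecutorService (an analogy of mine, not how The Fat Controller is implemented):

import java.util.concurrent.*;

public class IntervalDemo {
    public static void main(String[] args) {
        Runnable task = new Runnable() {
            public void run() { System.out.println("tick"); }
        };
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);

        // Independent/dependent thread models: the next instance starts a
        // fixed delay AFTER the previous one ends.
        pool.scheduleWithFixedDelay(task, 0, 30, TimeUnit.SECONDS);

        // Fixed interval thread model: instances start every 30 seconds,
        // measured start-to-start. (Unlike The Fat Controller, Java never
        // overlaps runs; if one overruns, the next simply starts late.)
        pool.scheduleAtFixedRate(task, 0, 30, TimeUnit.SECONDS);
    }
}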

ADDED Long running instance termination
Using the --proc-run-time-max argument, it is now possible to specify that an instance be terminated if it runs for longer than the specified number of seconds. The default behaviour is never to terminate processes unless the main Fat Controller process is asked to shut down. (Processes are terminated by sending SIGTERM.) It is advisable to specify this argument.
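The same idea expressed in Java terms (an illustrative sketch only, not the actual implementation): start the process, wait up to the limit, then terminate it.

// Run a command, killing it if it runs longer than maxRunSeconds.
// (Sketch; the timed waitFor needs Java 8+.)
static int runWithTimeout(java.util.List<String> cmd, long maxRunSeconds) throws Exception {
    Process p = new ProcessBuilder(cmd).inheritIO().start();
    if (!p.waitFor(maxRunSeconds, java.util.concurrent.TimeUnit.SECONDS)) {
        p.destroy();   // on Unix this sends SIGTERM, as --proc-run-time-max does
        p.waitFor();   // wait for the terminated process to exit
    }
    return p.exitValue();
}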

Plans for v0.0.4
I’ve already drawn up a rough list of features and improvements for v0.0.4, the most important being a change to the logging of sub-processes. Rather than collecting output from STDOUT and writing everything to the log file once a sub-process ends, The Fat Controller will continually write whatever arrives on STDOUT and STDERR to log files. This will make things much better for long-running sub-processes.

If you have any ideas or suggestions then please comment or file a report on the Sourceforge Fat Controller tracker page:

https://sourceforge.net/tracker/?group_id=386594


Fat Controller 0.0.3 almost ready

It’s been ages in development (sorry about that) but the next version of The Fat Controller is almost ready! Apart from really quite a lot of bug fixes, the main new feature is called “fixed interval mode”, which works more like cron in that scripts are executed at precise time intervals, rather than the interval being measured from when a script ends. This brings in a whole host of new problems – multiple instances, maximum instances, what to do when we’ve reached the maximum number of instances and so on – and yes, there are plenty of new configuration options to address all of this!

After finding quite a few embarrassing bugs in v0.0.2, I’ve decided to spend some time on Quality Assurance, which is what I’m currently doing before releasing v0.0.3. If anyone wants to help out with this then please let me know – any and all help is appreciated!


Javascript: Exception thrown and not caught

I recently started getting this Javascript error message on every page in my application:

Exception thrown and not caught

Interestingly, I only got this message in Internet Explorer (IE7) and only on the first page view after clearing my temporary internet files (cache). If I refreshed the page, the error was gone and only reappeared when I cleared my browser cache – again disappearing after a page refresh.

My suspicion was that some Javascript in the HTML was trying to call a function or a method on an object that was declared in an external Javascript file, crucially before that external file had loaded from the server. This would explain the fact that the error disappeared after a page refresh, as the external file would already be available from the browser cache.

The solution was simple – I wrapped the Javascript that was in the HTML file using jQuery’s $(document).ready() function and the problem was solved.

Interesting note for Wicket users:

My application is built in Wicket and I embed the Javascript into the HTML response using AbstractBehavior.renderHead(IHeaderResponse response). Initially I used response.renderOnDomReadyJavascript(String javascript), which executed the Javascript once the DOM was ready but before the external Javascript dependencies were loaded. My first attempt to fix this was to use response.renderOnLoadJavascript(String javascript) which, as the JavaDoc states, executes the Javascript after the entire page is loaded. This worked fine, except when the behavior was applied to components rendered in an Ajax response, as of course there is no page load event when an Ajax request completes.

My solution was to move back to using response.renderOnDomReadyJavascript(String javascript) and, as stated above, wrap my Javascript in jQuery’s $(document).ready() function.


Attach IntelliJ debugger at application startup

Normally when I want to debug a Java application, I run the application and then connect the IntelliJ remote debugger to the JVM. Today I needed to debug the boot sequence of the application, which meant I needed the debugger attached right from the start in order to catch breakpoints at the beginning of the boot sequence – connecting manually would be too late.

In IntelliJ I changed the debugger mode from “attach” to “listen”. It then told me to use these command line arguments for the JVM:

-Xdebug -Xrunjdwp:transport=dt_socket,server=n,address=nick-laptop:5005,onthrow=,suspend=y,onuncaught=

What it doesn’t say is what values to use for the “onthrow” and “onuncaught” options. After a bit of fiddling I got it to work by setting “onuncaught=n” and removing the “onthrow” clause entirely:

-Xdebug -Xrunjdwp:transport=dt_socket,server=n,address=nick-laptop:5005,suspend=y,onuncaught=n

Linux users…

If you’re not using Linux then you can skip this bit; if you are, this might be useful. I found it annoying to have to set IntelliJ to listen for incoming debug connections before starting the application – most of the time it’s fine to have the JVM in listen mode and initiate connections as-and-when from IntelliJ. As a solution I added a couple of aliases to my ~/.bashrc file so that I can swap the behaviour easily:

# For IntelliJ in attach mode
export MAVEN_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"
alias ijattach='export MAVEN_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"'

# For IntelliJ in listen mode
alias ijlisten='export MAVEN_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=n,address=nick-laptop:5005,suspend=y,onuncaught=n"'