Chris Baus

The SwitchFlow C++ HTTP Proxy: A Postmortem

SwitchFlow is a non-blocking HTTP reverse proxy I wrote in C++ and shelved many years ago. I recently spent a couple of days getting the code up and running on Ubuntu. Once I found a clean commit, it wasn’t too hard to get it to compile. (If you are considering mothballing one of your projects, do yourself a favor and don’t leave it in the middle of an undocumented refactor.) I fixed a 7-year-old bug while I was at it.

I was reminded of the project while reviewing Dominik Honnef’s notes on Go development, where I read that Brad Fitzpatrick had added a buffer pool to Go’s HTTP implementation. Within a couple of days, CloudFlare also published an article on buffer pooling in Go. I think Go will eventually dominate the development of these types of servers (more on that in a minute).

One of my goals was to optimize memory usage not only for performance, but for reliability. It is difficult (or impossible) to recover from memory allocation failures while handling thousands of open connections. SwitchFlow pre-allocates its pool of header buffers at startup, so that administrators can set the number of connections required and be sure that the proxy won’t run out of memory under load. In other words, I wanted the server to fail predictably. The assumption was that the proxy would basically be the only server running on the machine or VM.
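To make the idea concrete, here is a minimal sketch of a fixed-size buffer pool. It is written in Go rather than C++ and is not SwitchFlow’s actual code: the names FixedPool, NewFixedPool, and ErrPoolExhausted are hypothetical, and the buffer count and size are arbitrary. All memory is allocated once at startup, and when the pool is empty the caller gets an error instead of a fresh allocation.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPoolExhausted is returned when every pre-allocated buffer is in use.
var ErrPoolExhausted = errors.New("buffer pool exhausted")

// FixedPool hands out buffers that were all allocated up front.
// Hypothetical type for illustration; not taken from SwitchFlow.
type FixedPool struct {
	free chan []byte
}

// NewFixedPool allocates count buffers of size bytes each, once, at startup.
func NewFixedPool(count, size int) *FixedPool {
	p := &FixedPool{free: make(chan []byte, count)}
	backing := make([]byte, count*size) // the only buffer allocation
	for i := 0; i < count; i++ {
		// The third index caps each slice so it cannot grow into its neighbor.
		p.free <- backing[i*size : (i+1)*size : (i+1)*size]
	}
	return p
}

// Get returns a free buffer, or ErrPoolExhausted instead of allocating more.
func (p *FixedPool) Get() ([]byte, error) {
	select {
	case b := <-p.free:
		return b, nil
	default:
		return nil, ErrPoolExhausted
	}
}

// Put hands a buffer back to the pool.
func (p *FixedPool) Put(b []byte) {
	select {
	case p.free <- b[:cap(b)]:
	default: // pool is already full; ignore the extra buffer
	}
}

func main() {
	pool := NewFixedPool(2, 4096) // room for 2 connections, 4 KiB each
	a, _ := pool.Get()
	b, _ := pool.Get()
	_, err := pool.Get() // a third connection is refused, predictably
	fmt.Println(len(a), len(b), err)

	pool.Put(a) // releasing a buffer makes room again
	_, err = pool.Get()
	fmt.Println(err)
}
```

The point of the design is that the failure mode is a refused connection at a limit the administrator chose, rather than an allocation failure at an arbitrary point in request handling.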

Becoming aware of Nginx put a damper on my enthusiasm for SwitchFlow. Although I was probably overly negative about my project’s prospects, even in its early state, Nginx was far beyond the capabilities of SwitchFlow. Nginx is an excellent piece of software, which I have used in production for years with good results, and I’m not surprised by its success.

I view web application design as a processing pipeline rather than a monolithic application mated to a database. Reverse proxies fit neatly into such an architecture, but can significantly increase latency. A reverse proxy you depend on must be fast and reliable, or the performance cost will outweigh the benefits. I read Dan Kegel’s highly influential The C10K problem, and felt that a reverse proxy was the right venue to experiment with some of the ideas Dan had put forth.

In the end, SwitchFlow provided insight into HTTP, event driven applications, and Linux server development, all of which influenced my view of application design. I also gained a better appreciation of the complexity of HTTP servers in general.

C++ benefits from being time proven and ubiquitous, and from its ability to create native executables. Given restraint, it isn’t a bad language to work in for these types of projects. My biggest gripe with C++ is that it is often a source of complex designs and inconsistent styles. While I believe my reasons for not using exceptions are justified for a project like SwitchFlow, as soon as another developer joins the project, he or she will inevitably disagree, and instead of focusing on the task at hand, we will be debating coding style. That is one of the reasons I’ve been drawn to languages like Python and now Go.

Thoughts on Go

When developing an event driven server, state management can be tedious and error prone. This issue is language independent, and similar problems exist with common event driven frameworks in C, C++, JavaScript, and Python.
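A small sketch shows where the tedium comes from. In an event driven design, a read event can deliver any fragment of a request, so everything the parser knows has to live in an explicit per-connection struct between events. The example below is in Go for consistency with the rest of this post, the names connState and feed are hypothetical, and it only extracts the method and path from a request line. It is nowhere near a real HTTP parser.

```go
package main

import "fmt"

// state records how far parsing has progressed on one connection.
type state int

const (
	readingMethod state = iota
	readingPath
	done
)

// connState holds everything that must survive between read events.
// Hypothetical type for illustration only.
type connState struct {
	st     state
	method []byte
	path   []byte
}

// feed consumes whatever bytes the event loop delivered and reports
// whether the method and path are complete.
func (c *connState) feed(chunk []byte) bool {
	for _, b := range chunk {
		switch c.st {
		case readingMethod:
			if b == ' ' {
				c.st = readingPath
			} else {
				c.method = append(c.method, b)
			}
		case readingPath:
			if b == ' ' || b == '\r' || b == '\n' {
				c.st = done
			} else {
				c.path = append(c.path, b)
			}
		case done:
			return true
		}
	}
	return c.st == done
}

func main() {
	c := &connState{}
	// The request line arrives split at arbitrary points.
	for _, chunk := range []string{"GE", "T /ind", "ex.html HT", "TP/1.1\r\n"} {
		complete := c.feed([]byte(chunk))
		fmt.Println(complete, string(c.method), string(c.path))
	}
}
```

Every additional protocol feature (headers, chunked bodies, keep-alive) adds more states and more fields to carry around, which is where the errors tend to creep in.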

It will be difficult for any abstraction to beat the performance of an event driven server written in C, but I think Go will come close, with a model that will enable much larger and more complex applications. Go maps very closely to my C++ style for server development (interface driven, no exceptions, and light use of generics) with the added benefits of garbage collection and lightweight user threads (a la Erlang).

While I was skeptical of garbage collection for performance critical applications such as reverse proxies, I’m starting to change my mind after reading some of the results of memory pooling. Pooling in Go maps closely to what C and C++ programmers have done for years when they wanted to avoid the overhead of the general-purpose allocator. Most performance sensitive applications will eventually turn to memory pooling or other manual memory management techniques, but I now think the results will be similar in garbage collected and non-garbage collected languages if memory pooling is deployed in critical parts of an application, for instance HTTP header buffering.
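As a rough illustration of pooling in a garbage collected language, the sketch below recycles header buffers with sync.Pool from Go’s standard library. This is an assumption on my part about how one might do it today, and not necessarily the mechanism used by the implementations mentioned above. The names headerPool and formatHeader are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// headerPool recycles buffers used to assemble HTTP header lines, so the
// hot path does not allocate a new buffer for every request.
var headerPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// formatHeader borrows a buffer, writes one header line into it, and
// returns the buffer to the pool before handing back the result.
func formatHeader(name, value string) string {
	buf := headerPool.Get().(*bytes.Buffer)
	buf.Reset() // a recycled buffer may still hold old contents
	defer headerPool.Put(buf)

	buf.WriteString(name)
	buf.WriteString(": ")
	buf.WriteString(value)
	buf.WriteString("\r\n")
	return buf.String() // String copies, so the result outlives the buffer
}

func main() {
	fmt.Printf("%q\n", formatHeader("Host", "example.com"))
	// prints "Host: example.com\r\n"
}
```

Unlike a fixed, pre-allocated pool, sync.Pool may allocate when it is empty and may drop idle buffers during garbage collection, so it reduces allocation pressure but does not put a hard bound on memory.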

Regarding Go’s goroutines: I believe the stack is the right abstraction for maintaining the state of a connection from the application programmer’s perspective. Lightweight user mode threads, like goroutines, which can be context switched automatically when performing I/O operations, can provide the best of both worlds: a synchronous programming model with the performance of event driven servers. In my view, the primary drawback is that stack space must be dynamically allocated and freed from the heap as goroutines are started, stopped, and grow. It can be difficult to pre-calculate the worst case scenario of stack usage, so an application will need to be prepared to handle the failure to commit more stack space to the thread (or goroutine).
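Here is a minimal sketch of what that looks like in practice. Each connection gets its own goroutine, the handler is written as ordinary blocking code, and the per-connection state (a line counter here) is just a local variable on the goroutine’s stack. The names handle and roundTrip are hypothetical, and the example uses an in-memory net.Pipe instead of a real socket so that it runs on its own.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// handle serves one connection with plain blocking reads and writes.
// Its state lives in local variables rather than in an explicit struct.
func handle(conn net.Conn) {
	defer conn.Close()
	r := bufio.NewReader(conn)
	count := 0 // per-connection state, kept on this goroutine's stack
	for {
		line, err := r.ReadString('\n') // parks the goroutine until data arrives
		if err != nil {
			return // peer closed the connection
		}
		count++
		fmt.Fprintf(conn, "%d %s", count, strings.ToUpper(line))
	}
}

// roundTrip sends one line to a handler goroutine over an in-memory pipe
// and returns the reply with surrounding whitespace trimmed.
func roundTrip(msg string) (string, error) {
	client, server := net.Pipe()
	go handle(server) // one goroutine per connection
	defer client.Close()

	if _, err := fmt.Fprintf(client, "%s\n", msg); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(client).ReadString('\n')
	return strings.TrimSpace(reply), err
}

func main() {
	reply, err := roundTrip("hello")
	fmt.Println(reply, err)
	// prints: 1 HELLO <nil>
}
```

In a real server the same handler shape would be started from an accept loop on a net.Listener. The code reads top to bottom like a threaded server, while the runtime multiplexes the goroutines over a small number of OS threads.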

But for most applications this is probably the right trade-off, and eventually I think we will see a new class of proxy applications. In short, if I were to start SwitchFlow today, I would seriously consider Go.