Chris Baus

Unix is a leaky abstraction: Disk I/O in a non-blocking server

The Law of Leaky Abstractions is one of Joel Spolsky's most cited articles. Few people are capable of maintaining a large system without abstracting away the underlying details. Abstractions are vital to the way we developers think, but as Joel points out, abstractions can go awry and "leak" details of the underlying system.

Operating systems are one big abstraction for hardware. When operating systems were a newfangled thing, Ken Thompson and Dennis Ritchie, the fathers of UNIX, came up with a novel concept that is the core of the UNIX architecture and philosophy: everything is a file. That was a leap of abstraction faith that made computing history.

It is surprising how well that abstraction has served us. For many client applications the file metaphor works, and it is probably one of the reasons UNIX-like OSes have persisted even though their demise has been predicted countless times over the years.

With that said, there are some pretty serious flaws with the abstraction. It didn't take long to realize that a printer was a bit different from a tape drive, and a tape drive was different from a terminal. Programmers needed to get control of the underlying hardware to do interesting things. Enter ioctl(). ioctl() is a machine gun to the UNIX file metaphor. Using the function is like pulling the trigger, opening fire, and blowing deep holes in the abstraction. Good things rarely come from taking that tool into your hand.
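To make the ioctl() point concrete, here is a minimal sketch, assuming a Linux or other Unix-like system with Python's standard fcntl and termios modules. The helper name terminal_size is my own invention for illustration. It sends the terminal-only TIOCGWINSZ request to a file descriptor: a terminal answers with its window size, while a disk file rejects the request outright, even though both are "just files."

```python
import errno
import fcntl
import struct
import tempfile
import termios


def terminal_size(fd):
    """Return (rows, cols) if fd is a terminal, else None.

    TIOCGWINSZ is a terminal-only request; other file types reject it.
    (terminal_size is a hypothetical helper, not a standard function.)
    """
    try:
        # The kernel fills an 8-byte struct winsize: four unsigned shorts.
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    except OSError as exc:
        if exc.errno in (errno.ENOTTY, errno.EINVAL):
            return None  # "inappropriate ioctl for device"
        raise
    rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", packed)
    return rows, cols


# A disk file does not understand the terminal request.
with tempfile.TemporaryFile() as disk_file:
    disk_result = terminal_size(disk_file.fileno())
print("disk file:", disk_result)
```

Calling terminal_size(0) from an interactive shell should return the window dimensions instead, which is exactly the kind of per-device knowledge the file metaphor was supposed to hide.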

For my work in SwitchFlow, I only care about two types of files: disk files and network files. Those are so common that you might think the abstraction would hold up. It does if you don't care about performance. These days all the cool kids are writing servers which use non-blocking network I/O to handle the masses of traffic that sites like livejournal.com receive. Being the hipster hacker wannabe that I am, SwitchFlow is a non-blocking framework. Unfortunately no Unix-like OS that I know of supports non-blocking disk I/O. The abstraction leaks. A file is not a file is not a file.
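The leak is easy to observe. The following is a small sketch, assuming a Unix-like system and using an empty pipe as a stand-in for an idle network socket; is_readable_now is a name I made up for the example. The kernel happily accepts O_NONBLOCK on a disk file, but select() reports the disk file as ready no matter what, so the readiness model that works for sockets tells you nothing about whether a disk read will stall.

```python
import fcntl
import os
import select
import tempfile


def is_readable_now(fd):
    """Ask the kernel, with a zero timeout, whether fd is ready to read."""
    ready, _, _ = select.select([fd], [], [], 0)
    return fd in ready


# An empty pipe behaves like an idle socket: nothing to read yet.
pipe_r, pipe_w = os.pipe()
pipe_ready = is_readable_now(pipe_r)

with tempfile.TemporaryFile() as disk_file:
    disk_file.write(b"x" * 4096)
    disk_file.flush()
    fd = disk_file.fileno()
    # Setting O_NONBLOCK succeeds, but regular files ignore it.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    # Regular files always select as readable, cached or not.
    disk_ready = is_readable_now(fd)

os.close(pipe_r)
os.close(pipe_w)
print("pipe ready:", pipe_ready, "disk ready:", disk_ready)
```

The pipe correctly says "not yet," while the disk file always says "ready," even when the actual read would have to wait on the drive.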

Non-blocking servers typically use a single thread per processor or processor core. The OS handles concurrency, and only returns control to the application when data is ready to read or a network file is ready to write. The advantage of this architecture is that it eliminates unnecessary threading overhead. Performing blocking I/O in the main server loop is a big no-no, as it blows any performance advantage gained from using a non-blocking architecture. The only viable option is to create a hybrid blocking/non-blocking monster, where one or more processes handle blocking disk I/O and the main server process handles network I/O. Squid actually provides two different disk I/O mechanisms to solve this. One is process based, and the other is pthread based. My FeedFlow crawler also uses this approach to write feeds to temporary storage.
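Here is roughly what such a hybrid looks like, as a minimal sketch rather than how Squid or FeedFlow actually do it. It assumes a thread pool for the blocking side and the classic "self-pipe" trick to wake the event loop; blocking_read and run_loop are hypothetical names. The main loop only ever sleeps in select(), and the workers are the only ones allowed to block on the disk.

```python
import os
import queue
import selectors
import tempfile
from concurrent.futures import ThreadPoolExecutor


def blocking_read(path):
    """Runs in a worker thread, so blocking on the disk is harmless here."""
    with open(path, "rb") as f:
        return f.read()


def run_loop(paths, workers=2):
    """Event loop that hands disk reads to workers and waits on a wakeup pipe."""
    done = queue.Queue()            # finished jobs handed back to the loop
    wake_r, wake_w = os.pipe()      # self-pipe: workers poke the loop
    os.set_blocking(wake_r, False)

    def job(path):
        try:
            result = (path, blocking_read(path))
        except OSError as exc:
            result = (path, exc)
        done.put(result)
        os.write(wake_w, b"\0")     # makes wake_r readable in the loop

    sel = selectors.DefaultSelector()
    sel.register(wake_r, selectors.EVENT_READ)
    # A real server would also register its listening and client sockets here.

    results = {}
    pending = len(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for path in paths:
            pool.submit(job, path)
        while pending > 0:
            for key, _events in sel.select():   # the only place the loop sleeps
                if key.fd != wake_r:
                    continue
                try:
                    os.read(wake_r, 4096)       # drain the wakeup bytes
                except BlockingIOError:
                    pass
                while True:
                    try:
                        path, outcome = done.get_nowait()
                    except queue.Empty:
                        break
                    results[path] = outcome
                    pending -= 1
    sel.close()
    os.close(wake_r)
    os.close(wake_w)
    return results


# Demo: read one temporary file through the worker pool.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"feed contents")
demo = run_loop([tmp.name])
os.unlink(tmp.name)
print(demo)
```

The price of the hybrid is visible even in a toy: a queue, a pipe, and a pool of threads exist purely because the disk file refuses to behave like the network file.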

I point all this out because it is something to keep in mind if you are considering developing a non-blocking server.

The Windows kernel handles asynchronous I/O more consistently. Almost all I/O operations can be done "overlapped." This is quite different from Unix in that I/O is written to application buffers in separate threads controlled by the kernel, but it has been proven to work. Dave Cutler knew what he was doing.