Cost of change, reprise

This is a reprise of another article I wrote, The Cost of Change.

Last weekend I made a performance enhancement to the proxy. I generally avoid changing code to improve performance unless I know for sure it is a problem. But this code in particular was bugging my engineering instincts, and I decided to "fix it." The old code had O(n) complexity; the new code runs in constant time, and the constant is small. Basically I replaced a loop with some pointer arithmetic. In my experience, these sorts of algorithmic changes have a far greater impact on performance than low-level optimizations like loop unrolling or function call optimizations.
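To make "replacing a loop with pointer arithmetic" concrete, here is a minimal C sketch of that kind of change. It is not the actual proxy code; the names (conn_t, pool, POOL_SIZE, slot_index_loop, slot_index_arith) are all hypothetical, and the scenario (finding the index of an entry in a fixed array) is just one assumed example of the pattern.

```c
#include <stddef.h>

#define POOL_SIZE 64  /* hypothetical fixed pool size */

/* Hypothetical connection record; the fields are placeholders. */
typedef struct {
    int  fd;
    char buf[256];
} conn_t;

static conn_t pool[POOL_SIZE];

/* Old style, O(n): walk the pool until we find the matching slot.
 * Returns -1 if c does not point at any element of the pool. */
ptrdiff_t slot_index_loop(const conn_t *c)
{
    for (ptrdiff_t i = 0; i < POOL_SIZE; i++) {
        if (&pool[i] == c) {
            return i;
        }
    }
    return -1;
}

/* New style, O(1): subtracting two pointers into the same array yields
 * the distance between them in elements, which is the index directly.
 * Precondition (assumed, not checked): c points at an element of pool.
 * Unlike the loop, this version cannot detect a foreign pointer. */
ptrdiff_t slot_index_arith(const conn_t *c)
{
    return c - pool;
}
```

The two functions agree for every pointer into the pool, but only the loop notices a pointer that does not belong there. A quiet difference in behavior like that is exactly the sort of thing that can turn a "simple" optimization into a couple of hours of debugging.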

But complexity theory is not the point of this article. I broke the code and spent over 2 hours fixing it. That's 2 hours I could have spent on new features, documentation, or drinking beer, which led me to the following truisms:

Software developers always underestimate the cost of changing working code
Adhering to broken window theory is ambiguous at best
All non-trivial software has cruft

Developers suffer significantly from Not Invented Here (NIH) syndrome, and I don't exclude myself. Code is often complex by nature. If you dropped me into the middle of some of John Carmack's physics engine code, I probably wouldn't understand any of it. I might then assume that it is ugly or bad, and be tempted to "fix" it by rewriting it. But the truth is I just wouldn't understand the problem space, and would need to derive it myself to understand it (if I could).

Code is the end result of solving a problem. The means of understanding and solving the problem are only vaguely expressed by the code itself. Much of the original thought has been stripped away, and often forgotten. By rewriting code you will understand the original problem, but the end result might not be better code. Other developers will feel the same way about your code as you did about the original code.

This is why I am hesitant to endorse the broken window theory of software, because, in a developer's eyes, code they didn't write is broken. The art of software development is differentiating the windows which are truly broken from those which are merely dirty. I'm not claiming that no code should be rewritten, but you should be ready to justify rewriting debugged production code, and accept the associated costs.

In my career I have been fortunate to work on, or at least have access to, very large production code bases, and with Open Source software all developers have the same opportunity. In my experience these code bases all had some ugly code. Non-trivial projects take years to write. Not only do techniques change over time, but the understanding of the problem and the market pressures also shift over time. So what might have seemed like a good idea 5 years ago in a project like IIS, Word, or the Linux kernel might not seem as appropriate today. Again, though, that doesn't mean the code is truly broken.

While change is inevitable, change is expensive. The implications of change should not be taken lightly. When applying broken window theory to software, the developer must first determine whether the window is truly broken, merely dirty, or simply an intricate pane of stained glass which may be difficult to replicate or improve upon, even if it isn't perfect.