Chris Baus

What is a Linux distribution?

This is my first question-and-answer style blog entry, in the same vein as analyst Stephen O'Grady. With the recent Oracle Linux announcement, there seems to be a significant amount of confusion in both the technology and financial press about what a Linux distribution is and the role Linux distribution vendors play, so I thought this would be a good time to review. It seems sort of schizophrenic to answer your own questions, but what the hey.

Who are you and why should I care what you have to say?

I am a software designer and longtime Linux user and advocate. I started using Linux in 1994, when you had to recompile the kernel to install the latest Sound Blaster CD-ROM drive. I got my first Slackware distribution out of the back cover of Running Linux. I currently develop software for Linux and maintain Linux production servers in the financial domain. Other than that, I'm just a developer with an opinion and an interest in the business of Open Source. As with any crackpot commentary on the web, you'll have to judge its worth for yourself.

What is a Linux distribution?

Before I answer that, I should take a step back and answer a simpler question. What is Linux? There are two answers to this question. A hardcore geek such as myself might answer that Linux is the core kernel software which acts as a layer between the computer hardware and the programs we use to get our work done every day. It was started and is maintained by Linus Torvalds, from whom its name is derived. Thousands of programmers have contributed to the development of the Linux kernel.

More broadly, a Linux system includes the kernel and the supporting software used to operate a computer. Linux differs from proprietary operating systems because the software is distributed under an Open Source license and the source code is available free of charge.

What is source code?

Source code is the notation used by programmers to create software products. It is understandable and modifiable by other programmers. With access to the source code, programmers can make modifications to the software. To run the software, the source code must be converted by a process called compilation into binary code, which is readable by the computer. Traditionally, software companies such as Microsoft and Oracle have protected their intellectual property by distributing only the computer-readable binary code for their software. It is more difficult, if not impossible, for programmers to make modifications directly to the binary code.
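To make the distinction concrete, here is a minimal sketch in Python. It is only an analogy: Python compiles source text to bytecode for its own interpreter rather than to binary code for the hardware, but the contrast between the readable form and the compiled form is the same. The names `source` and `add` are made up for the illustration.

```python
# Human-readable source code, held here as a plain string.
# A programmer can read it and change it easily.
source = "def add(a, b):\n    return a + b\n"

# compile() translates the source text into a code object,
# which is Python's compiled (bytecode) form of the program.
code_object = compile(source, "<example>", "exec")

# Run the compiled code so that the function it defines becomes usable.
namespace = {}
exec(code_object, namespace)
add = namespace["add"]

# The compiled form of the function is just a sequence of raw bytes.
bytecode = add.__code__.co_code

print(source)          # the readable source
print(bytecode.hex())  # the compiled form, shown as hexadecimal digits
print(add(2, 3))       # prints 5
```

Reading the source tells you what the program does at a glance. Reading the hexadecimal dump does not, which is roughly the position a programmer is in when only the binary is distributed.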

What is Open Source software?

Open Source software is software for which the source code is available for third-party programmers to use and modify. The Open Source Initiative has defined specific criteria for deciding if software should be deemed Open Source. Generally, users may legally use the code in any way they like, with some restrictions depending on the license used by the developer of the software.

Open Source licenses could be the topic of a separate blog entry. Under one of the more restrictive licenses, the GNU General Public License (GPL), users are allowed to use and modify the software with the limitation that if they redistribute the software in binary form, they must also make their changes available in source form. Under less restrictive licenses, such as the Berkeley Software Distribution (BSD) license, the code can be used in any way as long as the copyright information is included with software which makes use of the code. Neither license prevents third parties from profiting from the sale of the code. The BSD license also allows proprietary software to be created from the free software.

It sounds like programmers are working for free by developing Open Source software. Why would they do that?

There are many reasons for working on Open Source software that aren't directly related to financial gain: fun, credibility, and influence, to name a few. But much of the high-quality Open Source software is built by developers who are paid to do so. Firms often cooperate on Open Source software for their collective good.

For instance, many large public web sites have the same scalability problems, but individually they might not have the resources to build the software solutions on their own. Instead, they Open Source their work in the hope of collaborating with other programmers and companies who are trying to solve the same problem. Danga and LiveJournal built their business around this model. The software itself isn't the product. The LiveJournal service is.

So programmers provide code that is freely available on the internet. Why do I need Red Hat? Why can't I download the software and run it myself?

When you say Red Hat, I'll assume you mean any Linux vendor. I can now go back and answer the original question: what is a Linux distribution? As I mentioned before, Linux is the core kernel software that is a layer between the hardware and applications. An entire system that can be installed and used on a computer requires thousands of smaller, and often quite large, supporting components. In the early days of Linux it was extremely difficult just to install and run the system. It required downloading source code from various sources on the internet, compiling it to binary form, and then configuring it all to work together. This process could take weeks, even for an experienced system administrator. This is where the distributions come into play. They offer the following services:

The end result is a Linux distribution, which providers such as Red Hat sell for an annual subscription fee. The distribution vendors often provide their own software (typically installers and package management tools), and those tools are commonly available as Open Source for others to use. They also brand the final system with their own name and logos. I'll explain why this is important later on.

So Red Hat basically collects free software from the internet, adds some more free software to it and sells it?

In one word, yes. But what they really sell is support. When any one of those thousands of components isn't working correctly, the vendor is somebody the user can hold responsible for fixing the problem.

That sounds like a really great business model. But since the software is free, why don't other companies download the distribution software and provide their own support?

That's a great question. And here's where the system branding comes into play. If I was trying to get into the distribution business, I couldn't redistribute Red Hat Linux freely and provide my own support. The trick is that the name Red Hat and the Red Hat logos are trademarks of Red Hat, and their trademark policy prevents me from redistributing software under their brand.

But couldn't you just remove all the branding, replace it with your own, and distribute that? Say, Baus Linux?

Baus Linux, huh? That's got an ominous Third Reich ring to it, but you are really getting to the core issues. Yes, I could do just that. In fact, that's exactly what the fine folks over at CentOS have done. They have replaced all traces of the Red Hat name in Red Hat Enterprise Linux with CentOS. Any software that works with Red Hat Enterprise Linux works just fine with CentOS. For all practical purposes the systems are identical.

In the good ole days (the 90s), Red Hat did not place any restrictions on the distribution of their brand. You could go to cheapbytes.com and buy a CD that was byte-for-byte identical to the official Red Hat Linux for approximately $2.99. The only difference was that you couldn't get support from Red Hat. If you wanted support, you had to buy it separately. That changed after Red Hat Enterprise Linux was released and Red Hat prevented third parties from using the Red Hat name. That's when CentOS was started.

Oracle is using the same model as CentOS for their Unbreakable Linux strategy. They are taking the Red Hat distribution, removing the Red Hat trademarked material, and redistributing the software with their own support.

Isn't Larry Ellison evil for stealing Red Hat's code?

Larry isn't really known for being a nice guy, but Red Hat made a business out of distributing the work of other people. That's what made their business model so compelling to investors. Larry's just taking it one step further. He isn't interested in supporting the overhead of creating Linux software if he can avoid it. He just wants to provide support, which is what customers are paying for anyway. Support is the foundation of most Open Source business models. It is pretty difficult to fault Oracle for doing something other Open Source businesses have been doing for years.

Which Linux Distribution should I use?

That's a difficult question to answer. I've been focusing on Red Hat and Red Hat derived distributions, but there are a lot of other options out there. If you can handle your own support and want a solid server operating system, I recommend CentOS. Ubuntu is popular if you want a desktop OS. If you are more comfortable with third-party support, Red Hat has a very good reputation. If you are an Oracle user and you are looking for one-stop shopping, Oracle now has an answer for you as well.