
Creating Integrated High Quality Linux Applications

Avi Alkalay

IBM Linux Impact Team

[email protected]

[email protected]

Copyright © 2002 by Avi Alkalay

Introduction

Linux is becoming more and more popular, and many software vendors are porting their products from other platforms. This article tries to clarify some issues and give tips on how to create Linux applications that are highly integrated with the operating system, secure, and easy to use.

The examples apply to Red Hat Linux, and should be compatible with other distributions based on Red Hat (Conectiva, Turbolinux, Caldera, PLD, Mandrake, etc.).

User Friendly: Guaranteed Success

The user-friendly concept is often misassociated with a good GUI (graphical user interface). In fact, it goes much beyond that. In systems like Linux (with more server-like characteristics), the user measures how easy software is mainly by its installation and initial configuration. He may forget how easy it was to install and use a certain product, but he will never forget that a software package had a complex configuration and installation process. A migration or new installation will always be a nightmare, making the user avoid it.

Embrace the Install-and-Use Paradigm

Imagine you're about to install that expensive product your company bought from ACME, and you realize you'll have to do the following:

  1. Find the manual that shows the installation process step-by-step. We know that a manual is the last thing the user reads.

  2. Read some README files.

  3. Uncompress huge files on your disk (after downloading them from the net or a CD) to create the installation environment.

  4. Read more README files that appeared in the installation environment.

  5. Figure out that the installation requires you to execute some provided script in a special way (the inconvenient ./install.sh).

  6. Uncomfortably answer some questions the script asks, such as target directory, user for the installation, etc. To make it worse, this frequently happens in a terminal with a misconfigured backspace key.

  7. After the installation, configure some environment variables in your profile, like $PATH, $LIBPATH, $ACMEPROGRAM_DATA_DIR, $ACMEPROGRAM_BIN_DIR, etc.

  8. Edit OS files to register the presence of the new product (e.g., /etc/inetd.conf, /etc/inittab).

  9. And the worst: Change security permissions of OS directories and files to let the product run correctly for the appropriate users.

Sound familiar? Who has never faced this sad situation, which induces the user to make mistakes? If your product's installation process sounds like this Uncompress-Copy-Configure-Configure More-Use sequence, you have a problem, and the user won't like it.

Users like to feel that your product integrates well with the OS. You should not demand that the OS adapt itself to your product (changing environment variables, etc.). Your product must let the user Install-and-Use.

The Install-and-Use glory is easily achieved using a three-ingredient recipe:

  1. Understanding the Four Universal Parts of Any Software

  2. Understanding how they are related to Linux's directory hierarchy

  3. Aggressive use of a package system, to automate the process and leverage the first two items. In our case, this is RPM.

We'll discuss here what these ingredients are and how to implement them.

The Four Universal Parts of Any Software

The file set of any application software -- graphical, server-side, commercial, open/free, monolithic, etc. -- always has four parts:

1st: The Software on its own -- the body

The executables, libraries, static data files, examples, manuals and documentation, etc. Regular users must have read-only access to these files, which change only when the system administrator upgrades the software.

2nd: Configuration files -- the soul

These are files that define how the software will run, how to use the content, manage security, maximize performance, etc. Without them, the software on its own is usually useless.

Depending on your software, specific privileged users may change these files, to make the software behave as they want.

It is important to provide documentation about the configuration files.

3rd: Content

This is what receives all the user's attention. It is what the user delegated to be managed by your product. It is what makes a user throw away your product and use a competitor's if it gets damaged: the tables of a database system, the documents of a text editor, the images and HTML pages of a web server, the servlets and EJBs of an application server, etc.

4th: Logs, dumps, etc.

Server software uses these for access logs, trace files, problem determination, temporary files, etc. Other types of software also use such files, but less commonly.

This last class of file is often the biggest problem generator for a system administrator, because its volume can surpass even the size of the content. Due to this fact, it is important to establish some methodology or facility for these files while you are still at design time.

Practical Examples

Let's see how universal this concept is by analyzing some types of software:

Table 1. Universality of Four Parts

Database Server
    Software on its own: Binaries, libraries, documentation.
    Configurations: Files that define the location of the data files. For this type of software, the remaining configuration usually lives in special tables inside the database itself.
    Content: Table files, index files, etc. These usually keep whole trees under the same directory, and many times need several filesystems to guarantee performance. Their location in the system is defined by the configuration.
    Logs, dumps, etc.: For DBs, these are the backups, generated on a daily basis, and the logs the DBA uses to define an indexing strategy. Their location in the system is also defined by the configuration.

Text Processor
    Software on its own: The same, plus templates, modular file-format filters, etc.
    Configurations: Being user-oriented software, its configuration must go in each user's $HOME directory: files that define standard fonts, tabulation, etc.
    Content: The documents generated by the user, which go somewhere under his $HOME.
    Logs, dumps, etc.: They show up as temporary files that can be huge. The user can define their location with a user-friendly dialog (which saves it in some configuration file).

MP3 Generator
    Software on its own: Same, plus modular audio filters.
    Configurations: Each user has a configuration file in his $HOME containing bitrate preferences, etc.
    Content: Similar to the text processor.
    Logs, dumps, etc.: Similar to the text processor.

Web Server
    Software on its own: Similar to the database server.
    Configurations: Files that define the content directory, network and performance parameters, security, etc.
    Content: The directories where the webmaster deposits his creativity. Again defined by the configuration.
    Logs, dumps, etc.: Precious access logs, vital for marketing purposes, generated in a location and format defined by the configuration.

E-Mail Server
    Software on its own: Similar to the database and web servers.
    Configurations: Files that define how to access the user database, mail routing rules, etc.
    Content: The precious user mailboxes. Again defined by the configuration.
    Logs, dumps, etc.: Mail transfer logs, virus detection logs, etc. Again defined by the configuration.

Note that the "Software on Its Own" part contains all your product's business logic, which would be useless if you didn't have a configuration defining how to work with a data bundle provided by the user. Configurations are what connect your product to the user.

We can use a metaphor: a sculptor (business logic) needs bronze (content) and a theme or inspiration (configuration) from a patron (user) to produce a beautiful work (content). He makes annotations in his journal (logs) about his day-by-day activities, to report back to his patron (user).

The Importance of Clear Separation Between Four Parts

OK, so let's be more practical. The fact is, if we correctly use the universal-parts concept, we greatly improve the quality of our product. We do that simply by separating and encapsulating each one of these parts in different system directories (having only different files for each part is not sufficient). There is a standard called the FHS that defines the Linux directories for each part, and we'll discuss it later.

But now, let's see the value of this separation to the user:

  1. He gains a clear vision of the location of each part, especially of his configurations and content, and he feels that your product is completely under control. This clarity brings ease of use, security, and confidence in your product. And in practice, it permits him to manipulate each part independently.

  2. It becomes clear that, for instance, when backing up, user action is needed only for configurations and content (purists will also back up some logs). The user doesn't have to care about the Software on Its Own, because it is safe, in its original form, on the product CD in his cabinet.

  3. For upgrades, the new package will overwrite only the business logic, leaving the user's precious configurations and content intact. Here it is very important to keep old content and configurations compatible, or to provide tools to help migrate the data.

  4. Keeping the logs in a separate filesystem (obviously suggested in your documentation) assures that their exaggerated growth will not interfere with the content, or with the stability of the whole system (see the /etc/fstab sketch after this list).

  5. If your software follows some directory standards, the user doesn't have to reconfigure his system or environment to use it. He will simply Install-and-Use.
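As an illustration of item 4, a single line in /etc/fstab is enough to dedicate a separate partition to a product's logs. This is only a sketch; the device name, mount point, and filesystem type here are hypothetical:

# Hypothetical /etc/fstab entry: a dedicated partition for a product's logs
# device       mount point           type    options     dump  fsck-order
/dev/hdb1      /var/log/myproduct    ext3    defaults    1     2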

Let's conduct an exercise with separation, using as an example a system called "MySoftware," in which the business logic is in Example 1 and the configuration is in Example 2.

Example 1. A shell program referring to an external configuration file

#!/bin/sh
#########################################
##
## /usr/bin/MySoftware
##
## Business logic of MyProgram system.
## Do not change anything in this file. All configuration can be
## made on /etc/MySoftware.conf
##
## We'll not support any modifications made here.
##

# Default configuration file (1)
CONF=/etc/MySoftware.conf

# Minimal content directories (2)
MIN_CONTENT_PATH=/var/www:/var/MySoftware/www

if [ -r "$CONF" ]; then
	. "$CONF"    # (3)
fi

# All the content I'll serve is the "minimal" plus the directories
# provided by the user in the configuration file $CONF
CONTENT_PATH=$MIN_CONTENT_PATH:$CONF_CONTENT_PATH    # (4)

NOTES:

(1) Definition of the configuration file name.

(2) Definition of some static parameters.

(3) The configuration is read from the external file, if it exists.

(4) After reading the configuration file, all content directories -- the user's plus the product's -- are joined in $CONTENT_PATH, which will be used from now on.

Example 2. File containing only the configurations for MySoftware

#######################################
## /etc/MySoftware.conf
##
## Configuration parameters for MySoftware.
## Change as much as you want.
##

# Content directories.
# A ':'-separated list of directories for your content.
# The directories /var/www and /var/MySoftware/www are already included,
# so list here only your additional directories, if any.
CONF_CONTENT_PATH=/var/NewInstance:/var/NewInstance2    # (*)

# Your e-mail address, for notifications.
[email protected]    # (*)

# Logs directory
LOG_DIR=/var/log/myInstance    # (*)

(*) These are user-defined parameters.

One Body, Many Souls

When I was a system administrator for IBM e-business Hosting Services, I was fascinated by Apache's flexibility in letting us do things like this:

bash# /usr/sbin/httpd &
bash# /usr/sbin/httpd -f /etc/httpd/dom1.com.br.conf &
bash# /usr/sbin/httpd -f /etc/httpd/dom2.com.br.conf &
bash# /usr/sbin/httpd -f /etc/httpd/dom3.com.br.conf &

If we don't pass any parameters (as in the first line), Apache loads its default, hardcoded configuration file from /etc/httpd/conf/httpd.conf. We built the other configurations, one for each customer, with completely different structures, IP addresses, loaded modules, content directories, passwords, domains, log strategies, etc.

This same concept is used by a text editor on a multiuser desktop (like Linux). When the code is loaded, it looks for a configuration file in the user's $HOME, and depending on who invoked it (user A or B), it will appear differently, because each user has his own personal configuration.

The obvious conclusion is that the software's body (business logic) is purely and completely oriented by its manipulating spirit (configuration). But the competitive advantage lies in how easily we can switch from one spirit to another, as in Apache's example. It is very useful to provide this to your user: you'll be letting him build familiarity, reliability, and comfort with your product.

We used this approach with many different software applications during my e-business Hosting time, and it was extremely useful for maintenance. In a version migration we had total control over where each part was, and we upgraded and downgraded software with no waste of time, with obvious success.

But there were some products that refused to work this way. They had so many hardcoded parameters that we couldn't see what divided the body from the spirit (or the other parts). These applications were marked as "bad guys" and were discarded and replaced as soon as possible.

We concluded that the "good guy" applications were intuitively blessed by their developers' four-parts vision, and they made our life easier. In fact, that is when we formulated this theory, which continues to prove itself.

Do you want to deploy bad guy or good guy software?

Linux Directory Hierarchy: Oriented to the Software Parts

Until now, all discussion has been OS independent. On Linux, the Four Software Parts theory is expressed in its directory structure, which is classified and documented in the Filesystem Hierarchy Standard (FHS). The FHS is part of the LSB (Linux Standard Base), which is a good thing, because the whole industry is moving toward the LSB, and it is on the minds of all distributions. The FHS defines the directories in which each piece of Apache, Samba, Mozilla, KDE, and your software must go. That's reason enough to use it, but there are other reasons as well:

  1. FHS is a standard, and without standards we have chaos.

  2. This is the most basic OS organization, related to access levels and security, where users intuitively find each type of file.

  3. It makes users' lives easier.

This last reason already justifies FHS adoption, so always be guided by the FHS!

FHS Summary

Let's summarize what the FHS has to say about Linux directories:

Linux system directories

/usr/bin
Directory for the executables that are accessed by all users (everybody has this directory in his $PATH). The main files of your software will probably be here. You should never create a subdirectory under this directory.

/bin
Like /usr/bin, but here you'll find only boot process-vital executables that are simple and small. Your software (being high-level) probably doesn't have anything to install here.

/usr/sbin
Like /usr/bin, but contains only the executables that must be accessed by the administrator (root user). Regular users should never have this directory in their $PATH. If your software is a daemon, this is the directory for some of its executables.

/sbin
Like /usr/sbin, but only for the boot process-vital executables, and some that will be accessed by sysadmin for some system maintenance. Commands like fsck (filesystem check), init (father of all processes), ifconfig (network configuration), mount, etc., can be found here. It is the system's most vital directory in many ways.

/usr/lib
Contains dynamic libraries and support static files for the executables at /usr/bin and /usr/sbin. You may create a subdirectory such as /usr/lib/myproduct to contain your helper files or dynamic libraries that will be accessed only by your software without user intervention. A subdirectory here may be used as a container for plugins and extensions.

/lib
Like /usr/lib but contains dynamic libraries and support static files needed in the boot process. You'll never find an executable at /bin or /sbin that needs a library that is outside this directory. Kernel modules (device drivers) are under /lib.

/etc
Contains configuration files. If your software uses several files, put them in a subdirectory such as /etc/myproduct/.

/var
The name comes from "variable", because everything under this directory changes frequently. /var is often mounted on a separate high-performance partition. In /var/log, logfiles grow; for web content we use /var/www; and so on.

/home
Contains the users' (real human beings') home directories. Your software package should never install files here (during installation). If your business logic requires a special UNIX user (not a human being) to be created, you should assign him a home directory under /var or some other place outside /home. Do not forget that.

/usr/share/doc, /usr/share/man
The "share" word is used because what is under /usr/share is platform independent, and can be shared among several machines thru a network filesystem. Portanto this is the place for manuals, documentations, examples etc.

/usr/local, /opt
These directories are obsolete. When UNIX didn't have a package system (like RPM), sysadmins needed to separate an optional (or local) application from the main OS, and these were the directories used for that.

You may think it is a bad idea to break your software (as a whole) into many pieces, instead of keeping it all under a self-contained directory. But a package system (RPM) has a database that manages it all for you in a very professional way, taking care of configuration files, directories, etc. And if you spread your software according to the FHS, besides user friendliness, you'll give the sysadmin an intuitive way to configure it, with better performance and security.

Examples Using the FHS

Now that we know where each part of our software must be installed, let's review the Universal Parts Table as applied to the FHS.

Table 2. Same Software, applying FHS

Database Server
    Software on its own: /usr/bin/, /usr/lib/, /usr/share/doc/mydb/, /usr/share/doc/mydb/examples/
    Configurations: /etc/mydb/
    Content: /var/db/instance1/, /var/db/instance2/, etc.
    Logs, dumps, etc.: /var/db/instance1/transactions/, /var/log/db/access-instance1.log, /var/log/db/access-instance2.log

Text Editor
    Software on its own: /usr/bin/, /usr/lib/, /usr/lib/myeditor/plugins/, /usr/share/myeditor/templates/, /usr/share/doc/myeditor/
    Configurations: $HOME/.myeditor.conf
    Content: $HOME/Docs/
    Logs, dumps, etc.: $HOME/.myeditor-tmp/

MP3 Generator
    Software on its own: /usr/bin/, /usr/lib/, /usr/lib/mymp3/plugins/, /usr/share/doc/mymp3/
    Configurations: $HOME/.mymp3.conf
    Content: $HOME/Music/
    Logs, dumps, etc.: $HOME/.mymp3-tmp/

Web Server
    Software on its own: /usr/sbin/, /usr/bin/, /usr/lib/httpd-modules/, /usr/share/doc/httpd/, /usr/share/doc/httpd/examples/
    Configurations: /etc/httpd/, /etc/httpd/instance1/, /etc/httpd/instance2/
    Content: /var/www/, /var/www/instance1/, /var/www/instance2/
    Logs, dumps, etc.: /var/log/httpd/, /var/log/httpd/instance1/, /var/log/httpd/instance2/

E-Mail Server
    Software on its own: /usr/sbin/, /usr/bin/, /usr/lib/, /usr/share/doc/mymail/
    Configurations: /etc/mail/, /etc/mailserver.cf
    Content: /var/mail/
    Logs, dumps, etc.: /var/spool/mailqueue/, /var/log/mail.log

Friendly Advice: Never use /opt or /usr/local

This is a very polemical subject, and that is why this is the most important section in this document. After almost ten years of UNIX experience, I can beat any /usr/local-prone argument. I'll try to organize some ideas here.

It is very important to Linux's evolution and popularization (especially on the desktop battlefield) that developers stop using these directories and start using the FHS. After reading this section, if you still think these directories are good business, please drop me an e-mail.

Products that are entirely installed under one directory use the self-contained approach, which has several problems:

  1. It forces the user to change environment variables like $PATH and $LD_LIBRARY_PATH to use your product.

  2. It puts files in non-standard places, complicating system integration and the future installation of extensions to your product.

  3. The sysadmin probably didn't prepare disk space in these partitions, generating problems at installation time.

  4. It is an accepted approach only for purely graphical applications, without the command-line concept. This is why it was well accepted in Windows. But...

  5. ...even using this approach, you can't avoid installing or changing files in standard locations to, for instance, make your icons appear on the user's desktop.

These directories exist for historical reasons, and for compatibility with other UNICES, from the era when we didn't have a package management system and the sysadmin needed to segregate products to keep control. And the last item shows us that even if you try, you'll not be able to segregate your product; you'll only make it distant and impractical for the user.

You don't have to be afraid of spreading your files according to the FHS, because RPM will keep an eye on them.
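For instance, RPM can always tell you where every file of a package went, and which package owns a given file. A quick sketch, using the hypothetical MySoftware package from the earlier examples (the file list shown is illustrative):

bash# rpm -ql MySoftware
/etc/MySoftware.conf
/usr/bin/MySoftware
/usr/share/doc/MySoftware-1.0/README
bash# rpm -qf /etc/MySoftware.conf
MySoftware-1.0-3

The RPM database, not a self-contained directory, is what keeps track of your files.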

Many developers believe that the self-contained approach lets them work with several versions of the same product, for testing purposes or whatever. Yes, feel free to agree with this or any good reason on the planet. But remember that a High Quality (or Commercial Grade) Software Product's objective is to be practical for the final user, even if that is inconvenient for developers and testers. Invite yourself to visit an inexperienced user (but potential customer) and watch him install your product.

If you have a business requirement for the user to work with several versions of your product simultaneously (or some other reason), make a relocatable package, which is described in the book Maximum RPM. Be aware also of the implications of using this feature, described in the same book.

Note that distributions like Red Hat and derivatives always use the FHS, instead of /opt or /usr/local. Read what Red Hat says about this subject, and think about it.

The makefiles of an open source program that is portable to other UNICES may default to installation under /usr/local for compatibility reasons. But they must also give the option, and encourage the packager, to create the package following the FHS specifications.

Provide Architecture for Extensions and Plugins

You'll probably let other software vendors plug extensions into your product. Since you are the author of the initial package, it is your responsibility to organize it in such a way that the user can simply install the extension RPM and use it, without having to modify any configuration file. It is again the famous Install-and-Use concept that guarantees ease of use.

An extension is nothing more than some files in the proper format (DLLs that implement the API your software defined), put in the right directory (where your software looks for extensions). Yet we see many applications that request the user to change configuration files to "declare" the presence of a new plugin.

The most important thing to consider in your plugin architecture is not to share files between plugins and your software. You should provide an architecture where plugins fully install and uninstall themselves simply by adding and removing files in specific directories, documented in your program. Good candidates are /usr/lib/myproduct/plugins as the plugin directory and /etc/myproduct/plugins as the plugin configuration directory. Your software and plugins must be sufficiently intelligent to find files, especially configurations, in these directories. Using this approach, no post-install procedure is required from the user or from the plugin provider.
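As a minimal sketch of this idea, assuming plugins are delivered as shared objects and the directory names suggested above, the host program could enumerate them like this (all file names are hypothetical):

#!/bin/sh
# Hypothetical plugin discovery for "myproduct"
PLUGIN_DIR=/usr/lib/myproduct/plugins
PLUGIN_CONF_DIR=/etc/myproduct/plugins

for plugin in "$PLUGIN_DIR"/*.so; do
	[ -e "$plugin" ] || continue    # no plugins installed yet
	name=`basename "$plugin" .so`

	# Source the plugin's own configuration, if it shipped one
	[ -r "$PLUGIN_CONF_DIR/$name.conf" ] && . "$PLUGIN_CONF_DIR/$name.conf"

	echo "Loading plugin: $name"
done

A plugin RPM then only has to drop its .so and its .conf into these two directories, and remove them on uninstall; the host program notices the plugin on its next start.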

Plugins in the Abstract

I would like to close this subject by inviting the reader to consider how any program can be thought of as an extension to the lower level software. In the same way a third party plugin is an extension to your software, your software is also an extension to the OS (lower level). This is where all the Integration (from the title of this document) magic lives. So we can apply all the ease-of-use concepts we discussed before to the plugin architecture design of your software.

Always Provide RPM Packages of Your Software

This is extremely important for many reasons:

  1. Ease-of-use. This is always the primary motivation.

  2. Automation of some tasks that must be done before and after the installation of your software, which again has ease-of-use implications.

  3. Intelligent management of configuration files, documentation, etc., providing more control during an upgrade.

  4. Management of interdependencies with other packages and versions, guaranteeing good functionality.

  5. Provision for distribution of software with your company's digital signature, and integrity checks (MD5) on each file, guaranteeing integrity and reporting unauthorized file modifications.

  6. Provision for tools to let the administrator interact with your graphic installer.

But a good package is not just putting your files in an RPM. The FHS must be followed, configuration and documentation files must be marked as such, and the pre- and post-install scripts must be robust, so that they cannot damage the system (remember that the installation process is run by root).
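As a sketch of what "marked as such" means, here is a hypothetical %files section of a spec file. The package name and paths are illustrative, but %doc, %config(noreplace), and %dir are the real RPM directives for the job:

%files
%doc /usr/share/doc/MySoftware-1.0/
/usr/bin/MySoftware
/usr/lib/MySoftware/

# Configuration: preserved on upgrade instead of being overwritten
%config(noreplace) /etc/MySoftware.conf

# Content and log directories: created empty, owned by the package
%dir /var/MySoftware
%dir /var/log/MySoftware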

Thoroughly familiarize yourself with RPM, because it can bring much power and convenience to you and your users. There is a lot of documentation about RPM available on the Internet.

Software Package Modularization

You should give users the option to install only the parts of your software they want. Imagine your application has a client part and a server part, which have files and libraries in common. You should break your application into three RPMs. For instance, let's say the name of your product is MyDB. You'll provide these packages:

  1. MyDB-common-1.0-3.i386.rpm

  2. MyDB-server-1.0-3.i386.rpm

  3. MyDB-client-1.0-3.i386.rpm

The last two packages depend on the first. If the user is installing a client profile, he will use:

  1. MyDB-common-1.0-3.i386.rpm

  2. MyDB-client-1.0-3.i386.rpm

If he is installing a server profile:

  1. MyDB-common-1.0-3.i386.rpm

  2. MyDB-server-1.0-3.i386.rpm

This approach will help the user save disk space and be aware of how your software is organized.
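The dependency itself is declared in the spec files of the client and server packages, so RPM can enforce the split. A sketch, with illustrative version numbers:

# In the MyDB-client spec file (MyDB-server carries the same line):
Requires: MyDB-common = 1.0

# A client-profile installation is then a single command; rpm checks
# the dependency and refuses to install the client alone:
bash# rpm -ivh MyDB-common-1.0-3.i386.rpm MyDB-client-1.0-3.i386.rpm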

Security: The Omnipresent Concept

From a very general perspective, security is a synonym for maintaining order and awareness, and insecure is everything that runs counter to this idea. So besides open network ports or weak cryptography (both of which are beyond the scope of this document), applications that force the user to run them as root, or that make him change files in protected places, are considered insecure. We can say the same about applications that fill up a filesystem that is vital to the OS.

Many standards have resulted from good practices discussed and developed over a long time and with a lot of experience. So you should know and use them when you package your software, because they are key to achieving a good organization (security) level.

Graphical User Interfaces

Everybody loves graphical interfaces. Many times they make our lives easier and, in this way, help to popularize software, because the learning curve becomes shallower. But for everyday use, a command at the console prompt, with many options and a good manual, is much more practical, making scripting easy and allowing remote access, etc. So the suggestion is, whenever possible, to provide both interfaces: graphical for beginners, and a powerful command line for experts.

KDE, GNOME, Java or Motif?

Better than a simple graphical interface is a consistent, integrated desktop. And the desktops today in Linuxland are KDE and GNOME. Always try to use one of them, or both.

KDE is the most prominent, offering a truly consistent desktop, flexibility, and an extremely elegant architecture that employs components and intercommunication, among other features. It is constantly evolving and is developed in C++. Its applications have a familiar, integrated look-and-feel. It is light and mature.

GNOME also uses the integrated desktop concept, but it is far from the maturity and ease-of-use of KDE. On the other hand, it is very well supported by the community, and substantial improvements are appearing.

Motif isn't an integrated desktop. It is a widget library (buttons, scrollbars, etc.) plus a window manager. It was born commercial, is mature, and is popular in commercial applications. But Motif is considered obsolete in light of KDE and GNOME, which integrate the desktop. Motif's source code has been opened by the Open Group and renamed OpenMotif.

Java is being used more and more for graphical interfaces, especially in server software, where the graphical part is only a helper for configuration and administration.

Web Interface: Access from Anywhere

Nowadays every desktop has a browser, and if your product is a server application, a web interface is the right choice, because it lets a user administer it from anywhere. But keep in mind the security and organization of your CGIs, because they can become front doors for crackers. The web interface (CGI) is a completely different programming paradigm. Try to understand it conceptually first, starting from "how a web server works", "what a URL is", etc., to avoid compromising your product's security.
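Conceptually, a CGI is just a program the web server executes: the request arrives in environment variables, and whatever the program prints to standard output goes back to the browser. A minimal sketch (the script path is hypothetical; SERVER_NAME and REMOTE_ADDR are standard CGI variables):

#!/bin/sh
# Hypothetical /var/www/cgi-bin/status.cgi
echo "Content-type: text/html"
echo ""    # the blank line ends the HTTP headers

echo "<html><body>"
echo "<p>Served by $SERVER_NAME for $REMOTE_ADDR</p>"
echo "</body></html>"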

Wizards and Graphical Installers

Especially if it is a commercial product, your application must provide a graphical installer. Believe me, they are impressive in a demonstration, and CIOs love them.

More than just overseeing installation, a wizard helps in the initial configuration of your product, collects information such as activation keys, and shows the license agreement.

A wizard should not do more than this:

  1. Ask which modules to install, presented to the user as checkboxes.

  2. Get the necessary information to build an initial configuration (the soul) for the software.

  3. Install the selected modules, which are in fact RPM files. Each checkbox must represent one or more RPMs, because each RPM is an indivisible (atomic) portion of the software.

  4. After the RPM(s) are installed, change the configuration (soul) files (marked this way in the RPMs), or create some content, based on the data the user gave the wizard.

So the wizard hides the RPM installation and writes the initial personalization. RPM is still responsible for putting all your software's files in their correct places; this role should never fall to your installer. Remember that an experienced user (there are a lot of them in the Linux world) should be able to reproduce your product's installation without the graphical help, using only RPM commands. In fact, in big data centers, where people make mass installations, a graphical installer only gets in the way.

RPM provides tools that help your graphical installer interact with it, such as the installation percentage viewer. Documentation can be found in the RPM manual (man rpm) and in the Maximum RPM book.
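For example, the --percent option (documented in the rpm manual) makes rpm print machine-readable progress percentages, precisely so that another tool can drive a progress bar. A wizard might invoke it like this, with an illustrative package name, and parse the percentages from rpm's standard output:

bash# rpm -U --percent MyDB-common-1.0-3.i386.rpm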

Starting Your Software Automatically on Boot

The way Linux starts (and stops) all its subsystems is very simple and modular. This lets you define initialization order, runlevels, and so on.

From BIOS to Subsystems

Let's review what happens when we boot Linux:

  1. The BIOS or a bootloader (lilo, zlilo, grub, etc.) loads the Linux kernel from disk into memory, with some parameters defined in the bootloader configuration. We can watch this process through the dots that appear on the screen. The kernel file stays in the /boot directory and is accessed only at this moment.

  2. In memory, kernel code starts to run, detecting a series of vital devices, disk partitions, etc.

  3. One of the last things the kernel does is mount the / (root) filesystem, which must contain the /etc, /sbin, /bin, and /lib directories.

  4. Immediately thereafter, it calls the program named init (/sbin/init) and passes control to it.

  5. The init command will read its configuration file (/etc/inittab) which defines the system runlevel and some shell scripts to be run.

  6. These scripts will continue the setup of the system's minimal infrastructure, mounting other filesystems (according to /etc/fstab), activating swap space (virtual memory), etc.

  7. The last step, and most interesting for you, is the execution of a special script called /etc/rc.d/rc, which initializes the subsystems according to a directory structure under /etc/rc.d. The name rc comes from "run commands."

Runlevels

The runlevel mechanism lets Linux initialize itself in different ways, and also lets us change from one profile (runlevel) to another without rebooting.

The default runlevel is defined in /etc/inittab with a line like this:

Example 3. Default runlevel (3, in this case) line in /etc/inittab

id:3:initdefault:

Runlevels are numbers from 0 to 6, and each one of them is used according to this standard:

Runlevel Notes
0 Halts the system. When switching to this runlevel, all subsystems are gracefully deactivated before the shutdown. Don't use it in the initdefault line of /etc/inittab.
1 Single-user mode. Only vital subsystems are initialized, because this runlevel is used for system maintenance. No user authentication (login) is required; a command line is returned directly to the user.
2 Historical, and similar to 3, but without NFS.
3 Used when the system is in full production. Take it as the runlevel where your software will run.
4 Not used. You can define it as you want, but it is uncommon to do so.
5 Like 3, plus a graphical login. It is ideal for a desktop workstation. Use 3 if the machine will be used as a server, for security and performance reasons.
6 Like runlevel 0, but after the complete stop, the machine is rebooted. Don't use it in the initdefault line of /etc/inittab.

You can switch from one runlevel to another with the telinit command, and you can see the current and previous runlevels with the runlevel command. See below how we switch from runlevel 3 to 5:

bash# runlevel
N 3
bash# telinit 5
bash# runlevel
3 5
bash#

The Subsystems

Examples of subsystems are a web server, a database server, the OS network layer, etc. We'll not consider a user-oriented application (like a text editor) to be a subsystem.

Linux provides an elegant and modular way to organize subsystem initialization. An important fact to think about is subsystem interdependency: for instance, it makes no sense to start a web server before the basic networking subsystem is active.

Subsystems are organized under the /etc/init.d and /etc/rc.d/rcN.d directories:

/etc/init.d

Each installed subsystem puts a control program in this directory: a script that follows a simple standard, described below. This is a simplified listing of this directory:
Example 4. Subsystems installed in /etc/init.d
bash:/etc/init.d# ls -l
-rwxr-xr-x 1 root root 9284 Aug 13 2001 functions
-rwxr-xr-x 1 root root 4984 Sep 5 00:18 halt
-rwxr-xr-x 1 root root 5528 Nov 5 09:44 firewall
-rwxr-xr-x 1 root root 1277 Sep 5 21:09 keytable
-rwxr-xr-x 1 root root 487 Jan 30 2001 killall
-rwxr-xr-x 1 root root 7958 Aug 15 17:20 network
-rwxr-xr-x 1 root root 1490 Sep 5 07:54 ntpd
-rwxr-xr-x 1 root root 2295 Jan 30 2001 rawdevices
-rwxr-xr-x 1 root root 1830 Aug 31 09:29 httpd
-rwxr-xr-x 1 root root 1311 Aug 15 14:18 syslog

/etc/rc.d/rcN.d (N is the runlevel indicator)

These directories must contain only special symbolic links to the scripts in /etc/init.d. This is how it looks:
Example 5. /etc/rc3.d listing
bash:/etc/rc3.d# ls -l
lrwxrwxrwx 1 root root 18 Jan 14 11:59 K92firewall -> ../init.d/firewall
lrwxrwxrwx 1 root root 17 Jan 14 11:59 S10network -> ../init.d/network
lrwxrwxrwx 1 root root 16 Jan 14 11:59 S12syslog -> ../init.d/syslog
lrwxrwxrwx 1 root root 18 Jan 14 11:59 S17keytable -> ../init.d/keytable
lrwxrwxrwx 1 root root 20 Jan 14 11:59 S56rawdevices -> ../init.d/rawdevices
lrwxrwxrwx 1 root root 16 Jan 14 11:59 S56xinetd -> ../init.d/xinetd
lrwxrwxrwx 1 root root 18 Jan 14 11:59 S75httpd -> ../init.d/httpd
lrwxrwxrwx 1 root root 11 Jan 13 21:45 S99local -> ../rc.local


Note that all link names have a prefix starting with the letter K (from Kill, to deactivate) or S (from Start, to activate), and a two-digit number that defines the boot activation priority. In our example we have httpd (priority 75) starting after the network (priority 10) subsystem. And the firewall subsystem will be deactivated (K) in this runlevel.

So, to make your software start automatically in the boot process, it must be a subsystem. We'll see how to do this in the following section.

Turning Your Software Into a Subsystem

Your software's files will be spread through the filesystem, but you'll want to provide a simple and consistent interface to let the user at least start and stop it. The subsystem architecture promotes this ease of use, while also providing a way for your software to be started automatically at system initialization. You just have to create your /etc/init.d script following a standard to make it functional.

Example 6. Skeleton of a Subsystem control program in /etc/init.d
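The following is a minimal sketch of such a script, assuming Red Hat's conventions: the /etc/init.d/functions library (which provides the daemon, killproc, and status helpers) and the chkconfig header comments. The daemon name mysystemd and the priorities are illustrative:

#!/bin/sh
#
# mysystem	Startup script for the MySystem subsystem
#
# chkconfig: 345 75 25
# description: MySystem example subsystem.

# Red Hat's function library: provides daemon, killproc, status
. /etc/init.d/functions

case "$1" in
  start)
	echo -n "Starting MySystem: "
	daemon /usr/sbin/mysystemd
	echo
	touch /var/lock/subsys/mysystem
	;;
  stop)
	echo -n "Stopping MySystem: "
	killproc mysystemd
	echo
	rm -f /var/lock/subsys/mysystem
	;;
  status)
	status mysystemd
	;;
  reload)
	echo -n "Reloading MySystem: "
	killproc mysystemd -HUP
	echo
	;;
  restart)
	$0 stop
	$0 start
	;;
  *)
	echo "Usage: mysystem {start|stop|status|reload|restart}"
	exit 1
esac

exit 0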

The mysystem subsystem methods you implemented will be called by users with a service command such as this example:

Example 7. service command usage

bash# service mysystem start
Starting MySystem: [ OK ]
bash# service mysystem status
Subsystem MySystem is active with pid 1234
bash# service mysystem reload
Reloading MySystem: [ OK ]
bash# service mysystem stop
Stopping MySystem: [ OK ]
bash#

You don't have to worry about managing the symbolic links in /etc/rc.d/rcN.d. The chkconfig command does that for you, based on the control comments defined at the beginning of your script.

Example 8. Using the chkconfig command

bash# chkconfig --add mysystem
bash# chkconfig --del mysystem

Read the chkconfig manual page to see what more it can do for you.

Packaging Your Boot Script

When you create the RPM, put your subsystem script in /etc/init.d, and do not include any /etc/rc.d/rcN.d links, because it is the user's decision whether to make your subsystem automatic or not. If you include them and the user makes any change, the RPM file inventory will become inconsistent.

The symbolic links must be created and removed dynamically by your package's post-installation and pre-uninstallation scripts, using the chkconfig command. This approach guarantees 100% package and filesystem consistency.
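A minimal sketch of those two scripts in the spec file (the subsystem name mysystem is illustrative). The "$1" test in %preun distinguishes a real uninstall from an upgrade, so the links survive upgrades:

%post
# Register the subsystem; chkconfig reads the header comments of the script
/sbin/chkconfig --add mysystem

%preun
# $1 is the number of package instances remaining after this operation:
# 0 means complete removal, 1 or more means an upgrade
if [ "$1" = 0 ]; then
	/sbin/chkconfig --del mysystem
fi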

Appendices

A. Red Hat, About the Filesystem Structure

This text was taken from The Official Red Hat Linux Reference Guide

Why Share a Common Structure?

An operating system's filesystem structure is its most basic level of organization. Almost all of the ways an operating system interacts with its users, applications, and security model are dependent upon the way it stores its files on a primary storage device (normally a hard disk drive). It is crucial for a variety of reasons that users, as well as programs at the time of installation and beyond, be able to refer to a common guideline to know where to read and write their binary, configuration, log, and other necessary files.

A filesystem can be seen in terms of two different logical categories of files:

  1. Shareable vs. unshareable files

  2. Variable vs. static files

Shareable files are those that can be accessed by various hosts; unshareable files are not available to any other hosts. Variable files can change at any time without system administrator intervention (whether active or passive); static files, such as documentation and binaries, do not change without an action from the system administrator or an agent that the system administrator has placed in motion to accomplish that task.

The reason for looking at files in this way has to do with the type of permissions given to the directory that holds them. The way in which the operating system and its users need to utilize the files determines the directory where those files should be placed, whether the directory is mounted read-only or read-write, and the level of access allowed on each file. The top level of this organization (the / directory) is crucial, as the access to the underlying directories can be restricted or security problems may manifest themselves if the top level is left disorganized (security=organization) or without a widely-utilized structure.

However, simply having a structure does not mean very much unless it is a standard. Competing structures can actually cause more problems than they fix. Because of this, Red Hat has chosen the most widely-used filesystem structure and extended it only slightly to accommodate special files used within Red Hat Linux.

B. About this Document

This document must be distributed under the terms of the GNU Free Documentation License, which makes it sufficiently free. Everybody is invited to contribute to its content and ideas.

Copyright 2002, Avi Alkalay.

The original version of this document can be found online at http://avi.alkalay.net/linux/docs/HighQuality/.

It was written originally in Brazilian Portuguese and then translated into English. SGML and the more-than-incredible DocBook were used, which made it possible to distribute this document in other formats, as found on the website.

It was finished (Portuguese+English) in mid-March 2002. Anything changed after that time is cosmetic.

I wrote it to help commercial companies and OpenSource developers make plug-and-play, easy-to-use software for Linux, and this way improve Linux usability and popularity.

All the concepts described here (from a high-level perspective) can be used in any UNIX flavor, or even in other OSes, like Windows. Maybe some day I'll write one of these for Windows... or Mac...
