
20 November 2009

Stand-alone (out-of-network) UNIX administration: back to tradition

The main issues in stand-alone UNIX systems administration are:
  • System startup and shutdown
  • User and group accounts management
  • System resources management
  • Filesystems
  • System quotas
  • System security
  • Backup and restoration of the system
  • Automating routine tasks
  • Printing and spooling system
  • Terminals and modem handling
  • Accounting
  • System performance tuning
  • System customization — kernel reconfiguration

UNIX — Introductory Notes

UNIX Operating System
UNIX is a popular time-sharing operating system originally intended for:
  • program development and
  • document preparation
but later widely adopted for a broad range of applications. UNIX is today’s most ubiquitous multi-user operating system, with no indication of any diminishment in the near future. Today, when a period of several years represents the lifetime of many successful IT products, UNIX is still considered the most stable and the most secure operating system on the market, four decades after its appearance. Of course, during 40 years of existence UNIX has changed a great deal, adapting to new requirements; it is hard to compare today’s modern UNIX flavors with the initial (now obsolete) UNIX versions.
In fact, these changes and adaptations are unique to the UNIX operating system; no other operating system has so successfully evolved, time and again, to meet modern needs.
The concept and basic design of UNIX deserve the credit for this remarkable longevity, as they provide the necessary flexibility for the permanent changes required to make UNIX suitable for many new applications. UNIX, like any other operating system, is an integrated collection of programs that act as links between the computer system and its users, providing three primary functions:
  1. Creating and managing a filesystem (sets of files stored in hierarchical-structured directories)
  2. Running programs
  3. Using system devices attached to the computer
UNIX was written in the C computer language, with careful isolation and confinement of machine-dependent routines, so that it might be easily ported to different computer systems. As a result, versions of UNIX were available for personal computers, workstations, minicomputers, mainframes, and supercomputers. Moreover, many turnkey systems simply use UNIX to support one or more applications. The users of these specialized systems generally interact with the application, but not UNIX itself. It is very possible that many users of UNIX systems do not actually know they are using UNIX, because their view of the system is restricted to the application running on top of the operating system. UNIX has also found its
way and gained popularity in the embedded world, which means, like the turnkey approach, UNIX is hidden from the user community. The embedded world contains a plethora of devices like cameras, controllers, handheld devices, and just about anything else that supports a computer processor, and UNIX can be used to provide a scalable, flexible system that can expand as the device’s capabilities improve over time.

It is somewhat curious to note that portability was not a design objective during UNIX development; rather, it came as a consequence of coding the system in a higher-level language.
Upon realizing the importance of portability, the designers of UNIX confined hardware-dependent code to a few modules within the kernel (coded in assembler) in order to facilitate porting.

The kernel is the “core” of the UNIX operating system. It provides the system’s basic services: process scheduling, memory management, filesystem access, and device I/O. Typically, the kernel interacts directly with the underlying hardware; therefore, it must be adapted to the unique machine architecture. However, there have been implementations of UNIX in which the kernel interacted with another underlying (layered) system that in turn controlled the hardware. The kernel keeps track of:
  • who is logged in
  • the locations of all files
It also accepts and carries out the instructions received from the shell as the output of interpreted commands, and it provides a limited number (typically between 60 and 200) of direct entry points through which an active process can obtain services from the kernel. These direct entry points are system calls (also known as UNIX internals). The actual machine instructions required to invoke a system call, along with the method used to pass arguments and results between the process and the kernel, vary from machine to machine.

The machine-dependent parts of the kernel were cleverly isolated from the main kernel code and were relatively easy to construct once their purpose had been defined. The machine-dependent parts of the kernel include:
  • Low-level system initialization and bootstrap
  • Fault, trap, interrupt, and exception handling
  • Memory management: hardware address translation
  • Low-level kernel/user mode process context switching
  • I/O device drivers and device initialization code
The rest of the UNIX kernel is extremely transportable and is largely made up of the system call interface from which application programs request services.

An early implementation of the UNIX kernel consisted of some 10,000 lines of C code and approximately 1000 lines of assembler code. These figures represent some 5 to 10% of the total UNIX code. When the original assembler version was recoded in C, the size and execution time of the kernel increased by some 30%. UNIX designers reasoned that the benefits of coding the system in a higher-level language far outweighed the resulting performance drawback. These benefits included
  • portability
  • higher programmer productivity
  • ease of maintenance, and
  • the ability to use complex algorithms to provide more sophisticated functions. Some of these algorithms could hardly have been contemplated if they were to be coded in assembly language.
UNIX supports multiple users on suitable installations with efficient memory management and appropriate communication interfaces. In addition to local users, log-in access and file transfer between UNIX hosts are also granted to remote users in the network environment.

Virtually all aspects of device independence were implemented in UNIX. Files and I/O devices are treated in a uniform way, by means of the same set of applicable system calls. As a result, I/O redirection and stream-level I/O are fully supported at both the command-language and system-call levels.

The basic UNIX philosophy, to process and treat different requests and objects in a uniform and relatively simple way, is probably the key to its long life. In a fast-changing environment in which high-tech products become obsolete after a few years, UNIX is still in full operational stage, four decades after its introduction.
UNIX owes much of its longevity to its integration of useful building blocks that are combinable according to current needs and preferences for the creation of more complex tools.
These basic UNIX blocks are usually simple, and they are designed to accomplish a single function well.
Numerous UNIX utilities, called filters, can be combined in remarkably flexible ways by using the facilities provided by I/O redirection and pipes. This simple, building-block approach is obviously more convenient than the alternative of providing complex utilities that are often difficult to customize, and that are frequently incompatible with other utilities.
UNIX’s hierarchical filesystem helps facilitate the sharing and cooperation among users that is so desirable in a program-development environment. A UNIX file system (or filesystem, as it has become known) spans volume boundaries, virtually eliminating the need for volume awareness among its users. This is especially convenient in time-sharing systems and in a network environment.

The major features of UNIX can be summarized as:
  • Portability
  • Multi-user operation
  • Device independence
  • Tools and tool-building utilities
  • Hierarchical filesystem

User’s View of UNIX
UNIX users interact with the system through a command-language interpreter called the
shell. A shell is actually what the user sees of the system; the rest of the operating system
is essentially hidden from the user’s eyes. A UNIX shell (or shells, because there are different
command-interpreters) is also a programming language suitable for the construction of versatile and powerful command files called shell scripts. The UNIX shell is written in the same way as any user process, as opposed to being built into the kernel. When a user logs into the system, a copy of the corresponding shell is invoked to handle interactions with the related user. Although the shell is the standard system interface, it is possible to invoke any user-specific process to serve in place of the shell for any specific user. This allows application-specific interfaces to coexist with the shell, and thus provide quite different views and working environments for users of the same system.

All programs invoked within the shell start out with three predefined files, specified by
corresponding file descriptors. By default the three files are:
  1. Standard input (also known as stdin, fd=0) — normally assigned to the terminal (console) keyboard
  2. Standard output (also known as stdout, fd=1) — normally assigned to the terminal (console) display
  3. Error output (also known as stderr, fd=2) — normally assigned to the terminal (console) display
The shell fully supports:
  • Redirection — Since I/O devices and files are treated the same way in UNIX, the shell treats the two notions as files. From the user’s viewpoint, it is easy to redefine file descriptors for any program, and in that way replace attached standard input and output files; this is known as redirection.
  • Pipes — The standard output of one program can be used as standard input in another program by means of pipes. Several programs can be connected via pipes to form a pipeline. Redirection and piping are used to make UNIX utilities called filters, which are used to perform complex compound functions.
  • Concurrent execution of the user programs — Users may indicate their intention to invoke several programs concurrently by placing their execution in the “background” (as opposed to the single “foreground” program that requires full control of the display). This mode of operation allows users to perform unrelated work while potentially lengthy operations are being performed in the background on their behalf.
Since UNIX was primarily intended for program development, it offers a number of facilities to support that work. Useful program development facilities of UNIX include
  • a general-purpose macro-processor, M4, that is language-independent and
  • the MAKE program which controls creation of other large programs. MAKE uses a control file (or description file) called MAKEFILE, which specifies source file dependencies among the constituent modules of a program. It identifies modules that are possibly out of date (by checking the last program update), recompiles them, and links them into a new executable program.
A much more elaborate system for large programming projects, called Source Code Control System — SCCS, was also available under UNIX.
Although SCCS was designed to assist production of complex programs, it can also be used to manage any collection of text files.
SCCS basically functions as a well-managed library of major and minor revisions of program modules. It keeps track of all changes, the identity of the programmers, and other information. It provides utilities for rolling back to any previous version, displaying complete or partial history of the changes made to a module, validation of modules, and the like.
The complexity of SCCS inspired a simpler system named Revision Control System — RCS, which is more suitable for managing text files. RCS provides most of the SCCS functionality in a simpler and more user-friendly way.
Users generally have restricted access to the UNIX filesystem; however, they are fully authorized in their home directories, where they can create their own subdirectories and files. This restricted-access approach is necessary to protect the system from intentional and unintentional corruption, while still allowing users to have full control over their own programs.

Filesystem protection in UNIX is accomplished by assigning ownership for each file and directory that is created. At creation:
  • the access modes for the three access classes (user-owner, group-owner, and others) are also specified.
  • Within each access class, three separate permissions are specified: for reading, writing, and execution of the file.
Since everything in UNIX is a file (or is file-like), this simple protection scheme is widely implemented throughout the whole operating system, making UNIX security and protection very efficient.

Finally, UNIX is extremely well suited for networking. One of the reasons for UNIX’s
enormous popularity and wide implementation lies in its inherent network-related characteristics. UNIX facilitates most network functions in such a way that it can appear as if the network had been designed expressly for the UNIX architecture. The truth is that UNIX
and modern networks have been developed independently, with UNIX preceding modern network architecture by a decade. The reason UNIX handles networking so well is simple:
UNIX’s flexible internal organization and structure allow an almost perfect union between
the UNIX and network environments.

The History of UNIX
Ken Thompson (later joined by Dennis Ritchie) wrote the first version of UNIX at Bell Labs
in the late 1960s. Everything started with MULTICS (MULTiplexed Information and Computing System), at that time a joint venture among GE, AT&T Bell Laboratories, and MIT. The next phase was the UNICS project (UNiplexed Information and Computing Service), created by some of the people from the MULTICS project (Ken Thompson, Dennis Ritchie, and Rudd Canaday). UNICS was an assembly-language, single-user system for the DEC PDP-7, a popular minicomputer of that time. Soon the system had been enhanced to support two users. The name UNICS was later changed to UNIX.

After a major rewriting in C and porting to the DEC PDP-11 family of computers, UNIX was made available to users outside of AT&T. At the time, AT&T was banned from selling computing equipment by the U.S. antitrust law, and so was forced to release UNIX practically for free. Favorable licenses for educational institutions were instrumental in the adoption of UNIX by many universities. Soon the mutual benefits for both the academic users and UNIX itself became obvious. The leader was the University of Berkeley, which adopted UNIX and tailored it significantly. UNIX also became commercially available from AT&T, together with several other variants of the system provided by other vendors. Two versions of UNIX emerged as the main UNIX platforms, with a number of “flavors” between them.

Berkeley Standard Distribution — BSD UNIX
BSD originated at the University of California at Berkeley and is also known as Berkeley UNIX. Since the 1970s, a number of BSD-based UNIX releases have been derived from version 4.3 BSD, which for a long time was the dominant version in the university and engineering communities. At the same time, the even older version 4.2 BSD UNIX is still in use in some commercial implementations. The evolution of BSD is illustrated in Figure 1.1.

Sun Microsystems was most successful at bringing UNIX into the commercial world with its SunOS, which was originally based on BSD UNIX but incorporated many improvements of its own. SunOS 4.1.x (mostly referred to simply as SunOS) was the best-known representative of the mostly BSD UNIX flavors. The word “mostly” indicates a number of SunOS features that did not originate in the Berkeley version of UNIX. SunOS also introduced many new features (NIS, NFS, etc.) that later became standards throughout the UNIX community. In the 1990s, Sun Microsystems replaced this very successful UNIX version with the next-generation SunOS 5.x, better known as Solaris. The new version represented a significant shift from BSD UNIX toward System V UNIX. SunOS continues to exist thanks to many operating commercial installations; it survived the “Year 2000 syndrome” and is still supported by Sun Microsystems.

System V or ATT UNIX
System V was derived from the earlier System III, developed at AT&T Bell Labs, which is why it is also known as ATT UNIX. For a long time, the best-known versions were Release 3 — SVR3.x and Release 4 — SVR4.x. SVR4 attempted to merge older UNIX versions (SVR3 and 4.2 BSD) into a new, more powerful UNIX system; the attempt was not a complete success, although its overall contribution has been significant. Certain steps in the development of System V UNIX during this period are illustrated in Figure 1.2.

Later on, many vendors accepted System V UNIX as a base for their own, vendor-specific UNIX flavors.
However, it is not fair to classify all of these vendor-specific UNIX flavors as System V UNIX; such a statement would be quite biased. Each vendor-specific flavor includes elements from both main UNIX platforms, so we can talk about mostly BSD, or mostly ATT, UNIX flavors. It is even better to talk about BSD or ATT implementations in particular segments of vendor-specific UNIX flavors.

In the 1980s, Richard Stallman started development of a free C compiler for UNIX. He then founded the Free Software Foundation — FSF, the organization behind the GNU project (GNU stands for “GNU’s Not Unix”). The FSF still manages, just as it did when it started, many free pieces of UNIX-related software, such as the GNU C compiler (GCC) and emacs.

UNIX development in the last decade has been characterized by many vendor-specific UNIX flavors on the market. It is difficult to consider them as part of two main UNIX platforms. Each vendor tried to take the best from each of the main UNIX platforms to make a flavor better than the other vendors. In that light we can focus on, and talk about, development within individual flavors. And each of these flavors does have a certain impact on the overall trends in the UNIX development.

In its early days, UNIX was primarily run on high and mid-range computers, minicomputers, and relatively powerful workstations (by that time’s standards). The appearance of microcomputers presented a new challenge for UNIX. Microsoft wrote a version of UNIX for microcomputer-based systems. Called XENIX, it was licensed to the Santa Cruz Operation and was closest to System V UNIX. It was later renamed SCO UNIX; later still it merged with Unixware. Other commercial versions also became available, like Unixware, and even Solaris for x86.

However, the main contributor in this area of microcomputer-based UNIX is Linux, a free UNIX available to anyone who wants to work in the UNIX arena. Sometimes UNIX for microcomputers is classified as a third UNIX platform; here we will treat the different UNIX versions for microcomputers as UNIX flavors related to one of the two main UNIX platforms. In 1991, Linus Torvalds released his version of UNIX, called Linux. Linux was a complete rewrite, originally for the Intel 80386 architecture. Linux was quickly adopted and ported to other architectures (including Macintosh and PowerPC); currently there are ports of Linux for practically every 32- and 64-bit machine available.

Today it is very difficult to differentiate between microcomputers and workstations; the boundaries between them are indistinct. Tremendous IT development has made very powerful IT resources available at low prices. This burst of activity has had a very positive impact on UNIX, too — the number of installed UNIX sites has risen dramatically, more people have become involved in UNIX, and new application areas have been conquered. The best example of this IT boom is the Internet, which primarily relies on UNIX-based servers. A thorough knowledge of UNIX has become a prerequisite for any real success in IT. Figure 1.3 presents the main stages of the UNIX genealogy, showing mutual impacts among the different stages, both within and outside of the discussed UNIX platforms. For a fuller picture, this figure should continue with the list of today’s available UNIX flavors presented in Figure 1.4. (Note: Figure 1.4 is only a partial list of the many UNIX flavors currently in use, and in no way indicates the extent of an individual flavor’s usage.)

UNIX System and Network Administration
Organizations that rely on computing resources to carry out their mission have always depended on systems administration and systems administrators. The dramatic increase in the number and size of distributed networks of workstations in recent years has created a tremendous demand for more, and better trained, systems administrators. Understanding of the profession of systems administration on the part of employers, however, has not kept pace with the growth in the number of systems administrators or with the growth in complexity of system administration tasks. Both at sites with a long history of using computing resources and at sites into which computers have only recently been introduced, system administrators sometimes face perception problems that present serious obstacles to their successfully carrying out their duties.

Systems administration is a widely varied task. The best systems administrators are generalists; they can
  • wire and repair cables
  • install new software
  • repair bugs
  • train users
  • offer tips for increased productivity across areas from word processing to CAD tools
  • evaluate new hardware and software
  • automate a myriad of mundane tasks, and
  • increase work flow at their site.
In general, systems administrators enable people to exploit computers at a level that gains leverage for the entire organization.

Employers frequently fail to understand the background that systems administrators bring to their task. Because systems administration draws on knowledge from many fields, and because it has only recently begun to be taught at a few institutions of higher education, systems administrators may come from a wide range of academic backgrounds. Most get their skills through on-the-job training by apprenticing themselves to a more experienced mentor. Although the system of informal education by apprenticeship has been extremely effective in producing skilled systems administrators, it is poorly understood by employers and hiring managers, who tend to focus on credentials to the exclusion of other factors when making personnel decisions.

System administrators are the professionals who provide specific services in the system software arena; they are often known by the acronym SYSADMIN. A system
administrator performs various tasks while taking care of multiple, often heterogeneous, computer systems in an attempt to keep them operational. When computer systems are connected to the network, which is almost always the case today, the system administration also includes network-related duties.

UNIX administrators are part of the larger family of system administrators. Their working platform is UNIX, and it carries many specific elements that make this job unique. UNIX is a powerful and open operating system. As with any other software system, it requires a certain level of customization (we prefer the term “configuration”) and maintenance at each site where it is implemented. Configuring and maintaining an operating system is a serious business; in the case of UNIX it can be a tough and sometimes frustrating job. Why is UNIX so demanding? Here are some observations:
  • A powerful system means there are many possibilities for setting the system configuration.
  • An open system results in permanent upgrades with direct impacts on administrative issues.
  • UNIX is implemented at the most mission-critical points, where downtime is not allowed.
  • Networking presents a new challenge, but also a new area of potential problems.
  • Different UNIX flavors bring additional system administration difficulties.
Networking in particular, with its many potential external failures, can affect a UNIX system significantly. Periodic global network degradation (excessive load, low throughput, or even breaks in communication) can cause complex problems and bring many headaches. It is easy to be misguided when tracing a problem, and to look for the source of trouble in the wrong place. Usually at such times everyone looks to the UNIX people for a quick solution. The only advice is: “Be ready for such situations.”

As a matter of fact, system and network administration are relatively distinct duties, and sometimes they are even treated separately. However, it is very common to look at system and network administration as two halves of the same job, with the same individuals or team responsible for both. It is fair to say that the term network administration is strictly related to the computer system as part of the network, and remains within the network service boundaries required for the computer functioning in the network environment. It does not cover core network elements like switches, bridges, hubs, routers, and other network-only devices. Nevertheless, the basic understanding of these topics also could make overall administration easier.

So to get to the heart of the topic, let us start with a brief discussion of the administrator’s role, duties, guidelines, policies, and other topics that make up the SYSADMIN business. Most of the paragraphs that follow are not strictly UNIX related, although our focus remains on UNIX systems and network administration.

System Administrator’s Job
Understanding system administrators’ background, training, and the kind of job performance to be expected is challenging; too often, employers fall back into (mis)using the job classifications with which they are familiar. These job classification problems are exacerbated by the scarcity of job descriptions for systems administrators.
  • One frequently used misclassification is that of programmer or software engineer. Production of code is the primary responsibility of programmers, not of systems administrators. Thus, systems administrators classified as programmers often receive poor evaluations for not being “productive” enough.
  • Another common misclassification is the confusion of systems administrators with operators. Especially at smaller sites, where systems administrators themselves have to perform many of the functions normally assigned to operators at larger sites, system administrators are forced to contend with the false assumption that they are nonprofessional technicians. This, in turn, makes it very difficult for systems administrators to be compensated commensurate with their skill and experience.
The following text lists the main elements that describe the system administrator’s job at various levels. The basic intention is to describe the core attributes of systems administrators at various levels of job performance, and to address site-specific needs or special areas of expertise that a systems administrator may have. Generally, as for many other professions, system administrators are classified regarding their background and experience into several categories:
  • Novices

    • Required background: 2 years of college or equivalent post-high-school education or experience
    • Desirable background: a degree or certificate in computer science or a related field. Previous experience in customer support, computer operations, system administration, or another related area; motivated to advance in the profession
    • Duties: performs routine tasks under the direct supervision of a more experienced system administrator; acts as a front-line interface to users, accepting trouble reports and dispatching them to appropriate system administrators

  • Junior
    • Required background: 1 to 3 years system administration experience
    • Desirable background: a degree in computer science or a related field, familiarity with networked/distributed computing environment concepts (for example, can use the route command, add a workstation to a network, and mount remote filesystems); ability to write scripts in some administrative language (Tk, Perl, a shell); programming experience in any applicable language
    • Duties: administers a small site alone or assists in the administration of a larger system; works under the general supervision of a system administrator or computer systems manager

  • Intermediate/Advanced
    • Required background: 3 to 5 years of systems administration experience
    • Desirable background: a degree in computer science or a related field; significant programming background in any applicable language
    • Duties: receives general instructions for new responsibilities from supervisor; administers a midsized site alone or assists in the administration of a larger site; initiates some new responsibilities and helps to plan for the future of the site/network; manages novice system administrators or operators; evaluates and/or recommends purchases; has strong influence on the purchasing process

  • Senior
    • Required background: more than five years previous systems administration experience
    • Desirable background: a degree in computer science or a related field; extensive programming background in any applicable language; publications within the field of system administration
    • Duties: designs/implements complex LANs and WANs; manages a large site or network; works under general direction from senior management; establishes/recommends policies on system use and services; provides technical lead and/or supervises system administrators, system programmers, or others of equivalent seniority; has purchasing authority and responsibility for purchase justification

This is a general job classification and description for potential UNIX administrators. It can easily vary from one site to another, especially regarding official job titles. A number of other skills could also be considered:
  • Interpersonal and communication skills; ability to write proposals or papers, act as a vendor liaison, make presentations to customer or client audiences or professional peers, and work closely with upper management
  • Ability to solve problems quickly and completely; ability to identify tasks that require automation and automate them
  • A solid understanding of a UNIX-based operating system, including paging and swapping, inter-process communication, devices and what device drivers do, filesystem concepts (inode, superblock), and use of performance analysis to tune systems
  • Experience with more than one UNIX-based operating system; with sites running more than one UNIX-based operating system; with both System V and BSD-based UNIX operating systems; with non-UNIX operating systems (for example, MS-DOS, Macintosh OS, or VMS); and with internetworking UNIX and other operating systems (MS-DOS, Macintosh OS, VMS)
  • Programming experience in an administrative language (shell, Perl, Tk); extensive programming experience in any applicable language
  • Networking skills — a solid understanding of networking/distributed computing environment concepts, principles of routing, client/server programming, and the design of consistent networkwide filesystem layouts; experience in configuring network filesystems (for example, NFS, RFS, or AFS), in network file synchronization schemes (for example, rdist and track), and in configuring automounters, license managers, and NIS; experience with TCP/IP networking protocols (ability to debug and program at the network level), with non-TCP/IP networking protocols (for example, OSI, Chaosnet, DECnet, Appletalk, Novell Netware, Banyan Vines), with high-speed networking (for example, FDDI, ATM, or SONET), with complex TCP/IP networks (networks that contain routers), and with highly complex TCP/IP networks (networks that contain multiple routers and multiple media); experience configuring and maintaining routers and maintaining a sitewide modem pool/terminal servers; experience with X terminals and with dial-up networking (for example, SLIP, PPP, or UUCP); experience at a site that is connected to the Internet, experience installing/configuring DNS/BIND; experience installing/administering Usenet news, and experience as postmaster of a site with external connections
  • Experience with network security (for example, building firewalls, deploying authentication systems, or applying cryptography to network applications); with classified computing; with multilevel classified environments; and with host security (for example, passwords, uids/gids, file permissions, filesystem integrity, use of security packages)
  • Experience at sites with over 1000 computers, over 1000 users, or over a petabyte of disk space; experience with supercomputers; experience coordinating multiple independent computer facilities (for example, working for the central group at a large company or university); experience with a site with 100% uptime requirement; experience developing/implementing a site disaster recovery plan; and experience with a site requiring charge-back accounting
  • Background in technical publications, documentation, or desktop publishing
  • Experience using relational databases; using a database SQL language; and programming in a database query language; previous experience as a database administrator
  • Experience with hardware: installing and maintaining the network cabling in use at the site, installing boards and memory into systems; setting up and installing SCSI devices; installing/configuring peripherals (for example, disks, modems, printers, or data acquisition devices); and making board-level and component-level diagnosis and repairing computer systems
  • Budget responsibility, experience with writing personnel reviews and ranking processes; and experience in interviewing/hiring
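The first requirement in the list above, identifying tasks that lend themselves to automation, is easiest to appreciate with a small script. The following is only a sketch; the threshold, the recipient, and the reporting method are hypothetical illustrations, not recommendations:

```shell
#!/bin/sh
# Hypothetical sketch: warn when any local filesystem exceeds a usage threshold.
# THRESHOLD and ADMIN are illustrative values, not site policy.
THRESHOLD=90
ADMIN=root

df -P | awk -v limit="$THRESHOLD" 'NR > 1 {
    sub(/%/, "", $5)                 # strip the % sign from the capacity column
    if ($5 + 0 > limit)
        printf "%s is at %s%% capacity\n", $6, $5
}' | while read -r line; do
    echo "WARNING: $line"            # a real script might instead run: mail -s "disk alert" "$ADMIN"
done
```

Run from cron, a few lines like these replace a daily manual check, which is exactly the kind of task the requirement has in mind.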
Do not be afraid of this long list of additional requirements. Nobody expects UNIX system and network administrators to be supermen. UNIX administration is a normal job that is demanding but definitely doable.

To end this discussion, here is a joke: consider the similarities between Santa Claus and UNIX administrators:
  • Santa is bearded, corpulent, and dresses funny
  • When you ask Santa for something, the odds of receiving what you wanted are infinitesimal
  • Santa seldom answers your mail
  • When you ask Santa where he gets all the stuff he has, he says, “Elves make it for me.”
  • Santa does not care about your deadlines
  • Your parents ascribed supernatural powers to Santa, but did all the work themselves
  • Nobody knows who Santa has to answer to for his actions
  • Santa laughs entirely too much
  • Santa thinks nothing of breaking into your HOME
  • Only a lunatic says bad things about Santa in his presence

Computing Policies
Successful system administration requires a well-defined framework. This framework is described by the corresponding computing policies of the organization where the administration is provided. There are no general computing policies; they are always site specific. Drafting computing policies, however, is often a difficult task, fraught with legal, political, and ethical questions and possible consequences. There are a number of related issues:
  • why a site needs computing policies
  • what a policy document should contain
  • who should draft it, and to whom it should apply
There is no unique list of all possible rules. Each computing site is different and needs its own set of policies to suit its specific needs. The goal here is to point out the main computing policies that directly influence system administration. This is not possible without addressing security and overall business policies as they relate to computing facilities and their use.

Good computing policies include comprehensive coverage of computer security. However, the full scope of security, overall business, and other policies goes well beyond computer use and sometimes may be better addressed in separate documents. For example, a comprehensive security document should address employee identification systems, guards, building structure, and other such topics that have no association with computing.
Computing security is a subset of overall security as well as a subset of overall computing policy.
If there are separate policy documents, they should refer to each other as appropriate and should not contain excessive redundancy. Redundancy leaves room for later inconsistencies and increases the work of document maintenance.

The system administrator policy usually is not completely separated from the user policy. In practice there are few if any user policies from which a system administrator needs to be exempt. System administrators are users and should be held accountable to the same user policy as everyone else in the use of their personal computer accounts. System administrators (and any other users with “extended” system access) have additional usage responsibilities and limitations regarding that extended access, i.e., extra powers via groups or root. The additional policies should address this extended access. Further, knowledge of the policies governing how staff members perform their duties (e.g., how frequently backups are done) is essential to the users. All the information on the operation of the computing facility should be documented and available to both the end users and the support staff, to prevent confusion and redundancy as well as to enhance communication. The policy documents should be considered a single guide for users and support staff alike. We intentionally used the words “computing policies” in the plural; it is hard to talk about a unique overall policy that could cover everything needed.

System administration is a technical job. System administrators are supposed to accomplish certain tasks, applying technical skills to enforce certain decisions based on certain rules. In other words, the system administrator should follow a specific administrative procedure to accomplish the needed task. A system administrator is not supposed to make non-technical decisions, nor to dictate the underlying rules. It is important to have feasible procedures, and in that sense the administrator’s opinion can be significant. But the underlying rules must be based primarily on the existing business-driven computing policies.

At the end of the day, we reach the point of asking: “Will a SYSADMIN really have strictly defined procedures for the daily work that make the administration job easier, and will these procedures exist in written form?” The most probable answer regarding procedures is negative. There are usually multiple ways to accomplish a certain administrative task, because system configurations keep changing (just think about different UNIX flavors, new releases, or network changes). However, this is not the case with computing policies; they are usually general enough to last much longer.

We already mentioned that computing policies are business related. They are different in academia than in industry; they are different in the financial industry than in the retail industry or in the Internet business. They are, at least for the moment, always internal and stay within the boundaries of a college, university, or company. So they can differ from one place to another. Still, there are many common elements, and we will try to address them.

Security policy — Definitely the most important policy; a good security policy is the best guarantee of uninterrupted business. Clear guidance in this direction is extremely important. The Requests for Comments (RFCs) that present standards for new technologies have also addressed this issue. RFC 2196, “Site Security Handbook,” a 75-page document written in 1997 by the IETF (Internet Engineering Task Force), suggests the need for internal security documents as guidelines for:
  • Purchasing of hardware and software
  • Privacy protection
  • Access to the systems
  • Accountability and responsibility of all participants
  • Authentication rules
  • Availability of systems
  • Maintenance strategy (internal vs. outsourcing)
Policy toward users — Users are the main players in the ongoing business, but they must obey certain rules, and they should not have unrestricted access to all available resources. It is crucial to define the following user rights and responsibilities:
  • Who is an eligible user
  • Password policy and its enforcement
  • Mutual relationship among users
  • Copyright and license implementation
  • Downloading of software from the Internet
  • Misusing e-mail
  • Disrupting services
  • Other illegal activities
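Password policy and its enforcement, the second item above, is one of the few user-policy areas that can be partially automated. A minimal sketch follows, assuming a passwd-format file; the file name is illustrative, and a real check would be run by root against /etc/shadow:

```shell
#!/bin/sh
# Hypothetical sketch of password-policy enforcement: flag accounts whose
# password field is empty in a passwd-format file. PWFILE is illustrative;
# a real audit would be run by root against /etc/shadow.
PWFILE=${1:-/etc/passwd}

awk -F: '$2 == "" { printf "account %s has an empty password field\n", $1 }' "$PWFILE"
```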
Policy toward privileged users — The primary audience for this policy is the SYSADMIN and other privileged users. These users have unrestricted access to all system resources and practically unlimited power over the systems. The policy addresses:
  • Password policy and its enforcement
  • Protection of user privacy
  • License implementation
  • Copyright implementation
  • Loyalty and obedience
  • Telecommuting
  • Monitoring of system activities
  • Highest security precaution and checkup
  • Business-time and off-business-time work
Emergency and disaster policies — Good policies mean prevention and faster recovery from disaster situations. They are essential to maintaining system availability and justify spending an appropriate amount of time protecting against future disaster scenarios. Data are priceless, and their loss could be fatal for the overall business. Emergency and disaster policies include:
  • Monitoring strategies
  • Work in shifts
  • Tools
  • Planning
  • Distribution of information (pager, beepers, phones)
  • Personnel
Backup and recovery policy — This is a must for each system; in the middle of a disaster, there is no bargaining about the need for backup. However, the level and frequency of the implemented backup vary and are business related. Generally the policy should address the following issues:
  • Backup procedures
  • Backup planning
  • Backup organization
  • Storage of backup tapes
  • Retention periods
  • Archiving
  • Tools
  • Recovery procedures
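The backup-procedure and retention-period items above can be sketched in a few lines of shell. Everything here is illustrative: the paths, the archive naming, and the number of retained generations are hypothetical choices, not policy:

```shell
#!/bin/sh
# Hypothetical backup sketch: archive one directory per day and keep a fixed
# number of generations. SRC, DEST, and KEEP are illustrative, not policy.
SRC=${1:-/home}
DEST=${2:-/var/backups}
KEEP=7

stamp=$(date +%Y%m%d)
mkdir -p "$DEST"
tar -cf "$DEST/backup-$stamp.tar" "$SRC" 2>/dev/null

# Retention: remove archives beyond the KEEP most recent ones.
ls -1t "$DEST"/backup-*.tar 2>/dev/null | tail -n +$((KEEP + 1)) | while read -r old; do
    rm -f "$old"
done
```

Real-world schemes add incremental levels, verification, and off-site tape rotation; the policy items above are what decide those details.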
Development policy — This policy should address the need for permanent development and upgrading of the production systems. Today, continual development of the IT infrastructure is essential for overall business growth; however, development should not endanger basic production. In that light, the focus should be on:
  • Development team
  • Planning
  • Support
  • Testing
  • Staging
  • Cutting new releases
  • Fallback
System administration will be easier if more computing policies are covered and elaborated internally and if more of the corresponding procedures are specified. It sounds strange, but less freedom in doing something usually makes the job easier. Unfortunately (or maybe fortunately) this is mostly the case only for large organizations with strong IT departments that have been running for years. The majority of medium-size and small companies have no specified procedures, or only rudimentary ones. The system administrator often does have freedom in enforcing the listed policies. This freedom increases the administrator’s responsibility, but it also enhances the creativity of the work (that is why we used the word “fortunately” earlier).

Administration Guidelines

Legal Acts
Computer networks and UNIX are quite young, but they have significantly affected all spheres of human life. Today the Internet is strongly pushing ahead to replace, or at least to alter, many traditional pieces of economic infrastructure:
  • the telecommunication industry
  • the entertainment industry
  • the publishing industry
  • the financial industry
  • postal services, and others.
All kinds of middleman services, such as
  • travel agencies
  • job agencies
  • book sellers, and music retailers
are also dramatically changing. Business-to-business (B2B) links are growing, providing an efficient mechanism to merge customers and merchants and make our online shopping easier. The full list of all affected businesses would be very, very long.

Such a huge area of human activity has also opened up possibilities for misuse, fraud, theft, and other kinds of crime. While technological and financial capabilities have fully supported the booming information technologies, the legal infrastructure seems to remain far below our real needs. In many cases, even when the perpetrator is caught, actual conviction is very difficult under current laws. Recent cases involving very destructive viruses that cost businesses millions of dollars stayed in limbo even though the perpetrators were known. The case against the “Napster Music Community,” relating to music copyrights, was closed after a long time and was only partially successful.

At this moment we have only a few legal acts in this area, covering only several computer-crime-related topics, and sometimes not even effectively. They definitely do not constitute a sufficient legal framework, and further improvements and expansions are necessary.
A pending problem in the implementation of these legal acts, as well as of others that will presumably come in the future, lies in the fact that even where corresponding laws exist in the United States, they do not exist in many other countries. Because of the global nature of the Internet and its presence in countries worldwide, it is very difficult to enforce any court decision.

Code of Ethics
The lack of general legal guidance, and often the lack of clear internal administration rules and procedures, presents new challenges in the system administrator’s job. More freedom in doing the job also means more chances for wrongdoing. Under such circumstances, a highly responsible attitude of administrators toward these challenges is very important. System administrators, regardless of their title and whether or not they are members of a professional organization, are relied upon to ensure the proper operation, support, and protection of computing assets (hardware, software, networking, etc.). Unlike problems with most earlier technologies, a problem with computing assets may negatively affect millions of users worldwide, so this protective role is even more crucial than its equivalent in other technologies. The ever-increasing reliance upon computers in all parts of society has given system administrators access to more information, particularly information of critical importance to the users, thus increasing the impact that any wrongdoing may have.

It is important that all computer users and administrators understand the norms and principles to be applied to the task. At the end of the day, we come to the informal set of behavioral codes known as the code of ethics, which each administrator should be aware of. A code of ethics supplies these norms and principles as canons of general concepts. Such a code must be applied by individuals, guided by their professional judgment, within the confines of the environment and situation in which they may find themselves. The code sets forth the commitments, responsibilities, and requirements of members of the system administration profession within the computing community. The basic purposes of such a code of ethics are:
  • To provide a set of codified guidelines for ethical directions that system administrators must pursue
  • To act as a reference for construction of local site acceptable-use policies
  • To enhance professionalism by promoting ethical behavior
  • To act as an “industry standard” reference of behavior in difficult situations, as well as in common ones
  • To establish a baseline for addressing more complex issues
This code is not a set of enforceable laws, or procedures, or proposed responses to possible administrative situations. It is also not related to sanctions or punishments as consequences of any wrongdoing. A partial overview of one proposal for the code of ethics follows:
  • Code 1: The integrity of a system administrator must be beyond reproach — System administrators must uphold the law and policies as established for the systems and networks they manage, and make all efforts to require the same adherence from the users. Where the law is not clear, or appears to be in conflict with their ethical standards, system administrators must exercise sound judgment and are also obliged to take steps to have the law upgraded or corrected as is possible within their jurisdiction.
  • Code 2: A system administrator shall not unnecessarily infringe upon the rights of users — System administrators will not exercise their special powers to access any private information other than when necessary to their role as system managers, and then only to the degree necessary to perform that role, while remaining within established site policies. Regardless of how it was obtained, system administrators will maintain the confidentiality of all private information.
  • Code 3: Communications of system administrators with all whom they may come in contact shall be kept to the highest standards of professional behavior — System administrators must keep users informed about computing matters that might affect them, such as conditions of acceptable use, sharing and availability of common resources, maintenance of security, occurrence of system monitoring, and any applicable legal obligations. It is incumbent upon the system administrator to ensure that such information is presented in a manner calculated to ensure user awareness and understanding.
  • Code 4: The continuance of professional education is critical to maintaining currency as a system administrator — Since technology in computing continues to make significant strides, a system administrator must take an appropriate level of action to update and enhance personal technical knowledge. Reading, study, acquiring training, and sharing knowledge and experience are requirements to maintaining currency and ensuring the customer base of the advantages and security of advances in the field.
  • Code 5: A system administrator must maintain an exemplary work ethic — System administrators must be tireless in their effort to maintain high levels of quality in their work. Day to day operation in the field of system administration requires significant energy and resiliency. The system administrator is placed in a position of such significant impact upon the business of the organization that the required level of trust can only be maintained by exemplary behavior.
  • Code 6: At all times system administrators must display professionalism in the performance of their duties — All manner of behavior must reflect highly upon the profession as a whole. Dealing with recalcitrant users, upper management, vendors, or other system administrators calls for the utmost patience and care to ensure that mutual respect is never at risk.

There are several UNIX and system administration related organizations, support groups, and conferences. Following are just a few words about the best known ones.
  • USENIX is the Advanced Computing Systems Association. It was originally a non-profit membership organization for individuals with an interest in UNIX, UNIX-related, and other modern operating systems. Since 1975 the USENIX Association has brought together the community of engineers, system engineers, system administrators, scientists, and technicians, all working on the cutting edge of the computing world. The USENIX conferences have become the meeting grounds for presenting and discussing new and advanced information on developments in computing systems. USENIX is dedicated to sharing the ideas and experiences of those working with UNIX and other advanced computing systems. USENIX members are dedicated to solving problems with a practical bias, fostering research that works, communicating the results of both research and innovation, and providing critical thought. USENIX supports its members’ professional and technical development through a variety of ongoing activities, including:

    • Member benefits
    • Annual technical and system administration conferences, as well as informal, specific-topic conferences
    • A highly regarded tutorial program
    • Student programs that include stipends to attend conferences, low student membership fees, best paper awards, scholarships, and research grants
    • Online library with proceedings from each USENIX conference
    • Participation in various IEEE and Open Group standards efforts
    • International programs
    • Cosponsorship of conferences by foreign technical groups
    • Prestigious annual awards which recognize public service and technical excellence
    • Membership in the Computing Research Association and the Open Group
    • SAGE, a Special Technical Group (STG) for system administrators

  • System Administrators Guild — SAGE. At the moment the System Administrators’ Guild (SAGE) is a Special Technical Group (STG) of the USENIX Association. It is organized to help advance computer systems administration as a profession, establish standards of professional excellence and recognize those who attain them, develop guidelines for improving technical capabilities, and promote activities that advance the state of the art of the community. SAGE members are also members of USENIX. Since its inception in 1992, SAGE has grown immensely and has matured into a stable community of system administration professionals. Organization management has been codified and stabilized. As a USENIX STG, SAGE is reviewed by USENIX periodically, principally to assess continued viability. SAGE’s viability has not been an issue for some
    time — quite the opposite: the growth of SAGE has exceeded reasonable expectations and those of USENIX as a whole. At this point in SAGE’s development, it is prudent for both SAGE and USENIX to review their organizational structures, their relationship, and future developments. To that end, the SAGE executive committee reviewed the existing mission statement, its relevance for the present and the future, and the future interests and projects as they relate to that mission. While the existing SAGE Charter and Mission Statement are still relevant, the following text was adopted as a working draft that better expresses its current nature and future:

    The System Administrators Guild is an international professional organization for people involved in the practice, study, and teaching of computer and network system administration. Its principal roles are:
    • To always understand and satisfy the needs of system administrators so as to provide them with products and services that will help them be better system administrators
    • To empower system administrators through information, education, relationships, and resources that will enrich their professional development and careers
    • To advance the thought, application, and ethical practice of system administration
    As SAGE grows, the majority of its members will be professionals who are not currently involved with SAGE. This will come as a result of the growing awareness of SAGE, different certification programs, and other future projects.
    The SAGE executive committee, the USENIX board of directors, and USENIX staffs have discussed how to meet the growing needs of SAGE. At this time, there are ideas that these needs may be better met by changing SAGE from a USENIX internal STG to a sister organization established as an independent nonprofit entity. If this process continues as expected, this transition could be implemented soon. The SAGE executive committee to be elected will become the initial board of directors of SAGE. The precise legal structure and implementation details are yet to be determined.
    In this plan, SAGE will continue to serve its members with the benefits with which they have become accustomed. SAGE member services and information will move to a more electronic community model. SAGE will publish its own newsletter while SAGE news will continue to be available as before. LISA will continue to be cosponsored by USENIX and SAGE. SAGE will also sponsor new conferences and programs to reach out to the broader system and network administration community. All the assets of USENIX used exclusively by SAGE will be transferred to the independent SAGE organization, including intellectual property, inventory, and current operating funds. SAGE will then operate independently from USENIX. The LISA conference will continue without change, being operated by USENIX and cosponsored by SAGE. The responsibility for all current and pending SAGE projects will also be transferred. Membership in USENIX and SAGE will be decoupled such that a person can become a member of SAGE without having to become a USENIX member. However, SAGE and USENIX will continue to provide close cooperation and mutual benefits to their members.

  • Conferences — One of the ongoing activities of USENIX and SAGE is organizing annual and ad hoc UNIX and UNIX administration-related conferences. The big event for system administrators is the general conference LISA, organized every year during the fall or winter. For example, LISA ’02 is scheduled for November 2002 in Philadelphia, PA. LISA stands for Large Installation System Administration.

    LISA is more than just an exchange of technical topics. It is also the place where many system administration issues are generated, including ones essential for the sysadmin community. For example, the initial idea for an independent SAGE was born there; the preceding overview reflects the state of those discussions as of LISA 2000.

There are no explicit standards regarding UNIX administration; there are no standards regarding system administration generally. Nevertheless, administrators are obliged to follow a strict set of rules to make the system function properly. These rules were, and are, determined by the OS designers. Although they are not official standards, they have an even stronger impact on system administration; otherwise a system will not work at all. The problem, at least in the case of UNIX administration, is that different administrative rules exist for different UNIX flavors. This makes our lives more difficult, and any standardization in this area would be well received by administrators.

In the UNIX and network arena there are significant efforts toward standardization. There are several standards bodies, both formal and informal. Each body has different rules for membership, voting, and clout. From a system administration standpoint, two significant bodies are the IETF (Internet Engineering Task Force) and POSIX (Portable Operating System Interface). POSIX in particular has contributed a great deal to UNIX standardization, also laying the groundwork for its uniform and more standardized administration.

POSIX — The POSIX standardization effort used to be run by the POSIX standards committee. During a major overhaul of the names and numbers used to refer to this project, the IEEE Portable Application Standards Committee (PASC) came into being, so currently the POSIX standards are written and maintained by PASC.

POSIX is the term for a suite of application program interface (API) standards that provide for the portability of source-code applications where operating system services are required. POSIX is based on the UNIX operating system (UNIX is a registered trademark administered by the Open Group) and is the basis for the Single UNIX Specification (SUS) from the Open Group. Although it is essentially based on UNIX (and the kernel’s services), it covers much more than just UNIX (Windows NT can be made POSIX compliant).

POSIX is intended to be one part of the suite of standards (a “profile”) that a user might require of a complete and coherent open system.
This concept is developed in IEEE Std. 1003.0–1994: Guide to the POSIX Open System Environment.
The joint revision to POSIX and the Single UNIX Specification, involving the IEEE PASC committee, ISO Working Group WG15, and the Open Group (informally known as the Austin Group), is underway. More information, including draft specifications, can be found at the Austin Group Web site.

The PASC continues to develop the POSIX standards. In accordance with a synchronization plan adopted by the IEEE and ISO/IEC JTC1, many of the POSIX standards become international standards shortly after their adoption by the IEEE. Therefore, these standards
are available in printed form from both IEEE and ISO, as well as from many national standards organizations. Approved standards can also be purchased from the IEEE in electronic (PDF) format. The IEEE also publishes Standards Interpretations for many of the standards (more details are available at IEEE Web site).

Cooperation among IEEE, the Open Group (X/Open), and ISO is now underway for the common UNIX/POSIX standard. Everybody can participate in the process (see the Austin Group Web site). A revision of the whole suite of UNIX and POSIX standards is going on. The plan is to make just one document, based on the UNIX 98 Single UNIX Specification, and the same document will serve as the standard in all three of the participating organizations. It is not clear, though, whether the name on the standard will be UNIX or POSIX.

POSIX System Interface standards cover those functions that are needed for applications software portability in general purpose, real time, and other applications environments. Many of the extensions and options within the POSIX system interface standards reflect the ongoing focus on more demanding applications domains such as embedded real time, etc. Interfaces that require administration privileges, or that create security risks are not included. The POSIX work consists of:
  • System interface specifications for C, Ada, and Fortran
  • Shell and utility specification
  • System administration specifications for software installation, user administration, and print management
  • Test methods: general methods, for system interfaces, and for shell and utilities
  • Profiles documents: guide to POSIX-based profiles (concepts); supercomputing application environment, real-time application environment, multiprocessing environment, and general purpose or “traditional” environment
The POSIX shell and utility standards define tools that are available for invocation by applications software, or by a user from a keyboard. The system administration interfaces are targeted at areas where consistency of interfaces between systems is important to simplify operations for both users and systems operators. The POSIX test methods describe how to define test methods for interfaces such as those in the POSIX suite of standards. The explicit test methods for the system interface and shell and utilities standards apply the approach defined in the overview to these specific documents.
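The practical value of the shell and utility standard is that a script restricted to POSIX-specified syntax and utility behavior should act identically on any conforming UNIX flavor. A small sketch of that discipline (the task itself, counting regular files, is just an illustration):

```shell
#!/bin/sh
# Sketch restricted to POSIX-specified shell syntax and utility behavior,
# so it should behave identically across conforming UNIX flavors.
# Counts the regular files in a directory given as $1 (default: .).
dir=${1:-.}

count=0
for entry in "$dir"/*; do
    if [ -f "$entry" ]; then        # POSIX test(1) primary
        count=$((count + 1))        # POSIX arithmetic expansion
    fi
done
printf '%d regular files in %s\n' "$count" "$dir"
```

Note the use of printf rather than echo: the behavior of echo with options and escapes is one of the classic portability traps the standard resolves.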

Despite many promises, wishes, advertisements, and attempts to standardize UNIX, the differences among existing UNIX flavors are not negligible. The differences exist in UNIX implementations, but the main differences are seen in UNIX administration.

The UNIX Model — Selected Topics

UNIX administration is a complex job that requires certain skills to accomplish successfully. These skills range from:
  • a basic knowledge of computer hardware
  • operating systems, and
  • programming techniques
all the way up to:
  • ethics
  • psychology, and
  • social behavior
It presupposes a responsible approach to very challenging problems and a readiness for non-stop follow-up on everything done. An administrator usually covers many different systems
  • different hardware
  • different configurations
  • different software
  • different purposes
and each of those systems is a “baby” that requires a certain amount of attention, which the administrator must pay.

Of course, the level of the required skills varies; it would be wrong to expect that a UNIX administrator (especially a successful one) must hold a degree in each of the listed fields to be able to respond to all administrative demands. However, it is true that some of the required skills demand more than just basic knowledge:
  • mostly these are strictly UNIX-related skills. Nobody can fight with UNIX administrative challenges without being familiar with the UNIX operating system, the UNIX commands and how to use them. An even deeper expertise in UNIX internals could be very instrumental in an easier UNIX administration.
  • Script programming is another fighting arena. An average UNIX administration time consists of 75 to 80% of shell programming, and only the rest is a manual administration from the keyboard.
Here I am simply trying to highlight the background needed for comprehensive UNIX administration. Another purpose is to present in one place most of the relevant UNIX fundamentals needed for a better understanding of different administrative tasks. The terminology used is common in the UNIX community.

In UNIX everything is a file, or rather, file-like — this makes file issues central to UNIX. What does this really mean?
  • A file is a collection of data, or better, a sequence of bytes, stored in memory or on disk. A file can be a program that can be executed; when such a program runs, it creates a process. Therefore, a file lies at the origin of every process.
  • On UNIX each device is also described by a file — these are called special device files, but are still file-like entities.
  • Even users on UNIX are file related, as they have associated attributes (such as what they are allowed access to) that are specified in a file-like way.
UNIX has a hierarchical tree-structured directory organization known collectively as the filesystem.
  • The base of this tree is the root directory with the special name “/” (the slash character).
  • In UNIX all user-available disk space is integrated into a single directory tree under /, so the physical disk unit (the disk drive itself) where a file resides is not a part of the UNIX file specification.
We already mentioned that a file is a sequence of bytes. Such a sequence could be a newly created user’s program, written text, acquired data, or a program that is a part of the operating system itself. Many files are understandable by users, but a number of files (mostly binary executable files) are machine-interpretable only. All files, no matter what their purpose, must be stored somewhere and uniquely identified within the system.
  • A disk is the most common medium to store files, and
  • files are identified by inodes within the accessible disk space (on POSIX-compliant platforms a file's inode number can be displayed with the ls -i command, a file can be located by inode number through find with the option -inum, and the ls -l command retrieves the inode information, i.e. the file information).
  • The kernel handles information about inodes and maintains and updates the corresponding inode table (the inode table is laid out when a filesystem is created and its size and location do not change).
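These inode lookups can be tried directly; below is a minimal POSIX-shell sketch (the temporary directory and file names are illustrative):

```shell
# Create a scratch file in a temporary directory
dir=$(mktemp -d)
touch "$dir/sample"

# ls -i prints the inode number in front of the file name
inum=$(ls -i "$dir/sample" | awk '{print $1}')

# find -inum locates the file again by that inode number
found=$(find "$dir" -inum "$inum")
echo "inode $inum belongs to $found"
```

The inode number, not the name, is the file's real identity within a filesystem; this is why hard links to the same file share one inode.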
UNIX file access is restricted and determined by:
  • file ownership and
  • the protection settings on the file itself.
Each file is owned by a user and a group; correspondingly, the file’s access rights are explicitly specified for the user-owner and the group-owner, as well as for others (those who do not belong to either owner).

File Ownership
Files have two owners: a user and a group, and the two are decoupled and independent. The file’s user-owner may even be outside the group that owns the very same file. Such flexibility gives UNIX the full ability to exclude certain members of the user-owner’s group and treat them as others.

Information about a file’s ownership and permissions is kept in the file’s index node (inode). UNIX does not allow direct managing of index nodes; indirect management is provided through a certain number of commands that handle specific segments of the index nodes. A brief overview of the most common of these commands follows.

The long form of the ls command is used to display the ownership of a file or a directory, with a slightly different meaning of options for System V and BSD UNIX:

# ls -l       (System V)
# ls -lg      (BSD)

The system response looks like:

drwx------         2 bjl mail         24 Mar 24 13:19 Mail
-rw-rw-rw-         1 bjl users        20 May 2 13:26 modefile1

The file ownerships are presented in the third column (the user-owner) and the fourth column (the group-owner). In this example, the file modefile1 is owned by the user bjl and the group users.

Ownership of a newly created file is determined in the following way:
  • The user-owner is the user who has created the file
  • The group-owner is:
    • Same as the group-owner for the directory where the new file was created (for BSD)
    • Same as the group to which the user who created the file belongs (for System V)

Please note that this rule only applies to newly created files; once a file is created, its ownership can be arbitrarily modified. The chown command is used to change the user ownership of a file or a directory:

# chown newowner filename(s)

  • newowner A user name, or user-ID (UID)
  • filename A file name in the current directory, or a full-path file name (if multiple files are specified, they are separated by a space)
Directories are treated in the same way as files; to change the user ownership of a directory itself, type the command:

# chown newowner directoryname(s)

  • newowner A user name, or user-ID (UID)
  • directoryname A subdirectory name in the current directory, or a full-path directory name (if multiple directories are specified, they are separated by a space).
However, to change the user ownership of a directory and all subdirectories and files within it, the chown command should be used recursively (the option -R):

# chown -R newowner directoryname(s)

(The command arguments are the same as those in the previous example.)

Who is authorized to change the user ownership?
  • user-owner of the file, or root (System V)
  • root only (BSD)
Please note that on the System V platform, if the original user-owner transfers user ownership to another user, only the new owner of the file, or root, can transfer it back to the original user-owner.
Also, such a change of ownership is restricted: some access rights cannot be transferred to the new user.
Generally, each recursive command must be executed extremely carefully; the command does not stay within the specified directory, but propagates into all existing subdirectories, the files in those subdirectories, subsequent subdirectories, and so on, to the very end of the directory hierarchy (which can be very, very deep). If launched in the root directory, a recursive command affects every single file in the system.
Consider an unpleasant real-life event: an administrator wanted to recursively change the owner of a certain directory (doing so, of course, as the superuser). The administrator typed the command and started to specify the full pathname of the directory; unfortunately, the [Enter] key was hit unintentionally too early, just after the leading “/” (slash character) of the directory path had been typed. The disastrous command:

# chown -R newuser /

was issued, causing recursive ownership changes on many system files, and soon a collapse of the system. The only solution was to reinstall the system and restore it from a backup (if such a backup was available at all).
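One defensive habit against exactly this kind of accident is to preview what a recursive command will visit before running it; a minimal sketch (directory names illustrative):

```shell
# Build a tiny tree, preview the recursion target with find,
# then operate only after the preview looks right
dir=$(mktemp -d)
mkdir -p "$dir/project/src"
touch "$dir/project/README" "$dir/project/src/main.c"

target="$dir/project"            # double-check this path before any -R!
count=$(find "$target" | wc -l)  # everything a -R command would visit
echo "a recursive command would touch $count entries"

# only now run the recursive operation (a harmless chmod here)
chmod -R u+rw "$target"
```

The preview costs one extra command, and a truncated or mistyped path shows up immediately in the entry count.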

The chgrp command is used to change the group ownership of a file or a directory:

# chgrp newgroup filename(s)/directoryname(s)
# chgrp -R newgroup directoryname(s)

  • newgroup A group name, or a group-ID (GID)
  • filename A file name in the current directory, or a full-path file name
  • directoryname A subdirectory name in the current directory, or a full-path directory name (multiple names are separated by a space)
As shown above, to change the group ownership of a directory and all subdirectories and files within it, the chgrp command is used recursively (the option -R).
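Only root may hand a file to an arbitrary group, but a file's owner may always set the group to one the owner belongs to; that restriction keeps the following sketch runnable without privileges (file name illustrative):

```shell
# Change a file's group to the caller's primary group, then verify
dir=$(mktemp -d)
touch "$dir/report"

mygroup=$(id -gn)              # a group the caller certainly belongs to
chgrp "$mygroup" "$dir/report"

# the group-owner appears in the fourth column of ls -l
group_shown=$(ls -l "$dir/report" | awk '{print $4}')
echo "group-owner: $group_shown"
```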

Who is authorized to change the group ownership?
  • user-owner of the file, or root
Originally, the BSD UNIX allowed simultaneous changes of the file’s user and group ownership, using the same chown command in the following way:

# chown newowner.newgroup filename(s)
# chown -R newowner.newgroup directoryname

  • newowner A user name, or an UID
  • newgroup A group name, or a GID
  • filename A file name in the current directory or a full-path file name
  • directoryname A subdirectory name in the current directory, or a full-path directory name
Today, most modern UNIX flavors (whether BSD- or System V-derived) accept this useful idea and allow the same simultaneous change, with slightly different syntax:

# chown newowner:newgroup filename(s)
# chown -R newowner:newgroup directoryname

Instead of a dot (.) that was originally used as a separator between the new user and group name, now the colon (:) is introduced.
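A non-root user cannot give a file away, so for a harmless demonstration the combined form below simply restates the caller's own ownership (user and group names are taken from id; file name illustrative):

```shell
# chown user:group in a single step; both values are the caller's own,
# so no superuser privilege is needed
dir=$(mktemp -d)
touch "$dir/notes"

chown "$(id -un):$(id -gn)" "$dir/notes"

owner_shown=$(ls -l "$dir/notes" | awk '{print $3}')
group_shown=$(ls -l "$dir/notes" | awk '{print $4}')
echo "$owner_shown:$group_shown"
```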

File Protection/File Access (File mode)
First, let us introduce the terminology we will use to identify access rights to a certain file. We will use three different terms that are related to the very same issue:
  • file protection
  • file access and
  • file permissions
These three terms are mutually related, and their use primarily depends upon the angle from which we are viewing the issue:
  • file access and file permissions are directly proportional, and we often use the composite term access permissions (more file permissions permit wider access to the file),
  • file access and file protection are inversely proportional (a higher file protection requires more restricted file access).
Finally, they are all known as the file mode.

Every file has a set of permission bits that determine
  • who has access to the file, and
  • the type of access they have.
UNIX supports three types of file access:
  • Read (r) Permission to view the contents of a file, or to list a directory
  • Write (w) Permission to modify a file, or to create and delete files within a directory
  • Execute (x) Permission to run a file as a program, or to search a directory

The following table lists the permissions required to perform some of the most common UNIX commands.

Access Classes

UNIX defines three basic classes of access to files, for which permissions can be specified separately:
  • User access (u) Access granted to the user-owner of the file
  • Group access (g) Access granted to members of the group that owns the file
  • Other access (o) Access granted to everyone else (except root)
  • All classes (a) Access granted to everyone (includes all three classes)
The access classes independently specify file modes for different categories (classes) of users. The long format (the “-l” option) of the ls command is used to display the file mode — see the previous example. The first column in the listing, a set of letters and hyphens, represents a file mode; the file mode includes three triplets for the three access classes u, g, and o. This is illustrated in the following table:

Setting a File Protection

We have already introduced several terms that refer to file protection; UNIX simply refers to file protection as the file mode. In UNIX parlance, to set file permissions means to change a file mode; for that purpose, the UNIX chmod command is used:

# chmod access-string filename(s)

  • access-string Includes:
    • Access class: u, g, o, or a
    • Operator: +, -, or =
    • Permissions: r, w, or x
  • filename A file name in the current directory, or the full-path file name (multiple files are separated by a space).
Multiple access classes and/or permissions can also be specified simultaneously. The recursive chmod command is also supported, for example:

# chmod -R go-rwx /home/username

This command will change the file mode of all files and subdirectories beneath the directory /home/username. It will deny any kind of access for group and other, and the user access will remain unchanged.

This example specifies the file mode using what is called symbolic mode notation. Alternatively, the absolute, or numeric, mode notation can be used. In numeric notation, each triplet is the octal sum of its granted permissions (r = 4, w = 2, x = 1); for example, the file mode rwxr-xr-- corresponds to the numeric value 754. The command to set this particular file mode is:

# chmod 754 filename
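Both notations set the same permission bits, which is easy to verify (directory and file names illustrative):

```shell
# The same mode set twice: once numerically, once symbolically
dir=$(mktemp -d)
touch "$dir/a" "$dir/b"

chmod 754 "$dir/a"                # absolute (numeric) notation
chmod u=rwx,g=rx,o=r "$dir/b"     # equivalent symbolic notation

# compare the mode strings from the long listing
mode_a=$(ls -l "$dir/a" | cut -c1-10)
mode_b=$(ls -l "$dir/b" | cut -c1-10)
echo "$mode_a $mode_b"
```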

Access rights for a given user are strictly determined by the individual permissions within the relevant class. UNIX first determines where the user belongs: is this the user-owner, a member of the owning group, or any other user? Once that is done, only the corresponding access class of the file is checked, and the needed access is granted or denied accordingly. There is no gradual top-down check through the access classes when a user belongs to multiple classes (a user-owner could also be a member of the owning group, and certainly belongs to others). Here is an example:

The user is bjl; the long listing for the text file textfile is:

$ ls -l textfile
-rw-r--r-- 1 bjl users 15 Jul 6 20:49 textfile

With the following content:

$ cat textfile
# This is just a test file

Let us deny read access to the user-owner bjl:

$ chmod u-r textfile

$ ls -l textfile
--w-r--r-- 1 bjl users 15 Jul 6 20:49 textfile

And try to read the file again:

$ cat textfile
cat: textfile: Permission denied

However, the file can be modified:

$ echo "# This is added text" >> textfile
$ echo "#" textfile

Despite the fact that user bjl is the owner of the file textfile and a member of the group users, and that read permission is granted to the group users and to all others, the file cannot be opened for reading by its owner. The file’s owner, user bjl, can modify or delete the file (the w permission is there), but cannot read it. To overcome this “unusual situation,” the owner has to change the file mode and make the file readable again.

$ chmod 644 textfile
$ ls -l textfile
-rw-r--r-- 1 bjl users 15 Jul 6 20:49 textfile
$ cat textfile
# This is just a test file
# This is added text

The same is valid for a group-owner toward group permissions.

Default File Mode
The default file mode determines the permissions for newly created files. Once a file is created, its mode can be changed as desired. UNIX is quite flexible regarding the default file mode: there is a built-in system setting, as well as the possibility of a program setting. First of all, the usual system default file modes for directories and files are different:
  • For a directory rwxrwxrwx, i.e., all permissions are granted
  • For a file rw-rw-rw-, i.e., the execute permissions are initially denied
However, do not be surprised if some specific UNIX flavors or even UNIX releases behave differently.

The program setting of the default file mode is always adjusted relative to the system setting, and a specified permission can only be denied (never granted); in other words, only a more restrictive default file mode can be created dynamically. Note that this applies to the default file mode only; the chmod command, as well as renaming and copying files, is not restricted in this way.

The umask command is used for this purpose. Once the command executes, all files subsequently created in that environment are automatically set according to the new default file mode. The umask command itself uses numeric notation to specify the default file mode, but in a slightly different way than the chmod command.
The umask command sets the permissions to be inhibited (masked out) when a file is created; in other words, it denies permissions.
The numeric notation used should be the octal complement of the numeric notation of the desired file mode. Old UNIX releases supported only the numeric notation; modern UNIX flavors also allow the symbolic notation. It is highly recommended to stay familiar with the numeric notation (it always works, everywhere).

For example, to have a default file mode same as the file mode “754” in the previous example:
  • 777 All access granted
  • - 754 Desired access granted
  • 023 Masked out access for default mode
The corresponding command is:

$ umask 023
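The effect is easy to verify in a subshell, so the caller's own mask stays untouched; with umask 023 a new file gets 666 minus the mask, i.e. 644, and a new directory gets 777 minus the mask, i.e. 754 (names illustrative):

```shell
# Apply umask 023 in a subshell and create a file and a directory there
dir=$(mktemp -d)
(
  umask 023
  touch "$dir/newfile"    # system default 666 masked to 644 (rw-r--r--)
  mkdir "$dir/newdir"     # system default 777 masked to 754 (rwxr-xr--)
)

fmode=$(ls -l "$dir/newfile" | cut -c1-10)
dmode=$(ls -ld "$dir/newdir" | cut -c1-10)
echo "$fmode $dmode"
```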

Additional Access Modes
We have discussed common file permissions, which are quite self-explanatory (read and write are obvious) and relatively easy to use.
Some confusion is possible with respect to the execute (x) permission on a directory, but once we accept execution as the condition to search the directory (i.e., to cd into it), everything seems reasonable; that is why it is also known as the execute/search permission.
However, the three file permissions (r, w, and x) are far from sufficient to cover all file-permission needs in UNIX, and consequently UNIX supports additional access modes, listed below:
  • SUID (set user ID) bit
  • SGID (set group ID) bit
  • Sticky bit

When using the ls -l command, SUID and SGID access bits are displayed in the position of “x access” for the corresponding access class (SUID in the user class, SGID in the group class); the sticky bit is displayed in the position of x access for the class “others.”

SUID and SGID are extremely important and are very sensitive issues from the system security standpoint. Normally, when an executable file (a program) is invoked, and the corresponding process created, the access rights to system resources of such a process (known as a process’s effective IDs: EUID and EGID) are related to the user and group who started the program execution (known as the process’s real IDs: RUID and RGID). However, if SUID or SGID access is set on an executable file, access to system resources is based upon the file’s user or group owner rather than on the real user who started the program execution. This means, for example, for an executable file owned by the root, regardless of who has started its execution, the program will be executed in the same way as if the superuser had invoked it. (We will discuss this issue in more detail later by addressing process attributes.)

SUID and SGID, as well as a sticky bit, are supposed to be implemented primarily on executable files; however, they could be implemented on any file, as well as on a directory. In such a case, they have different meanings. Here is a summary:

The aforementioned chmod command is used to set the additional file modes. Both symbolic and absolute (numeric) notations are supported; however, on some UNIX platforms only the symbolic mode notation can be used to clear an SGID bit on a directory. The symbolic notation uses the letters s (set-ID bits) and t (sticky bit), together with a corresponding access class, to set or clear the additional access bits:

# chmod u+s filename     (Set SUID on filename)
# chmod g+s filename     (Set SGID on filename)
# chmod +t filename      (Set the sticky bit on filename)

Conversely, the minus sign (-) is used to clear the additional access bits.

An additional, fourth triplet was introduced for the numeric notation; it corresponds to SUID | SGID | sticky (numerically 4, 2, and 1, respectively) and is presented like any other triplet. This additional triplet is the leading one, positioned in front of the other three, and is identified by the leading digit in the 4-digit numeric notation. The 3-digit numeric notation is still valid; UNIX simply assumes 0 for the additional access bits (there is no need for a leading zero). The following example should make this clear; it presents the procedure for changing a file mode.
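The leading digit is formed just like the other triplets; a brief sketch setting SUID (4) on a file and the sticky bit (1) on a directory (file and directory names illustrative):

```shell
# Leading digit of the 4-digit notation: SUID = 4, SGID = 2, sticky = 1
dir=$(mktemp -d)
touch "$dir/prog"
mkdir "$dir/drop"

chmod 4755 "$dir/prog"    # 4755 = SUID + rwxr-xr-x
chmod 1777 "$dir/drop"    # 1777 = sticky + rwxrwxrwx

suid_mode=$(ls -l "$dir/prog" | cut -c1-10)
sticky_mode=$(ls -ld "$dir/drop" | cut -c1-10)
echo "$suid_mode $sticky_mode"
```

Note where the extra bits appear in the listing: the SUID bit as s in the user triplet, and the sticky bit as t in the others triplet.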

The login user is bjl; the current long listing of an arbitrary directory shows:

$ ls -l
drwx------ 2 bjl mail  24 Mar 24 13:19 Mail
-rw-rw-rw- 1 bjl users 20 May 2 13:26  modefile1
-rw-rw-rw- 1 bjl users  20 May 2 13:30 modefile2
-rw-rw-rw- 1 bjl users  20 May 2 13:30 modefile3
-rw-rw-rw- 1 bjl users 322 May 2 13:31 ses1.tmp

The user wants to change the file mode for certain files (the symbolic notation is implemented):

$ chmod u+x modefile1
$ chmod g−w+x modefile2 modefile3
$ ls -l
drwx------ 2 bjl mail    24 Mar 24 13:19 Mail
-rwxrw-rw- 1 bjl users   20 May 2 13:26  modefile1
-rw-r-xrw- 1 bjl users   20 May 2 13:30  modefile2
-rw-r-xrw- 1 bjl users   20 May 2 13:30  modefile3
-rw-rw-rw- 1 bjl users 322 May 2 13:31   ses1.tmp

The required changes in file modes are shown in the new long listing of the directory.
Now let us set SUID and SGID on certain files:

$ chmod u+s modefile1
$ chmod g+s modefile2
$ ls -l
drwx------  2 bjl mail   24 Mar 24 13:19 Mail
-rwsrw-rw- 1 bjl users   20 May 2 13:26  modefile1
-rw-r-srw- 1 bjl users   20 May 2 13:30  modefile2
-rw-r-xrw-  1 bjl users  20 May 2 13:30  modefile3
-rw-rw-rw- 1 bjl users  322 May 2 13:31  ses1.tmp

Pay attention to the displayed position of SUID and SGID bits (they overwrite x permission).
Finally, let us return to the initial file modes:

$ chmod 666 modefile1 modefile2 modefile3
$ ls -l
drwx------ 2 bjl mail    24 Mar 24 13:19 Mail
-rw-rw-rw- 1 bjl users   20 May 2 13:26  modefile1
-rw-rw-rw- 1 bjl users   20 May 2 13:30  modefile2
-rw-rw-rw- 1 bjl users   20 May 2 13:30  modefile3
-rw-rw-rw- 1 bjl users  322 May 2 13:39  ses1.tmp

Note that SUID and SGID were also cleared; in this case (the HP-UX flavor), the numeric notation clears them as well.

On the System V platform, a user-owner can change the file’s ownership. Practically, this means that a user-owner can give the file to another user, also transferring owner access rights to the new owner. If the SUID or SGID bit is set on the file, such a change of file ownership could pose a security problem.
It would be very easy to create a particularly nasty scenario affecting the new owner. Just imagine a simple script that purges the home directory of its owner and can be triggered by anybody (there is x permission for others). Once the script’s ownership is transferred, and supposing the SUID bit remained set, whoever starts the script would appear as the new owner; the targeted home directory would really be purged (very unpleasant!).
Obviously System V UNIX has to protect itself from such unwelcome surprises. Let us see how in the next example: Three test files are created by the user bjl: testfile1, testfile2, and testfile3.

$ ls -l
-rw-r-----    1   bjl  users  0    May 27 15:07  testfile1
-rw-r-----    1   bjl  users  0    May 27 15:07  testfile2
-rw-r-----    1   bjl  users  0    May 27 15:07  testfile3

The SUID and SGID are set by the user-owner (numeric notation is used):

 $ chmod 4777 testfile1
 $ chmod 2777 testfile2
 $ chmod 4640 testfile3
 $ ls -l
-rwsrwxrwx 1 bjl users 0 May 27 15:07 testfile1
-rwxrwsrwx 1 bjl users 0 May 27 15:07 testfile2
-rwSr----- 1 bjl users 0 May 27 15:07 testfile3

The set-ID bits hide the existing x access bits in the corresponding access classes. To keep the hidden bit recognizable, the lowercase letter “s” is displayed if both the set-ID bit and the x bit are set, and the capital letter “S” is displayed if only the set-ID bit is set (note that not all UNIX flavors do this). In this example, the file testfile3 is not an executable file. (In that light, SUID on this file does not make much sense, but it is still a good illustration of the previous point.)
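The s/S display convention is easy to reproduce on flavors that follow it (e.g., Linux and Solaris); file names illustrative:

```shell
dir=$(mktemp -d)
touch "$dir/withx" "$dir/nox"

chmod 4755 "$dir/withx"   # SUID over an executable file: shown as "s"
chmod 4644 "$dir/nox"     # SUID without x permission:    shown as "S"

m1=$(ls -l "$dir/withx" | cut -c1-10)
m2=$(ls -l "$dir/nox" | cut -c1-10)
echo "$m1 $m2"
```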

The file ownership is now changed by the user-owner:

 $ chown dubey testfile1 testfile2 testfile3
 $ ls -l
-rwxrwxrwx 1 dubey users 0 May 27 15:07 testfile1
-rwxrwxrwx 1 dubey users 0 May 27 15:07 testfile2
-rw-r----- 1 dubey users 0 May 27 15:07 testfile3

What happened? We can see that the set-ID bits have not been transferred to the new owner. Simply, when the user-owner changes the ownership of files on which SUID or SGID is set, the file modes change as well: SUID and SGID are not transferable to another user; only the superuser can transfer them. (Then again, the superuser can do whatever it wants.)

Now, let us return everything to the initial state; since the user bjl does not own the files anymore, it will be done by the superuser. First switch to the superuser account:

 $ su
Password: ********
# chown bjl testfile1 testfile2 testfile3
 # su bjl
 $ chmod 640 testfile1 testfile2
 $ ls -l
-rw-r----- 1 bjl users 0 May 27 15:07             testfile1
-rw-r----- 1 bjl users 0 May 27 15:07             testfile2
-rw-r----- 1 bjl users 0 May 27 15:07             testfile3

Note that a switch to the superuser (root) account always requires the root password, while the switch from the superuser to some other user account does not. A superuser already has full control over the system, including all user accounts.

Access Control Lists (ACLs)
File access permissions originate from the early days of UNIX, and they provide enough flexibility in accessing UNIX resources (objects) to meet most daily needs. This approach was made even more flexible by introducing secondary groups as desired, and by grouping individual users on a per need basis. Nevertheless, the continual development and growth in the implementation of UNIX as a platform for different applications required an even more selective approach. Modern UNIX flavors introduced Access Control Lists (ACLs) to respond to new demands.

ACLs are a key enforcement mechanism of discretionary access control (DAC), used to specify access to files by users and groups more selectively than with traditional UNIX mechanisms. ACLs permit or deny access to a list of users, groups, or combinations thereof. ACLs are supported as a superset of the UNIX operating system DAC mechanism for files, directories, and devices.

An access control list is a set of (user.group, mode) entries associated with a file that specify permissions for all possible user-ID/group-ID combinations. An entry in an ACL specifies access rights for one user and group combination. Three bits in an ACL entry represent read, write, and execute/search permissions. These permissions coexist with the traditional mode bits associated with every file in the filesystem.

An individual ACL entry could be considered restrictive or permissive depending on the context.
  • Restrictive entries deny a user and/or group access that would otherwise be granted by less specific base or optional ACL entries.
  • Permissive entries grant a user and/or group access that would otherwise be denied by less specific base or optional ACL entries.
The right to alter ACL entries is granted to file (object) owners and to privileged users. Privileged users are the superuser and members of certain privileged groups.
For a better understanding of the relationship between ACLs and traditional file permissions, let us consider the following file and its permissions:

When a file is created, three base access control list entries are mapped from the file’s access permission bits to match the file’s owner and group and its traditional permission bits. The three base ACL entries are:
  1. Base ACL entry for the file’s owner: (uid.%, mode)
  2. Base ACL entry for the file’s group: (%.gid, mode)
  3. Base ACL entry for other users: (%.%, mode)
The basic form of an ACL entry is (user.group, mode). user and group can be represented by names or ID numbers; mode is represented by letters (r, w, and x if the corresponding access is granted, or a dash “-” if the access is denied). Two special symbols may also be used:
  1. % symbol, representing no specific user or group
  2. @ symbol, representing the current file owner or group
ACLs are superimposed on the file’s traditional permissions; however, managing ACLs does not affect the traditional file mode. There is no way to change the traditional file permissions by using ACL-specific commands (the opposite is not true because base ACL entries are synchronized with the traditional file permissions). Both the traditional UNIX command chmod and ACL-specific commands may be used to change base ACL entries. Optional ACL entries contain additional access control information, which the privileged user can set with the available ACL-specific commands to further allow or deny file access. Up to 13 additional user/group combinations may be specified. For example, the following optional ACL entries could be associated with the presented file datafile:

(mhr.admin, rwx) Grant read, write, and execute access to user mhr in group admin
(mnm.%, ---) Deny any access to user mnm in no specific group (any group)

ACL entries are unique; there can be only one (user.group, mode) entry for any pair of user and group values; one (user.%, mode) entry for a given value of user; one (%.group, mode) entry for a given value of group; and one (%.%, mode) entry for each file.

There are several UNIX commands to manage ACLs, and they are all UNIX-flavor specific. Although they all have essentially the same mission, they have different command names. We will focus on Solaris-specific ACL commands.

The getfacl command is available on Solaris to display discretionary file information:

getfacl [-ad] filename(s)

  • option -a Display the filename, owner, group, and file’s ACL
  • option -d Display the filename, owner, group, and default file’s ACL (if it exists)
  • no option Display the filename, owner, group, file’s ACL, and default file’s ACL (if it exists)
  • filename The filename in the current directory, or full-path filename. (multiple filenames are separated by a space; a blank line separates displayed ACLs)
A few examples (the selected file is /etc/vfstab):

$ getfacl /etc/vfstab
# file: /etc/vfstab            # The first three lines specify the filename, the
                                 user-owner, and the group-owner; they start with
                                 the pound sign (“#”).
# owner: root
# group: other
user::r--                      # Permissions for the user-owner (the second field
                                 is empty).
group::r--    #effective:r--   # Permissions for the group-owner (the second field
                                 is empty).
mask:r--                       # Maximum permissions allowed to any user except the
                                 user-owner, and to any group (including the
                                 group-owner); these restrict the permissions
                                 specified in other entries.
other:r--                      # Permissions granted to others.

In order to indicate when the group class permission bits restrict an ACL entry, an additional string “#effective:” specifies the actual permissions granted in the same line of the restricted entry; the string is separated by a tab character.

$ cd /etc
$ getfacl vfstab
# file: vfstab                # This is the same command as in the previous example,
                              except that the relative filename was specified.
# owner: root
# group: other
group::r--    #effective: r--

$ getfacl -a vfstab
# file: vfstab                # For this file, the “option –a” and “no options” display
                              the same output because there is no default ACL.
# owner: root
# group: other
group::r--    #effective: r--

$ getfacl -d vfstab
# file: vfstab                # Only the first three lines are displayed because there
                              is no default ACL.
# owner: root
# group: other

The Solaris setfacl command is available to modify an ACL for a file or files. Two forms of the command may be used:

setfacl [-r] [-s | -m | -d ] acl_entries filename(s)
setfacl [-r] [-f] acl_file filename(s)

  • option -r Recalculates the permissions for the file’s group class entry (known as the mask entry). These permissions are ignored and replaced by the maximum permissions needed for the file group class, to grant access to any additional user, owning group, and additional group entries in the ACL. The permissions for these entities remain unchanged.
  • option -s Sets the ACL to the entries specified on the command line; all old ACL entries are removed and replaced with the newly specified ACL.
  • option -m Adds one or more new ACL entries, and/or modifies one or more existing ACL entries; when modified, the specified permissions will replace the current permissions.
  • option -d Deletes one or more ACL entries; the file owner, owning group, and others may not be deleted. Deleting an ACL entry does not necessarily have the same effect as removing all permissions from the entry by modifying the entry itself (an ACL entry superimposes on traditional file permissions).
  • option -f Sets the ACL to the entries contained within the file named acl_file on the command line (see acl_file); the same constraints on the specified entries hold as with the -s option.
  • acl_entries One or more comma-separated ACL entries of the following formats (not all entries are applicable to all options):

    • u[ser]::operm | perm
    • u[ser]:uid:operm | perm
    • g[roup]::operm | perm
    • g[roup]:gid:operm | perm
    • m[ask]:operm | perm
    • d[efault]:u[ser]::operm | perm
    • d[efault]:u[ser]:uid:operm | perm
    • d[efault]:g[roup]::operm | perm
    • d[efault]:g[roup]:gid:operm | perm
    • d[efault]:m[ask]:operm | perm
    • d[efault]:o[ther]:operm | perm
    Where perm is a permissions string composed of the letters r (read), w (write), and x (execute); the dash (-) may be specified as a placeholder. operm is an octal representation of the same permissions: 7 stands for all permissions (rwx), 0 for no permissions (---).

  • uid A login name or user ID; empty for the file owner entry
  • gid A group name or group ID; empty for the owning group entry
  • acl_file The file that contains the ACL entries, one entry per line. Comments are permitted and start with the pound sign (#). Such a file can be created as the output of the getfacl command.
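The perm/operm correspondence above mirrors the traditional UNIX permission bits. A minimal sanity check of that correspondence using the plain chmod command (not Solaris setfacl; the file name demo is arbitrary):

```shell
cd "$(mktemp -d)"
touch demo

# operm 640 = rw-r----- : owner rw, group r, others none
chmod 640 demo
ls -l demo

# Extract the nine permission letters from the long listing:
mode=$(ls -l demo | awk '{print substr($1, 2, 9)}')
echo "$mode"
```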

File Types

We mentioned earlier that in UNIX everything appears like a file, but there are still differences in the file content and the way the file is managed and processed. These differences result in different kinds of files, or in UNIX terminology, different file types. The type of a file determines how the file will be handled. The long listing of the ls -l command also displays the file type; a leading single letter, or hyphen, in the leftmost position of the first column in the listing that presents the file mode, identifies a file type. The file type is identified in the following way:
- Plain (regular) file
d Directory
c Character special file
b Block special file
l Symbolic link
s Socket
p Named pipe

Plain (Regular) File

A plain file is just a sequence of bytes: a data file, an ASCII file, a binary data file, executable binary program, etc. In most cases when we talk about files, we are thinking of plain files.
They are identified by the hyphen (-) in the long listing of a directory they reside in.


Directory

A directory is a binary file containing a list of the files that reside within it (including any subdirectories). Entries are filename-inode pairs. In UNIX each file is identified by an inode (the official name is index node). For simplicity, we will assume that an inode fully specifies the file, and that by knowing the inode, UNIX actually knows everything about the file itself (ownership, mode, type, other properties, contents, location on the disk) except its name. The directory relates the filename to the file itself; the filename-inode pairs that make up the contents of a directory actually establish this relationship. Although it might seem odd to a beginner, UNIX can find a filename only in the corresponding directory. If a directory is corrupted, all of its filenames can easily be lost, while the corresponding files remain unchanged and unnamed.

The special entries “.” and “..” (single and double dots) refer to the directory itself and its parent directory, respectively. A directory in its long listing is identified with the letter d.
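The filename-inode relationship is easy to observe with ls -i. A small sketch (the directory names are arbitrary) showing that a subdirectory's name in its parent and the "." entry inside it resolve to the same inode:

```shell
cd "$(mktemp -d)"           # scratch area
mkdir subdir
touch subdir/file

# "subdir" in the parent and "." inside subdir name the same inode:
parent_view=$(ls -di subdir | awk '{print $1}')
inside_view=$(cd subdir && ls -di . | awk '{print $1}')
echo "$parent_view $inside_view"
```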

Special Device File
A special device file is used to describe the attached I/O device. UNIX accesses devices via their special files. In UNIX, device drivers themselves (software interfaces that control the devices) are part of the kernel, and can be accessed by using certain system calls (UNIX internals). A special device file is a kind of pointer to the corresponding device driver within the kernel; it is a very simple file that contains two pointers: major and minor numbers. The major number points to the device class, while the minor number points to the individual device within the class.

All special device files reside in the directory /dev (and its subdirectories on System V).
There are two groups of special device files: block device files and character device files.
  • Block Device File I/O operations are provided through a group of buffers; the system maintains a buffer pool for all block devices. The block device is accessed in fixed-size blocks. Physically, the high-speed data transfer is realized using a DMA mechanism (direct memory access data transfer). The letter b in the long listing of a directory identifies the block device files. The following disk-related block device files are examples of block device files: /dev/disk0a or /dev/dsk/c1d1s5.
  • Character Device File Nonbuffered I/O operations are provided via a character or raw device. Physically, the data transfer is performed through a registered data exchange between the device and its controller. Character devices include all devices that do not fit the block I/O transfer model. The letter c in the long listing of a directory identifies the character device files. The following disk-related raw device files are examples of character special files: /dev/rdisk0a or /dev/rdsk/c1d1s5.

Link

A link is a mechanism that allows multiple filenames to refer to a single file on a disk, i.e., a single inode. There are two kinds of links: hard links and symbolic links.

Hard Link
A hard link associates two or more filenames with an inode; each inode keeps a count of the number of linked filenames. Only when all filenames are deleted will the file itself also be deleted, and the corresponding inode released and returned as free for new file assignments. Strictly speaking, a hard link is not a separate file type; each hard link represents an already existing file under an additional filename. The only way to identify mutually hard-linked filenames is to list a directory or directories using the "ls -i" command and check for identical inode numbers. The "-i" option displays, besides each filename, the inode number of each file in the listed directory.

Hard links always remain within the same filesystem; simply, inodes cannot be shared between filesystems, and two hard links are always associated with the same inode. A hard link never creates a new file; it only attaches a new filename to the existing file. This means that a hard link only presents a new entry in a directory, a new record about a filename-inode pair. To create a hard link use the ln command:

ln myfile hardlink

This command will create a new entry in the current directory named hardlink paired with the same inode number as myfile. There are no hard links for directories; it would be too confusing and dangerous for the system.
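A quick way to verify the result is the ls -i option mentioned above; both names must report the same inode number (the file names below are arbitrary):

```shell
cd "$(mktemp -d)"
echo "some data" > myfile
ln myfile hardlink          # a new directory entry, no new file

ls -i myfile hardlink       # both names display the same inode number

ino1=$(ls -i myfile   | awk '{print $1}')
ino2=$(ls -i hardlink | awk '{print $1}')
```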

Symbolic Link
A symbolic link is a pointer file to another file elsewhere in the overall hierarchical directory tree. Creating a symbolic link also creates a new small file; this new file contains the full-path filename of the linked file. There is no restriction on the use of symbolic links; they span filesystem boundaries independently of the origin of the linked file. Symbolic links are very common (this cannot be said for hard links); they are easy to create, easy to maintain, and easy to see. The letter l in the long listing of a directory identifies them; a linked file is also displayed in a visually comprehensible way. To create a symbolic link, use the ln command with the -s option:

ln -s myfile symlink

This command creates another file named symlink in the current directory with a separate inode (since this is a completely new file) that points to the file myfile. Both types of links are presented in Figure 2.1. Let me explain it in more detail.

For an existing file named myfile, which is determined by the inode (index node) N1, both links are created.
  • The hard link hardlink is another name for the file myfile, and it corresponds to the same inode N1.
  • The symbolic link symlink represents another file determined by the inode N2; its contents point to the file myfile.
What will happen if the file myfile is deleted? Actually, only the filename “myfile” will be deleted; the file itself remains with its other name hardlink (the file content remains unchanged). The symbolic link symlink is now broken; it points nowhere (there is no more referenced file myfile).

What will happen if another file named myfile is created in the same directory? This is a brand new file, determined by the new index node N3 and unrelated to the existing file hardlink, which continues to exist as a different file. However, the file symlink is now linked with this new file; it again points to an existing myfile.
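The whole scenario can be replayed in a few shell commands (a sketch; the file names are arbitrary):

```shell
cd "$(mktemp -d)"
echo "original content" > myfile
ln myfile hardlink          # hard link: same inode as myfile
ln -s myfile symlink        # symbolic link: a new file pointing to the name

rm myfile                   # deletes only the name "myfile"
cat hardlink                # the content survives under the other name
cat symlink 2>/dev/null || echo "symlink is broken"

echo "new content" > myfile # a brand new file with a new inode
cat symlink                 # the symlink again resolves, to the new file
```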

Socket

A socket is a special type of file used for interprocess communication on a single system or between different systems; sockets enable connections between processes. There are several kinds of sockets, and most of them are involved in network communications.

UNIX domain sockets are local ones, used in local interprocess communication; they are referenced as filesystem objects. Sockets are created by the use of a special system call, “socket”, but can be treated in a similar way as other files (using the same system calls).
However, a socket can be read or written only by processes directly involved in the connection. For example, printing systems, X windowing, or error system logging use sockets. Sockets were originally developed in BSD and later included in System V. The most probable place to find sockets is the /tmp directory.

Named Pipe
A named pipe is another mechanism, originating in System V, to facilitate interprocess communication; the named pipe presents a FIFO (first-in, first-out) element in this communication. The output of one process becomes the input to another process. Named pipes are very useful when a large amount of data is involved in the interprocess communication; sometimes application, or even OS, restrictions can be bypassed by using a named pipe.

UNIX provides the command mknod pipename p to create a named pipe pipename. The same command is used to create special device files, and we will return to it later. The trailing character "p" specifies the named pipe; note that this differs slightly from the usual UNIX way of specifying command options. In the long listing of a directory, the leading letter p identifies named pipes. Again, the most probable place to find named pipes is the /tmp directory.
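On most modern systems the dedicated mkfifo command is the usual way to create a named pipe; a short sketch (the pipe name is arbitrary) showing two processes communicating through it:

```shell
cd "$(mktemp -d)"
mkfifo mypipe               # equivalent of: mknod mypipe p
ls -l mypipe                # the leading letter "p" marks the named pipe

# Writer and reader rendezvous through the FIFO:
echo "hello through the pipe" > mypipe &
result=$(cat mypipe)
wait
echo "$result"
```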

Whatever its type, a file must reside on a mounted filesystem before it can be accessed. Mounting is a special UNIX process of bringing online a storage device (primarily a disk) that keeps the files, making the files accessible and their contents readable. Only files on mounted filesystems are visible and can be searched, found, and processed.

All listed file types have different natures. They are created with file-type specific UNIX commands, but other UNIX commands are mostly applicable on all file types. The output of the same UNIX command can be different depending on the file types, but the command itself would work. For example, the command:

# cat filename

will display the contents of the file filename. But if filename is a symbolic link, the command will display the contents of the linked file. The common bond between all file types is the relationship of the file ownership and the file mode. This relationship is fundamental to all UNIX platforms, and this is one of the main issues that make UNIX so reliable and flexible in the constantly changing environment.

Devices and Special Device Files
A device is a dedicated piece of hardware that provides a particular function within the computer system. A device itself can be located internally or externally. Regardless of the location, devices are treated equally within their classes.

A device driver is a program that manages the system’s interaction with a particular device; it presents a needed interface to translate between the hardware commands understood by the device, and the kernel. Such a system structure keeps UNIX reasonably hardware-independent. Device drivers are parts of the kernel; they are not user processes. However, they can be accessed both from within the kernel and from the user space. User-level access is provided through special device files. The kernel transforms operations on these special files into calls to the driver code.
Special device files are also called device special files. Whichever name is used, these files really are special and different from regular files; their mission in the UNIX paradigm is special. We will use both names interchangeably, or even simply special files.
Special device files are mapped to devices via two pointers: major and minor device numbers. These numbers are stored in the inode for a particular special file.
  • The major device number identifies a device driver for a specific class of devices (a single driver can be used for a number of devices of the same type);
  • the minor device number is a parameter within the specified device driver.
Each device driver has routines for performing necessary functions in its interaction with the device. These basic functions are: probe, attach, open, close, read, reset, stop, select, strategy, dump, psize, write, timeout, interrupt processing, and I/O control (ioctl). The addresses of these functions for each driver (independent of the character and block devices) are stored in the jump table inside the kernel. The major device number indexes the jump tables; this is provided through another table known as the device switch table. Briefly, the mapping is performed in the following way:
  1. the major device number points to the corresponding entry in the device switch table.
  2. The minor device number is passed as a parameter to the relevant function in the device driver.
  3. The device driver is free to interpret the minor number as it sees fit, although in most cases it uses it as a port number (as is the case when a single driver controls multiple devices of the same type).
  4. As soon as the kernel catches the reference, it looks up the appropriate function name in the driver’s jump table and transfers control to it.
  5. To perform a device-specific operation that does not have a direct analog in the filesystem model (for example, ejecting a floppy disk), the ioctl system call is used to transfer a request directly into the driver.
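The major and minor numbers of a special file are visible in ordinary listings; /dev/null, a character special file present on every UNIX system, serves as a safe example (the stat format specifiers below assume GNU stat):

```shell
# In the long listing of a special file, the major and minor numbers
# appear where the file size is normally shown:
ls -l /dev/null

# GNU stat can print the pair directly (%t and %T are in hexadecimal):
stat -c 'type=%F major=%t minor=%T' /dev/null
```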
This treatment of devices in a file-like way is one of the fundamental design elements that make UNIX so powerful. Just as the proven solutions for file ownership, mode, access rights, and protection have been implemented for devices, the same has been done with user commands as well, while the necessary differences in how commands interpret each file type were preserved. We will see what this means in the following example of the copy command:

# cp /path1/filename1 /path2/filename2

This command will copy the contents of the file /path1/filename1 to the file named /path2/filename2, effectively overwriting the file if it already existed, or creating the file if it did not.
However, the command:

# cp /path1/filename1 /dev/console

will copy the file /path1/filename1 to the file /dev/console which is the special file for the physical console terminal. The contents of the file /path1/filename1 will be displayed on
the console screen. As we can see, special files allow I/O operations to be performed with regular interactions among UNIX files.

It is convenient to implement a device driver as an abstraction, even when there is no actual device for it to control. Such devices are known as pseudo-devices; for example, the pseudo-TTY (assigned as PTY) is used to communicate with users over a network. From a higher-level software point of view, a pseudo-device looks like a regular device; consequently, pseudo-devices are transparent to preexisting software, which can use them immediately without any modification.

Special File Names

By convention, special files are kept in the /dev directory. On large systems there may be hundreds of devices, including pseudo-devices.
  • On System V (ATT) flavors, special files are hierarchically organized, with separate subdirectories for different device types: disk, tape, terminal, pseudo-terminal, etc.
  • On BSD platforms, /dev is a flat directory containing all of the special files.
Special file naming is different among different UNIX flavors; however, some common rules are recognized. The following table presents the usual naming algorithms for disk-related special files. Unfortunately, the implemented rules are very restricted and are usually valid only for the specific flavor; naming procedures vary among flavors within the same UNIX platform.

Special File Creation
To create a special file, UNIX provides the mknod command, which has the following syntax:

# mknod filename type major minor

filename A name of the special file to be created
type A type of the special file to be created
c — for a character (raw) type special file
b — for a block type special file
p — for a named pipe (FIFO)
major A major device number (decimal or octal)
minor A minor device number (decimal or octal)

Special files are very small and simple files; they contain only two numbers (major and minor number), which are pointers to corresponding device drivers within the kernel. Only the superuser can create a special device file.

Both BSD and System V flavors often include some kind of utility program to create and install special files; usually this is a script based on mknod commands. One such script is makedev that originates from SunOS 4.1.x.
UNIX administrators like script utilities. First, such scripts make their jobs easier. But the scripts are also very instructive: we can read them to learn precisely how the utility works and to fully understand what happens behind the scenes. In this way we can discover many of the UNIX secrets that are so useful in daily administration.
Special files are special by nature, but they are dressed like regular files. Several years ago a student raised the questions: "Are the ownership and permissions of special files uniform over all UNIX platforms? Their purposes are the same; is there any regularity? How do you recreate a lost special device file?" Although these questions are quite logical, there is no simple answer. Ownership and mode of special files vary among different UNIX flavors, as do special file names. A brief review of several UNIX flavors made several years ago easily proved this, and things have not changed since. When the ownership and mode of the /dev directory and of same-purpose special files are compared across several UNIX flavors, it is very easy to conclude that there is no uniformity: naming, ownerships, and file modes all differ. What to do if a special file is accidentally lost? Do we have to remember them all? The only logical answer is to search for help within the same UNIX flavor: for example, to look up the same special files on another same-flavor UNIX system (if applicable). Other options are to check vendor documentation or to use other flavor-related sources (technical support, newsgroups, the Internet, etc.).

Processes

A process is a single program running in its own virtual address space. A process should be distinguished from a job or a command, which may be composed of many processes working together to perform a specific task. One of the main administrative tasks is to manage UNIX processes. In this section we will cover the main process-related topics.

Process Parameters
This is a brief reminder about process parameters. We will start with the process types and main process attributes. Full understanding of process attributes is crucial for certain administrative activities, as well as for the system security. Other discussed issues are file descriptors attached to a process and process states.

Process Types
The three distinct types of processes are:
  • Interactive processes — Interactive processes are initiated and controlled by a terminal session; they run in the foreground, attached to the standard input STDIN (in a terminal session STDIN corresponds to the terminal), or in the background. Job control (which originated in BSD) allows a foreground process to be sent to the background and vice versa.
  • Batch processes — Processes not associated with a terminal; these are explicitly submitted to a batch queue and executed with a lower priority in sequential order, primarily at off-peak times. Originally, batch processing was not very thoroughly developed on UNIX platforms, but third-party vendors have improved it. Batch processing is very convenient for non-urgent, long-lasting data processing such as iterative calculations and the like.
  • Daemons — Server background processes, usually initiated at the system boot time, which continue running as long as the system is up. Daemons perform different system-related tasks; they wait in the background until some process requires their service.

Process Attributes
There are many attributes associated with UNIX processes. The following paragraphs discuss the major attributes.
  • Process ID (PID) — The PID is a unique identifying number used to refer to the process. It is an integer assigned by the kernel when the process was created and cannot be changed for the lifetime of the process. Crucial for process handling, a process is always identified by its PID.
  • Parent process ID (PPID) — The PPID is the PID of the parent process, which is the process that was directly involved in the creation of the new process. The PPID is not unique, because the same parent process could have a number of child processes. The PPID cannot be changed during the lifetime of the process.
  • Real and effective user ID (RUID and EUID) — The real user ID (RUID) is the UID of the user who started the process; the effective user ID (EUID) is the UID used to determine the user access rights of the process to system resources (objects). The relationship between the two user ID attributes is: RUID = EUID, except if the SUID access mode was set on the program that created the process, and then EUID corresponds to the owner UID of the program (see also the File Permissions section of the text).
  • Real and effective group ID (RGID and EGID) — The real group ID (RGID) is the GID of the group of the user who started the process; the effective group ID (EGID) is the GID used to determine the group access rights of the process to system resources (objects). The relationship between the two group ID attributes is: RGID = EGID, except if the SGID access mode was set on the program that created the process, and then EGID corresponds to owner GID of the program (see also the File Permissions section of the text).
  • Process group ID (PGID)—The process group ID (PGID) identifies the process group that the process belongs to; typically, multiple processes are members of the same process group and they share the same PGID. The PGID is the PID of the process group leader; this is usually the initial parent process. Unlike PID and PPID, which cannot be changed during the life of the process, PGID is under program control and can be changed by the corresponding system call (as is the case with job control). PGIDs are important in the processing of signals in interprocess communications. For example: the invoked shell is the process group leader for all subsequent commands that are members of the created process group; once the user logs out and terminates the shell, all currently running related processes will also terminate.
  • Control terminal (TTY) — The control terminal is the terminal (or pseudo-terminal) associated with the created process — the terminal that the process was started from.
  • Terminal group ID (TGID) — The terminal group ID (TGID) is the PID of the process group leader that opened the terminal, which is typically the login shell. The TGID identifies the control terminal (TTY) for a process group, i.e., the terminal associated with a process. The TGID is important for job control.
  • Current working directory (CWD) — The current working directory (CWD) defines the starting point for all relatively specified pathnames (filenames that do not begin with the “/” character).
  • Nice number — A number indicating the process priority relative to other processes. Generally, a lower nice number means a higher priority; with the usual range of −20 to +20, "lower" means more negative.
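Most of these attributes can be inspected with the ps -o option (pid, ppid, nice, tty, and comm are POSIX format specifiers; the exact column set may vary per flavor):

```shell
# Display selected attributes of the current shell process:
ps -o pid,ppid,nice,tty,comm -p $$

# With a trailing "=" the column header is suppressed:
mypid=$(ps -o pid= -p $$ | tr -d ' ')
echo "my PID is $mypid"
```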

File Descriptors
File descriptors are integers used to identify files that have been attached to a process and opened for I/O. Modern UNIX systems allow each process to open many files (at least 20, typically far more). File descriptors 0, 1, and 2 are associated with the standard input (a keyboard), the standard output (a screen), and the standard error (a screen also), respectively; they are, by default, attached to a newly created process. UNIX provides an easy method of I/O redirection by the simple replacement of the input, output, and error files. In the case of sockets, the descriptors are called socket descriptors.
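Descriptor-level redirection can be sketched directly in the shell (the file names are arbitrary):

```shell
cd "$(mktemp -d)"
# Descriptor 1 (stdout) and descriptor 2 (stderr) sent to separate files:
{ echo "to stdout"; echo "to stderr" 1>&2; } > out.txt 2> err.txt
cat out.txt
cat err.txt

# 2>&1 duplicates descriptor 2 onto whatever descriptor 1 points to,
# so both streams land in one file:
{ echo "line one"; echo "line two" 1>&2; } > merged.txt 2>&1
```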

Process States
The existence of a process does not automatically mean it is eligible to receive and consume CPU time. There are multiple process execution states, as discussed in the following text.
  • Runnable — The process is ready to execute whenever there is CPU time available.
  • Sleeping — The process is waiting for a specific event to occur, or for some resource to become available. Interactive processes and daemons spend most of their time sleeping, waiting for terminal input or a network connection.
  • Stopped — The process is suspended and forbidden to run as the result of a received STOP signal; it can be restarted if it receives a CONT signal.
  • Zombie — The process has finished executing, but its parent has not yet collected its exit status, so the process still occupies an entry in the process table; such a process is "trying to die." Another common term is defunct.
  • Swapped — The process is removed from the system's main (RAM) memory to a disk (more precisely, a process image is removed). This occurs when the competition for memory is intense, a lack of available memory for new processes is obvious, and regular memory paging is unable to solve the problem efficiently. Strictly speaking, swapped is not a true process state, because a swapped process can be in one of the previously mentioned states: sleeping, stopped, or even runnable.

Process Life Cycles
A process lives as long as the corresponding program is running. Process lifetimes range from extremely short up to indefinite, as with daemons (or better to say, "as long as the system lives"). A process starts with its creation and lasts until it terminates (program exit upon completion) or is forced to quit.

Process Creation
In UNIX a new process is created with the fork system call. An existing process, a parent process, makes a copy of itself into the address space of a child process. From the user’s point of view, the child process is an exact duplicate of the parent process, except for two values: the PID and the parent PID. The fork system call returns the child PID to the parent process and “zero” to the child process (thus, a program can determine whether it is the parent or the child process). The fork system call involves three main steps:
  1. Allocating and initializing a new structure for the child process
  2. Duplicating the context of the parent process for the child process
  3. Scheduling the child process to run
The memory organization and layout associated with a UNIX process contains three memory segments called:
  1. Text segment — A shared read-only segment that includes program code
  2. Data segment — A private read-write segment divided into initialized and uninitialized data parts (the uninitialized part is also known as "block started by symbol" (BSS))
  3. Stack segment — A private read-write segment for system and process related data
There are two modes of the fork operation:
  1. A process makes a copy of itself to handle another task; this is typical for network server daemons.
  2. A process wants to execute another program. Since the only way to create a new process is through the fork operation, the process first makes a copy of itself and then the child process issues an exec system call to execute a new program.
In the latter case, the fork is followed shortly thereafter by an exec system call that overlays the address space (text and data segments) of the child process with the contents of the new executable. Such a procedure is also known as fork-and-exec. The new program replaces the contents of the parent process in the address space of the child process, but within the same parent environment. In this way all global environment variables, standard input/output/error, and priority are kept unchanged.
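The fork-and-exec sequence is what a shell performs for every external command; a minimal illustration comparing the parent shell's PID with that of an exec'd child:

```shell
parent=$$                   # PID of the current shell

# Command substitution forks the shell; the child then execs /bin/sh,
# which reports its own (different) PID:
child=$(sh -c 'echo $$')
echo "parent=$parent child=$child"
```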

The ultimate ancestor for every process on a UNIX platform is the process with PID 1, named init and created by the system kernel during the boot procedure. The init process presents a starting point in the chain of process creations; it creates a number of other processes based on fork-and-exec. Among the many created processes are one or more getty processes, assigned to existing terminal lines. Their main duty is to keep the system from unauthorized login attempts; they protect the system from potential intruders, and from the damage they can cause to the system. This is illustrated in Figure 2.2. Different stages of the creation of involved processes are presented, assuming four existing terminal lines.
  1. Four getty processes have been forked-and-exec by the init process. Each getty process is taking care of one terminal line.
  2. Since a user attempts to access the system via a terminal line (more precisely via an attached terminal), getty will exec another program login to supply a login prompt, and to authenticate the user (it will look up the user’s login and password data in the file /etc/passwd); this is shown in the figure for the second terminal line.
  3. Upon login, the login program checks the user's password and sets the user ID, group ID, and working directory. It then execs the user's shell (specified in the user's password entry in the /etc/passwd file). In the figure this is the case with the third terminal line, and the exec-ed shell is the Bourne shell sh.
  4. In the next step, a user executes any command from the shell command line, as the presented ls command on the fourth terminal line. The shell sh forks its copy and then execs the program (command) ls. All presented process IDs are generally specified; however, please note that only fork creates a new child process with a new process ID.

Process Termination
A process terminates either voluntarily, through an exit system call, or involuntarily, as the result of a received signal. In either case:
  1. Termination of the process causes a status code to be returned to its parent process.
  2. The process then cleans up and closes all process-related resources:
• It cancels any pending timers.
• It releases virtual memory resources.
• It closes open descriptors.
• It handles stopped or traced child processes.
After completing those tasks the process can "die," i.e., it can be deleted from the kernel process table.
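The status code returned to the parent is what the shell exposes as $?; a short sketch (the value 143, i.e. 128 + SIGTERM's number 15, is the convention of POSIX shells for a signal-terminated child):

```shell
true
echo "true exited with status $?"     # 0 means success

status=0
false || status=$?                    # capture a non-zero status safely
echo "false exited with status $status"

# A child killed by a signal reports 128 + signal number:
sig=0
sh -c 'kill -TERM $$' || sig=$?
echo "terminated child reported status $sig"
```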

Process Handling
UNIX system administration involves dealing with processes on a regular basis. Monitoring a UNIX system primarily means monitoring running processes. Any change in the configuration usually requires a restart of the corresponding daemons, and occasionally a certain process has to be restarted or destroyed. Handling processes is one of the main tasks in maintaining a UNIX system, and every UNIX administrator very quickly becomes familiar with these issues. This is less true for job control, which is also mentioned at the end of this section. Altogether, the text that follows is a "good appetizer," just for a start.

Monitoring Process Activities
Monitoring the processes running on the system is highly recommended; this is the best way to get a good sense of what normal system activity is like:
  • what programs are run
  • how long they run
  • who runs them
  • ... and so on
In addition, when a problem on a system is encountered, the first step to figure out what the problem could be is to check the status of running processes. You can discover a lot from a simple cross-view of the status of the processes running on your system at a certain time. Such a routine procedure is also very important for system security, because any unusual system activity can be noticed and quickly stopped.

The UNIX ps (process status) command lists the characteristics of running processes; the format of the command is:

# ps [options]

Basic options are explained in the following text. Unfortunately, there are certain differences in command options between the two main UNIX platforms, BSD and System V.

BSD Flavored ps Command The ps command displays the status of currently running processes; without any options specified only the processes that are running with the effective user’s ID and those that are attached to a controlling terminal are shown. Additional categories of processes can be added to the display using certain options:
  • -a option Includes processes that are not owned by the user who issues the command itself; displays all processes attached to the control terminal
  • -x option Includes processes without control terminals; when both -a and -x are specified, ps displays processes owned by anyone, with or without a control terminal
  • -r option Restricts the list of displayed processes to the running processes: runnable processes, those in page wait, or those in short-term non-interruptible waits
  • -l option Displays a long listing with many additional fields; gives a full picture of each displayed process
  • -u option Displays a user-oriented listing with additional user-related fields
In its standard format, ps displays:
  • The process ID, in the PID column
  • The control terminal (if any), in the TT column
  • The CPU time used by the process so far, including both user and system time, in the TIME column
  • The state of the process, in the STAT column
  • An indication of the COMMAND that is running
Here is an example:

$ ps -ax
  PID TT STAT  TIME COMMAND
    0 ?  D     0:07 swapper
    1 ?  IW    0:00 /sbin/init -
    2 ?  D     0:00 pagedaemon
 2087 p1 S     0:00 -csh (csh)
 2091 p1 R     0:00 ps -ax
 1996 p2 IW    0:00 -sh (csh)

The long listing (option -l) and the user-oriented (option -u) formats are different, as seen in the following examples (only the first six lines in the listing are displayed):

# ps -aux | head -6
USER    PID %CPU %MEM  SZ  RSS TT STAT START TIME COMMAND
bjl    2905 30.8  3.3 228  476 p1 R    09:29 0:00 ps -aux
bjl    2906  7.7  1.4  40  200 p1 S    09:29 0:00 head -6
root      2  0.0  0.0   0    0 ?  D    May16 0:00 pagedaemon
bald   2499  0.0  0.0  36    0 co IW   May23 6:23 telnet rs01-ch
root     85  0.0  0.0 352    0 ?  IW   May16 0:36 in.named

# ps -alx | head -6
       F UID PID PPID CPU PRI NI  SZ RSS WCHAN  STAT TT TIME COMMAND
   80003   0   0    0   0 -25  0   0   0 runout D    ?  0:41 swapper
20088000   0   1    0   0   5  0  52   0 child  IW   ?  0:00 /sbin/init -
   80003   0   2    0   0 -24  0   0   0 child  D    ?  0:00 pagedaemon
   88000   0  54    1   0   1  0  68   0 select IW   ?  0:29 portmap
   88000   0  59    1   0   1  0 120   0 select IW   ?  5:40 ypserv

The meaning of the columns in the listings is given below; the letters “u” and “l” indicate the options user and long; “all” stands for both.

The most common format of the BSD-flavored ps command is:

# ps -aux

The output of this command is an extensive listing of process-related data sufficient for most administrative needs.
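In practice the listing is usually piped through a filter. On Linux, for instance, the BSD-style options are written without a leading dash (ps aux); the following sketch keeps the header row and restricts the listing to the current user's processes with awk:

```shell
# BSD-style process listing; on Linux the BSD options are written
# without a dash ("ps aux" rather than "ps -aux").
me=$(id -un)

# Keep the header row (NR == 1) plus every process whose USER column
# matches the current user.
ps aux | awk -v user="$me" 'NR == 1 || $1 == user'
```

Filtering on a specific column is more reliable than a plain grep over the whole line, which can also match unrelated text in the COMMAND field.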

System V (AT&T) Flavored ps Command The ps command displays the status of currently running processes; without any options, only the processes associated with the current terminal are displayed. The basic options are:
  • -e option Displays all processes
  • -f option Produces a full listing, including the process start time
  • -l option Displays a long listing with many additional fields
The regular output of this command is a so-called “short” listing (as opposed to the full or long listing). A short listing contains only the user and process IDs (including parent process ID), terminal identifier, start and cumulative execution time, and the command name. An example of the short listing for all processes follows:

$ ps -e
root      0     0  0 Dec 31     ?     0:05 swapper
root      1     0  0 11:23:17   ?     0:00 init
root      2     0  0 11:23:16   ?     0:00 vhand
dubey  1550  1549  0 08:40:13 ttys0   0:00 -sh
bjl    1618  1591 10 09:25:59 ttys1   0:00 ps -ef

A full or long listing displays many additional pieces of information:

$ ps -ef | head -6
3 S   root  0     0    0 128 20 1e0568     0        Dec 31  ?  0:06 swapper
1 S   root  1     0    0 168 20 2056540 54 7ffe6000 May 16  ?  0:00 init
3 S   root  2     0    0 128 20 2056480  0 1ee3d0   May 16  ?  0:01 vhand
3 S   root  3     0    0 128 20 20564c0  0 1ec4d4   May 16  ?  0:00 statdaemon
3 S   root  7     0    0 128 20 2056500  0 1e8dc0   May 16  ?  0:00 unhash-daemon

$ ps -l | head -5
1 S   201    9444 9443 0 158 20 2151100 52 350c1c ttys1  0:00 sh
1 S     0    9443  106 0 154 20 2151a40 17 221728 ttys1  0:00 telnetd
1 R   201    9473 9472 7 179 20 20d7f40 17        ttys1  0:00 ps
1 S   201    9472 9444 4 154 20 2151680  6 3300e4 ttys1  0:00 head

The column headings and the meaning of the columns in a ps listing are given below; the letters “f” and “l” indicate the option (full or long) that causes the corresponding heading to appear; “all” means that the heading always appears. Note that these two options determine only which information is displayed for a process; they do not determine the processes to be listed.

The most common format of the System V flavored ps command is:

# ps -ef

The full listing provides all the process-related data we need for a successful administration.
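A common administrative use of the full listing is extracting the PID of a known process. A minimal sketch, using a background sleep as a stand-in target process:

```shell
# Start a long-running helper process so there is something to look for.
sleep 300 &
target=$!

# In "ps -ef" output the second column is the PID; match the target
# there rather than grepping the whole line (a plain grep could match
# the grep process itself).
found=$(ps -ef | awk -v pid="$target" '$2 == pid { print $2 }')
echo "found PID: $found"

kill "$target"    # clean up the helper process
```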

Destroying Processes
The UNIX kill command will eliminate a process entirely:

kill [-signal] pid

where (the signal is optional):
  • signal Signal to be sent to the process (default: signal #15 = TERM)
  • pid Process identification number (PID)
BSD allows the user to specify either the signal number or its symbolic name; System V requires the signal to be specified numerically.
The signal #9 (KILL) guarantees that the process will be destroyed.
When a process is killed, it informs its parent process of its imminent termination (death), and waits for the parent’s acknowledgment. After receiving acknowledgment, the PID of the killed process is removed from the process table.
Normally, the kill command is used without an explicit signal, which corresponds to sending signal #15 (TERM); such a command is also known as a soft kill.
Upon receipt of the TERM signal, the process should exit in a normal way by closing all the resources it is using. Occasionally, a process may still exist after a soft kill command. If this occurs, another so-called hard kill has to be applied. By executing the kill command with the signal #9 (KILL signal), a process is forced to exit. However, this kind of process termination is not good for the system because some system resources may remain unclosed and still busy. A hard kill should be used only as a last resort in attempting to terminate a process.
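The soft-then-hard sequence described above can be sketched as a shell fragment; here kill -0 sends no signal at all and is used only to test whether the process still exists:

```shell
# Target process: a sleeping helper started just for this example.
sleep 300 &
pid=$!

kill -TERM "$pid"          # soft kill: signal #15, allows cleanup
sleep 1                    # give the process a moment to exit

if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid"      # hard kill: signal #9, last resort
    result="hard"
else
    result="soft"
fi
echo "process $pid terminated by $result kill"
```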
Processes will not terminate (die) even after being sent the KILL signal if they fall in one of the following three categories:
  1. Zombies — A process in the zombie state (presented as Z status or defunct in ps display) is one in which all of the process’s resources have been freed, but the parent process’s acknowledgment has not occurred. Zombies are always cleared when the system is booted and do not affect system performance.
  2. Processes waiting for unavailable NFS resources — In such a case, a kill command with signal #3 (QUIT) or #2 (INT) should be used.
  3. Processes waiting for a device to complete an operation — For example, waiting for a tape to finish rewinding.
Killing a process also kills all of its child processes that share the same process group. For example, killing a shell usually kills all the foreground and stopped background processes initiated from that shell, including other invoked shells. Killing a login shell is equivalent to logging the user out. It is common for children and parents to belong to the same process group, but this is not necessarily always true (see Job Control at the end of this section).

Although the name kill indicates that the command should destroy a process, its real effect depends on the selected signal that is sent to the process. Sometimes the command does not destroy a process at all, and it can even do the opposite. For example, by sending the signal CONT to a previously stopped process, the process will continue to run; you would not think a “killed” process could be “revived.” In that light, a more appropriate name for the command could be “send signal,” because it better describes what the command is really doing.
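The stop-and-revive behavior is easy to demonstrate. In this sketch, the ps -o stat= form (common on modern systems) is used to read the process state; a stopped process shows state T:

```shell
sleep 300 &
pid=$!

kill -STOP "$pid"          # suspend the process, do not destroy it
sleep 1
stopped=$(ps -o stat= -p "$pid" | cut -c1)   # expect "T" (stopped)

kill -CONT "$pid"          # "revive" the stopped process
sleep 1
running=$(ps -o stat= -p "$pid" | cut -c1)   # back to "S" (sleeping)

echo "state after STOP: $stopped, after CONT: $running"
kill "$pid"                # clean up the helper process
```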

The -l option is available to display a list of signal names:

$ kill -l  (SunOS Solaris)

$ kill -l   (HP-UX)

$ kill -l  (Linux)

As we can see, the order of listed signal names is not necessarily the same. Fortunately, the most important and most often-used signals match. The list of signals with descriptions follows.

Job Control

A job is a collection of one or more processes that share the same process group ID. Job control is a feature that allows multiple processes to start from a single terminal, and also allows some control over their execution. Job control requires support from
  • the terminal driver
  • the signal mechanism
  • the used shell, and
  • the underlying operating system.
Job control allows the user to have multiple jobs sharing a single terminal, to move jobs from foreground to background and vice versa, to suspend and restart jobs, and to perform other miscellaneous activities. A job control-compatible shell makes each child process sent to the background a leader of its own process group. In this way, it makes a child process insensitive to signals sent to the parent shell (recall that signals have an effect on all processes within the same process group). One of the consequences is, for example, that all background processes remain alive upon the termination of the shell (when the user logs out).

There are several job-related UNIX commands, namely jobs, fg, and bg, which are quite comprehensive and easy to use. They are primarily user oriented, although they can play a role in UNIX administration, too.
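A short sketch of these commands in a bash script; set -m enables job control explicitly (it is on by default only in interactive shells), and under job control each background job leads its own process group, so its PGID equals its PID:

```shell
#!/bin/bash
set -m              # enable job control explicitly for this script

sleep 300 &         # becomes job %1
sleep 400 &         # becomes job %2
pid2=$!

jobs                # list both jobs with their job numbers

# Under job control each background job is a process-group leader,
# so its process group ID equals its own PID.
pgid2=$(ps -o pgid= -p "$pid2" | tr -d ' ')
echo "job %2: pid=$pid2 pgid=$pgid2"

kill %1 %2          # terminate both jobs by job number
```

This also illustrates why, under a job control-compatible shell, signals sent to the parent shell's process group do not reach the background jobs.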

UNIX Administration Starters

Superuser and Users
The central entity in UNIX is a file — every activity on the system represents some kind of transaction with or between files. Consequently, administrators of UNIX systems are expected to deal with files, including the special purpose files known as configuration files. Configuring system functions, setting some system parameters, tuning a kernel, and restoring a lost file, all require the appropriate access to the needed data within the file. On the other side, system files always require privileged access. In practice, this means that the administrator has to be a superuser on the system in order to effectively administer the UNIX system.

Becoming a Superuser
On a UNIX platform, the superuser is a privileged user with unrestricted access to all files and commands. The name of this user account is root; the account is protected with a password as with any other user account. There are two ways to become the superuser:
  1. Log in directly as root. This is always possible from the system console; it is recommended that you disable the direct root log-in from other terminals as a security precaution, but this is not a requirement.
  2. Switch from another user log-in account to the superuser’s account by executing the su command.
In both cases the system will prompt for the root password. After entering the correct password, the superuser is logged into the system and has full control over all its resources. The root account is extremely sensitive; one wrong move can easily destroy important files and crash the system itself. Only knowledgeable persons should enjoy superuser status; it is very important to restrict root access only to a certain group of people who are responsible for the system itself. Obviously UNIX administrators should belong to this group.
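Administrative scripts typically verify superuser status before attempting anything privileged; the superuser is identified by an effective user ID of 0. A minimal check:

```shell
# id -u prints the effective user ID; 0 identifies the superuser (root).
if [ "$(id -u)" -eq 0 ]; then
    echo "running as root"
else
    echo "not root: privileged operations would fail" >&2
fi
```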

Communicating with Other Users
The UNIX administrator frequently needs to communicate with other users, mostly to inform them of current administrative activities being performed on the system. Examples include instructing all logged-in users to close their files and log out on time when the system is going to be shut down, informing users when new software is installed, or passing along any other information important for regular system operations. Several UNIX commands are available for this purpose:
  • Sending a message to the user:
    write username [tty]

username User to whom the message is sent
[tty] Optional terminal if the user is logged in to more than one

The text of the message should be typed after the command is issued; typing Ctrl-D (^D) terminates the command. Once the message is terminated, the shell returns the command prompt. The typed text of the message will be displayed at the terminal screen of the addressed user.
  • Sending a message to all users
    wall (stands for “write all”)

The text of the message should be typed after the command is issued; typing Ctrl-D (^D) terminates the command. The typed text of the message will be displayed at the terminals of all logged-in users.
  • Sending the message of the day
The message of the day — “motd” — can be used to broadcast systemwide information to all users. The file /etc/motd keeps an arbitrary message which will be displayed during any user’s log-in procedure. Log-in is probably the most convenient time to catch the user’s attention, because the user is fully concentrated on the output of the log-in procedure. That makes it an ideal time to inform users about changes in the system, newly installed software, and so on.
Any editor can be used to edit the /etc/motd file; the default UNIX editor is “vi.”
  • Sending e-mail to user(s)
E-mail is a convenient vehicle for communicating non-urgent or lengthy messages to users. E-mail is especially convenient for informing users about automated jobs because it is very easy, for example, to send a message about the status of an executed job to the users from the script that ordered the execution.

The su Command
We already mentioned the su command when we discussed how to become the superuser. But the su command does more: it allows an already logged-in user to become another user without logging out. The format of the su command is:

su [ - ] [username [ arg...]]

  • - (dash) Must be specified as the first option when the environment for the specified user is passed along unchanged, as if this user actually logged in. Otherwise, the environment is passed along with the exception of certain environment variables. Please note the differences to avoid any possible confusion regarding the new user environment.
  • username Specifies the name of the new user to whom to switch; the default user name is root. Without a specified user name, the command will try to switch to the superuser.
  • arg... One or more optional arguments to be passed to the new shell; an arg of the form “-c cmd_string” executes the command string using the shell; an arg of “-r” gives the user a restricted shell.
The su command requires the user to supply the appropriate password unless a switch from the root account to another user account is performed. If the password is correct, su creates a new shell process with the characteristics of the specified user (RUID, EUID, RGID, EGID, and supplementary groups). The new shell will be the shell specified in the username’s passwd entry; otherwise the default Bourne shell sh is invoked. To return to the initial user’s account, type exit, or Ctrl-D (^D), to exit the new shell. All attempts to become superuser are logged in the log file /var/adm/sulog. A few examples follow:
  • To become user bjl while retaining the previously exported environment, execute:
    $ su bjl

  • To become user bjl but also change the environment as if bjl had originally logged in, execute:
    $ su - bjl

  • To execute commands with the temporary environment and permissions of user bjl, type:
    $ su - bjl -c command args

UNIX Online Documentation

The man Command

UNIX has integrated online documentation, available to all users and UNIX administrators. It is very hard to imagine successful administration without the extensive online help provided by the UNIX manual pages. Every command, every option, all system calls, and many other details are fully documented and available whenever you need them, and they are always flavor-specific and accurate. The basic online version of the UNIX reference manuals is usually located under the manual page directory /usr/man, with possible additional topics located in other “man” directories /dirpath/man. The environment variable $MANPATH should include all “man” directories to enable a complete search for a selected manual page title; otherwise, the system will not be able to find and display the required manual pages.

UNIX manual pages are divided into a number of sections, each containing similar topics. The basic section organization is presented in the following table:

Modern UNIX flavors introduced new sections that were usually appended to the existing ones. It is entirely possible for the manual pages to be organized somewhat differently on your UNIX system. Sections reside in separate subdirectories beneath the initial “man” directory. Here is an example from the Solaris 2.x platform:

$ ls -F /usr/man
cat-w/    man1f/ man3c/ man3r/  man4/  man7fs/ man9f/
cat./     man1m/ man3e/ man3s/  man4b/ man7i/  man9s/
          man1s/ man3g/ man3t/  man5/  man7m/  manl/
man1/     man2/  man3k/ man3x/  man6/  man7p/  mann/
man1b/    man3/  man3m/ man3xc/ man7/  man9/   windex
man1c/    man3b/ man3n/ man3xn/ man7d/ man9e/

The UNIX man command is available to display specific manual pages. The command has several options, but its basic format is:

# man man_page_title

  • man_page_title A title we are looking for. If the specified title does not exist, or if it is spelled incorrectly, the system informs us; otherwise the required manual pages will be displayed, page by page.
The general format of the displayed manual pages includes the following paragraphs, if applicable:

Linux provides even more: besides the standard UNIX online documentation, Linux also offers the Texinfo manual, which presents more detailed technical descriptions of related topics. Its use is also very simple; typing “info topic-name” displays the required information about the specified topic.

The whatis Database
The man command is very useful for getting information on a specific title; a title can be a command name, a system call, a library item, or something similar, but an existing title must always be specified. When the title is unknown and you are searching for manual pages related to some topic (which is not itself a title), the whatis database comes to the rescue.

UNIX allows you to build the whatis database, which is instrumental in finding information about a certain topic without knowing the relevant manual page title. The whatis database contains all of the manual page titles along with brief descriptions of them; it primarily resides in the /usr/man/windex file (sometimes the file is named whatis), with additional database files in the corresponding “man” directories. The command “man -k topic_item” searches through the whatis database and displays all manual page titles that refer to the specified “topic_item.” Once the relevant title is known, the corresponding manual pages can be displayed. For a better understanding, see the -k option in the manual pages for the man command.

The whatis database must first be created locally; copying a database from another system does not work because the database must be directly linked with existing manual pages on the system where it resides. Additionally, the database should always be recreated when new manual pages are added to the system; the database must integrate the newly available titles.

The UNIX command catman -w is available to create the whatis database. Starting the creation is very easy, but it takes quite a while for the process to finish; it is a good idea to create the whatis database immediately upon UNIX installation. The database should also be recreated whenever new manual pages are added. Some UNIX flavors have introduced new commands for this purpose. In Linux, the whatis and apropos commands are available (they behave almost the same as “man -k”), and the makewhatis command creates the whatis database.
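Conceptually the whatis database is just lines of the form “title (section) - description”, and a lookup like man -k amounts to a case-insensitive search through it. A self-contained imitation (the three entries below are hand-made samples, not a real database):

```shell
# Build a tiny whatis-style database: title, section, one-line description.
db=$(mktemp)
cat > "$db" <<'EOF'
ls (1)   - list directory contents
kill (1) - send a signal to a process
ps (1)   - report process status
EOF

# "man -k process" boils down to a case-insensitive grep like this:
hits=$(grep -i 'process' "$db")
echo "$hits"

rm -f "$db"    # remove the temporary sample database
```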

System Information
UNIX administration means administering UNIX software or, more precisely, UNIX system software. Software requires maintenance just like any other product; but because of their complexity, software systems require a more sophisticated level of maintenance. Among the increased requirements are highly educated and skilled personnel who are capable of managing, upgrading, configuring, and fixing unpredictable and very sophisticated problems.
Software could not exist without the corresponding computer hardware. Knowledge of hardware can be very instrumental and helpful in UNIX system administration. At the very least, a UNIX administrator has to be familiar with basic system hardware configuration.
In the following text, several UNIX commands of this nature will be discussed.

System Status Information
To begin, let us introduce a few commands useful for checking the system status.

The uname Command
The uname command prints basic UNIX system information to the standard output. The displayed system data include the hostname, operating system data, and hardware architecture data.
The format of the command is:

uname [ options ]

where the available options are:
  • -n Print the hostname (the hostname may be the name by which the system is known to a communications network)
  • -s Print the operating system name (default)
  • -r Print the operating system release
  • -v Print the operating system version
  • -m Print the machine hardware name (architecture)
  • -a Print all the above information
The output of the uname -a command for several UNIX flavors is presented in the following table:

Assuming a default system startup, Linux offers more detailed information about the OS in the file /etc/issue. By typing:

$ cat /etc/issue
Red Hat Linux release 7.0 (Guinness)
Kernel 2.2.16 on a 4-processor i686

we will definitely learn more about our Linux installation.
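The uname options listed above can also be exercised individually in a script; a quick sketch:

```shell
os=$(uname -s)      # operating system name
host=$(uname -n)    # hostname
rel=$(uname -r)     # operating system release
arch=$(uname -m)    # machine hardware name (architecture)
echo "OS=$os release=$rel host=$host arch=$arch"

uname -a            # all of the above in one line
```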

The uptime Command
The uptime command displays:
  • The current time
  • How long the system has been up (the length of time)
  • Number of users
  • The load averages: rough estimates of the system load over the last 1, 5, and 15 minutes
Here are a few examples:

# uptime
6:47am up 6 days, 16:38, 1 user, load average: 0.69, 0.28, 0.17   (Solaris)
9:50am up 9 days, 34 min, 3 users, load average: 0.00, 0.00, 0.00 (SunOS)
9:38am up 9 days, 27 min, 1 user, load average: 2.07, 2.03, 2.03 (HP-UX)
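Because the load averages always appear at the end of the uptime line, they are easy to extract in a script; a sketch (some systems print “load average:”, others “load averages:”):

```shell
# Strip everything up to and including "load average:" (or "averages:"),
# leaving just the three comma-separated numbers.
loads=$(uptime | sed 's/.*load average[s]*: *//')

# The three values are the 1-, 5-, and 15-minute averages.
one=$(echo "$loads" | awk -F',' '{ gsub(/ /, "", $1); print $1 }')
echo "load averages: $loads (1-minute: $one)"
```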

The dmesg Command
The dmesg command collects system diagnostic messages; it looks in a system buffer for recently generated messages when errors occur and forwards them to the standard output.
When the “-” option is used, the dmesg command incrementally generates messages that are new since the last time it was executed.

Sometimes, existing imperfections can stay hidden and the system appears to be working fine; in such cases the dmesg command could be very useful. However, the system error message buffer is of a small, finite size, so there is no guarantee that all error messages will be logged.

In the past, the dmesg command was also used to update the system log file (usually /usr/adm/messages) through periodic execution by the cron facility. A typical crontab entry:

/etc/dmesg - >> /usr/adm/messages

would update the system log file periodically. Today such a task is obsolete, and the system log file is updated by the syslogd daemon.

An example follows (from the HP-UX platform):

$ dmesg
May 20 16:59
Floating point coprocessor configured and enabled.
I/O System Configuration:
Block TLB entry #8 from 0xf5000000 to 0xf5ffffff allocated.
HPA1991AC19 Bit-Mapped Display (revision 8.02/10) in SGC slot 0
SGC at select code 0x0
Built-In SCSI Single-Ended Interface at select code 0x20: function number 1
Built-In LAN controller found at select code 0x20: function number 2
HIL interface at select code 0x20: function number 3
Built-In RS-232C Serial Interface at select code 0x20: function number 4
Built-In RS-232C Serial Interface at select code 0x20: function number 5
Parallel port at select code 0x20: function number 6
Advanced Digital Audio Interface at select code 0x20: function number 8
System Console is on the ITE
Networking memory for fragment reassembly is restricted to 2957312 bytes
Swap device table: (start & size given in 512-byte blocks) entry
0 - auto-configured on root device; start = 869400, size = 152702
Core image of 8192 pages will be saved at: block 478283 on device 0x7201600
Warning: filesystem time later than time-of-day register
Getting time from filesystem
B2352A HP-UX (A.09.03.nodebug) #1: Mon Aug 30 21:05:26 MDT 1993
Memory Information:
Physical: 32768 Kbytes, lockable: 26168 Kbytes, available: 27880 Kbytes
Copyright (c) 1990–1998, Rational Software Corporation.
Covered by U.S. patent no. 5,574,898.
Other U.S. and foreign patents pending.
automountd not running, retrying
automountd OK

Hardware Information
It is logical to want to upgrade your UNIX system to improve its overall performance. The first thing you need to know is the current hardware configuration of the UNIX system:
  • How many CPUs are installed?
  • How much memory is used?
  • What is the size of the disk space?
These simple questions are very common, and the UNIX administrator always addresses them.

A partial answer can be obtained with the UNIX command top. The top command lists the top-most CPU-consuming processes. The command is extremely instrumental in performance measurement and the tracing of potential problems. However, the command also displays basic data about the number of CPUs and memory usage, which is what we are looking for right now. An example follows:

# top
System: mekong                                    Mon Jul 17 22:51:28 2000
Load averages: 0.91, 0.77, 0.75
199 processes: 197 sleeping, 2 running
CPU states:
CPU  LOAD   USER   NICE    SYS   IDLE  BLOCK  SWAIT  INTR  SSYS
 0   0.83   1.0%   0.0%   1.4%  97.6%   0.0%   0.0%  0.0%  0.0%
 1   0.99  75.2%   0.0%  24.8%   0.0%   0.0%   0.0%  0.0%  0.0%
---  ----  -----  -----  -----  -----  -----  -----  ----  ----
avg  0.91  38.0%   0.0%  13.1%  48.8%   0.0%   0.0%  0.0%  0.0%
Memory: 49676K (40972K) real, 100316K (83172K) virtual, 196720K free Page# 1/19
CPU  TTY    PID  USER  PRI  NI   SIZE   RES  STATE     TIME  %WCPU  %CPU  COMMAND
 1   q2   27047  cbw1  239  20  4740K  968K  run     173:59  99.09  98.92 udt
 0   ?      398  root  154  20   108K  140K  sleep  1324:09   0.93   0.93 syncer
 0   ?     7448  rpsc  168  20  4484K  696K  sleep    35:57   0.89   0.89 udt
 0   p1    8405  root  178  20  1260K  340K  run       0:00   0.85   0.49 top
 0   ?     6948  root  155   2  6288K 6340K  sleep    28:49   0.41   0.41 lcp

It is also a good idea to try using the available system administration tools, like the HP-UX flavored SAM, or AIX flavored SMIT. These always provide hardware-related information among their many other menu selections. They are very well suited to this purpose, because a search for hardware information is almost always interactive.

Otherwise, each UNIX flavor provides a different set of commands to diagnose the installed hardware. We will discuss some of them (read only what is useful for you).
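On Linux, for example, the three questions can be answered from /proc and a couple of standard tools; a sketch (the paths and tools used here are Linux-specific assumptions):

```shell
ncpu=$(nproc)                                            # how many CPUs?
mem_kb=$(awk '/^MemTotal/ { print $2 }' /proc/meminfo)   # total memory in kB
echo "CPUs: $ncpu, memory: ${mem_kb} kB"

df -h /    # size and usage of the root filesystem
```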

The HP-UX ioscan Command
On the HP-UX platform, the special command ioscan is available for dealing with actual hardware. The command scans system hardware, usable I/O system devices, or kernel I/O system data structures, as appropriate, and lists the results. For each hardware module on the system, ioscan displays (by default) the hardware path to the hardware module, the class of the hardware module, and a brief description of it. By default, the ioscan command scans the system and lists all reportable hardware found. The types of hardware reported include processors, memory, interface cards, and I/O devices. Entities that cannot be scanned are not listed.

The ioscan command recognizes the following options:
  • -C class Restricts the output listing to those devices belonging to the specified class
  • -d driver Restricts the output listing to those devices controlled by the specified driver
  • -f Generates a full listing, displaying the module’s class, instance number, hardware path, driver, software state, hardware type, and a brief description
  • -F Produces a compact listing of fields separated by colons
  • -H hw_path Restricts the scan and output listing to those devices connected at the specified hardware path
  • -I instance Restricts the scan and output listing to the specified instance
  • -k Scans kernel I/O system data structures instead of the actual hardware and lists the results
  • -n Lists device file names in the output; only special files in the /dev directory and its subdirectories are listed
  • -u Scans and lists usable I/O system devices instead of the actual hardware. Usable I/O devices are those having a driver in the kernel and an assigned instance number.
Some of the options require additional arguments, known as fields, which are defined as follows:
  • class A device category, for example: disk, printer, or tape
  • instance The instance number associated with the device or card; it is a unique number assigned to a card or device within a class
  • hw_path A numerical string of hardware components, noted sequentially from the bus address to the device address; typically, the initial number is followed by a slash (“/”) to represent a bus converter (if required by the machine), and subsequent numbers are separated by periods (“.”). Each number represents the location of a hardware component on the path to the device.
  • driver The name of the driver that controls the hardware component
The following example shows a partial output of the ioscan command:

# /usr/sbin/ioscan
H/W Path             Class             Description
8                  bc      I/O Adapter
10                 bc      I/O Adapter
10/0               ext_bus GSC built-in Fast/Wide SCSI Interface
10/0.5             target
10/0.5.0           disk    SEAGATE ST15150W
10/0.6             target
10/0.6.0           disk    SEAGATE ST15150W
10/0.7             target
10/0.7.0           ctl     Initiator
10/4               bc      Bus Converter
10/4/0             tty     MUX
10/4/12            ext_bus HP 28696A-Wide SCSI ID = 7
10/4/12.12         target
10/4/12.12.0       disk    SEAGATE ST32550W
10/12/5.0          target
10/12/5.0.0        tape    HP C1533A
10/12/5.2          target
10/12/5.2.0        disk    TOSHIBA CD-ROM XM-5401TA
10/12/5.7          target
10/12/5.7.0        ctl     Initiator
10/12/6            lan       Built-in LAN
10/12/7            ps2       Built-in Keyboard/Mouse
32                 processor Processor
34                 processor Processor
49                 memory    Memory

The Solaris prtconf Command
On the Solaris platform, the prtconf command displays the system configuration information.
The output includes the total amount of memory and the configuration of system peripherals formatted as a device tree. The prtconf command has several options:
  • -P Includes information about pseudo devices; by default, information regarding pseudo devices is omitted
  • -v Specifies verbose mode
  • -F Returns the device pathname of the console frame buffer, if one exists. If there is no frame buffer, prtconf returns a non-zero exit code
  • -p Displays information derived from the device tree provided by the firmware (PROM)
  • -V Displays platform-dependent information
  • -D For each system peripheral in the device tree, displays the name of the device driver used to manage the peripheral
The following example presents a partial output of the command running on a Sun4/65 series machine:

# /usr/sbin/prtconf
System configuration: Sun Microsystems sun4c
Memory size: 16 megabytes
System peripherals (software nodes):
Sun 4_65
    options, instance #0
    zs, instance #0
    zs, instance #1
    fd (driver not attached)
    audio (driver not attached)
    sbus, instance #0
        dma, instance #0
        esp, instance #0
            sd (driver not attached)
            st (driver not attached)
            sd, instance #0
            sd, instance #1 (driver not attached)
        le, instance #0
        cgsix (driver not attached)
    auxiliary-io (driver not attached)
    interrupt-enable (driver not attached)
    memory-error (driver not attached)
    counter-timer (driver not attached)
    eeprom (driver not attached)
    pseudo, instance #0

The output of the prtconf command is highly dependent upon the version of the PROM installed in the system, and may therefore vary from machine to machine. The "driver not attached" message means that no driver is currently attached to that specific device instance. In general, drivers are loaded and installed (and attached to hardware instances) on demand, when needed, and may be uninstalled and unloaded when the device is not in use.

The Solaris sysdef Command
Another Solaris command that can be used for this purpose is sysdef. The sysdef command outputs the current system definition in tabular form. It lists all hardware devices, as well as pseudo devices, system devices, loadable modules, and the values of selected kernel tunable parameters. It generates the output by analyzing the named bootable operating system file (namelist) and extracting the configuration information from it. The default system namelist is /dev/kmem. However, the command output is not entirely comprehensive for figuring out basic hardware information; it is more suitable for kernel-related information. This command should probably not be the first choice.

Personal Documentation
UNIX administration is a challenging job; it requires a substantial level of expertise and skill. But UNIX administration is also a routine job, in which tasks can be accomplished successfully only by following the required procedures. To install UNIX, you must follow the vendor's instructions and recommendations; to configure an application, you must strictly obey the configuration rules. There is no room for improvisation; improper settings are the main cause of system instability and all related problems. Bugs in the software are a good excuse for our wrongdoings, but only rarely are they the real cause of the problems we experience. Properly configuring a system, and ensuring all of its settings are correct, is not an easy task. Often there are plenty of small but important details to take care of, and it is easy to forget these small issues, especially if we deal with them only occasionally. Taking notes on everything done to the system can be very instrumental in future work; such notes can be a lifesaver in critical situations. These moments are always very stressful, and an administrator has to act quickly and accurately. There is no better advice for that time than to follow your own, already tested and proven notes.

Many administrative tasks repeat a number of times; it is common to install the same UNIX version on different machines, to configure hosts in the same network environment, to set up the same application software multiple times, etc. Any notes about jobs you have done previously can be very helpful; the time between jobs can be long enough that you forget many important details. Note by note, substantial personal documentation will be built; this is your "knowledge database," and it is very important for efficient work. You will always be more familiar with your own documents than with any vendor-provided documentation. There is no need to worry about style, syntax, or language; as long as they are explicit and complete, you will always understand your own texts.

A key issue for successful UNIX administration is to be well organized. System administration is based on rules designed by others: different configuration files have different formats and syntax. Each required letter, number, dot, dash, or whatever is specified must be fully respected — there is not a great deal of freedom of choice. A UNIX administrator cannot invent another set of configuration rules, even if the existing ones do not seem very logical or convenient. It simply will not work. Past experiences can save time and make everything easier; copying a workable procedure is definitely more efficient than reinvestigating something you have already done.

In most cases, UNIX administration is also a team task. It takes a number of UNIX administrators (as well as others such as NT administrators, network administrators, helpdesk staffers, etc.) to support large company networks. One important issue, then, is how to make their collective work more efficient. One logical solution is to combine all individual documentation and make it available to all team members. The organization of this effort, however, is crucial. A very efficient approach to making all system documentation available yet well organized is to put individual personal documents on the company network, creating substantial internal, site-specific documentation, and to make it available to all relevant associates. By posting these documents on an internal company Web site (if necessary, even creating an internal Web site for this purpose), everyone will be able to obtain the necessary information about any described topic, and the documentation remains open for any required update or upgrade. To prevent potential misuse, access to the documents should be restricted to administrative personnel only. There are third-party products that provide tools to create internal knowledge databases, and in most cases they offer other features as well. However, they can be costly and sometimes too complex to work with. Creating your own internal, Web-based documentation site is simple, inexpensive, and very efficient.

Shell Script Programming
Shell programming is one of the strongest parts of UNIX administration, and one of the key elements of UNIX's overall success. UNIX administrators are devoted to shell programming for a simple reason: it is an extremely powerful tool for customizing and automating a UNIX system, as well as for making many manual administrative activities easier. An intuitive and colorful graphical user interface (GUI) sounds appealing for certain complex administrative actions. However, GUI actions remain quite hidden from us. A GUI is great as long as everything is going smoothly, but very frustrating once things start to fail. And what do you do when the GUI is not even running because of underlying problems? How do you automate repeated actions? Even documenting the needed steps of a GUI-based procedure is not an easy task.

A good UNIX administrator tends to pack the needed administrative actions into corresponding shell scripts, and then to use the scripts instead. Well-written and tested shell scripts work properly even in the most critical situations, when the pressure on the UNIX administrator is very high. There are no typos, no mistyped commands, and no incorrect command options (the frequent errors of manual procedures). Everything happens correctly and in the fastest possible way. Simply put, shell scripts are lifesavers. There are many other reasons in favor of intensive shell programming. Time-scheduled scripts will successfully execute the same job as many times as needed, with or without verbose logging, e-mailing, paging, or whatever else is required. We spend the time only once, when we write the script, and afterward simply use it. And when we write a script, we should have enough time, far from the pressure typical of urgent administrative actions.

Shell programming is a prerequisite for good UNIX administration, and it is assumed that a UNIX administrator is familiar with it. This section is not a tutorial in shell programming; rather, it points to certain aspects of shell programming that can be confusing for UNIX administrators (even those who are not beginners in this area). A thorough shell-programming tutorial is definitely beyond our scope here; these skills are assumed knowledge.

UNIX User Shell
The UNIX user shell is an interface layer between the UNIX operating system and the user, as presented in Figure 3.1. There are many different UNIX shell flavors:
  • the Bourne shell (sh),
  • the Korn shell (ksh),
  • the C shell (csh),
  • the Bourne-again shell (bash),
  • the enhanced C shell (tcsh), etc.
Some shells are very similar (for example, ksh and bash; sh is a subset of ksh), but generally they are not mutually compatible, at least not in both directions. This is important to know when a shell script is invoked.

UNIX Shell Scripts
Shell scripts are programs written in the shell programming language. They are not compiled programs; instead, they are readable text files in which each command line is read and processed by the shell command interpreter at the time the script is executed. The shell command interpreter processes a shell script until an erroneous command line is encountered or until the script ends. A shell command line can contain:
  • Any UNIX command or command sequence
  • Any shell-flavored command or statement
  • Any other program or shell script
  • A combination of previously listed items
Each shell has a number of its own commands and statements, and these are what actually make shell programming so powerful. Bear in mind that they are shell-specific in every sense: syntax and action.

Shell Script Execution
A shell script (as any other program in UNIX) can be simply invoked by its name, but the read and execute permissions for the script are required. The following example illustrates this:
sh# cat /tmp/ (to see content)
echo "Just a test of x permission"

sh# ls -l /tmp/ (to see permissions)
-rw-r--r-- 1 root root 39 Aug 21 18:27 /tmp/

sh# /tmp/ (to invoke shell script)
sh: /tmp/ Permission denied
The script can also be invoked with an explicitly specified shell; in that case, the execute permission on the script is not mandatory. Some UNIX flavors will execute a shell script even without read permission granted.
sh# /bin/sh /tmp/
Just a test of x permission
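Alternatively, granting execute permission makes direct invocation work. A minimal sketch (the file name xtest.sh is illustrative, not from the source):

```shell
#!/bin/sh
# Create a small script, then grant execute permission so that it can
# be invoked directly by name.
cat > /tmp/xtest.sh << 'EOF'
#!/bin/sh
echo "Just a test of x permission"
EOF

chmod u+x /tmp/xtest.sh     # add execute permission for the owner
/tmp/xtest.sh               # direct invocation now succeeds
rm -f /tmp/xtest.sh         # clean up the temporary script
```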
When invoked directly, the shell script is executed in the environment of the current user shell. The current user shell is forked, and then each command line of the shell script is processed by the shell interpreter and executed (the fork-and-exec program startup discussed earlier). If the two shell flavors do not match (the shell script and the parent shell; for example, a bash script invoked in a csh environment), a number of errors will most probably be encountered even for a basically correct shell script.

The following examples present such situations. An arbitrary bash script named myscript.bash is invoked in the bash and csh environments:
bash# cat /tmp/myscript.bash
# Define variables
export TEXT1="This is a bash script myscript.bash"
export TEXT2="Running the script myscript.bash"
# Run the command
echo "$TEXT1"
echo "$TEXT2"

bash# /tmp/myscript.bash
This is a bash script myscript.bash
Running the script myscript.bash

bash# /bin/csh (Switch to csh)
csh# /tmp/myscript.bash
export: Command not found.
export: Command not found.
TEXT1: Undefined variable.
The previous problematic situation can be avoided in two ways. First, as mentioned previously, the script can be invoked with an explicitly specified shell:

bash# /bin/bash /tmp/myscript.bash (Here shells match)
This is a bash script myscript.bash
Running the script myscript.bash

csh# /bin/bash /tmp/myscript.bash (Here shells don’t match)
This is a bash script myscript.bash
Running the script myscript.bash
Or the shell can be implicitly specified in the script itself. The very first line of the script, in the format #!/bin/shellname, has a special meaning: "/bin/shellname" identifies the full path of the desired shell, which will be invoked first; the script is then executed in this shell's environment. Note that it can be any other executable program, not necessarily a shell; however, we are assuming a shell here. Here are examples:

bash# cat /tmp/myscript1.bash
#!/bin/bash
# Define variables
export TEXT1="This is a bash script myscript1.bash"
export TEXT2="Running the script myscript1.bash"
# Run the command
echo "$TEXT1"
echo "$TEXT2"

bash# /tmp/myscript1.bash
This is a bash script myscript1.bash
Running the script myscript1.bash

csh# /tmp/myscript1.bash
This is a bash script myscript1.bash
Running the script myscript1.bash
In all the examples, the current shell spawns itself or another shell, making a "parent–child relationship" between the two shells (the current user's shell and the invoked shell script). However, a shell script can also be executed directly in the user's shell environment; for this purpose, the shell script must be "sourced." A special shell command is used to source the script:
source script_file    # for csh and csh-like shells
. script_file         # for ksh, bash, and Bourne shells
To source a shell script means to skip the forking of the user’s shell and to execute the script directly in the user’s shell environment.

Shell Variables
We can define and redefine the shell environment within a shell script. When a new shell script is invoked, the current shell environment is transferred and a new initial shell environment is created. Remember that this is a unidirectional transfer, from parent toward child shell (the child inherits the parent's environment); the reverse is never possible. Regarding shell variables, only global, i.e., exported, variables can be inherited; local variables always remain within the current shell environment, and they disappear once the shell terminates. This sometimes sounds very confusing to novices in UNIX administration. In this light, we can better understand the need for and purpose of the shell command source. If we want to define a shell environment within a single script (let us call it the environment definition script), and then share these definitions among many other shell scripts, we must source the environment definition script. Otherwise, all definitions will last only as long as the execution of the environment definition script itself. The following example illustrates this situation. The user's shell is the Bourne shell. Variables VARA and VARB are not defined.
sh# echo $VARA # To check if $VARA is defined
sh# echo $VARB # To check if $VARB is defined
The script /tmp/ defines the variables VARA and VARB:

sh# cat /tmp/
# Variable definitions
VARA="VariableA"
VARB="VariableB"
Upon the script's execution, variables VARA and VARB are still undefined in the user's shell environment; there is no way to export variables toward the parent shell environment.

sh# /tmp/ # Execute the script
sh# echo $VARA # To check if $VARA is defined
sh# echo $VARB # To check if $VARB is defined
After sourcing the script, variables VARA and VARB remain defined within the user's shell environment.

sh# . /tmp/ # Source the script
sh# echo $VARA # To check if $VARA is defined
VariableA
sh# echo $VARB # To check if $VARB is defined
VariableB
The previous discussion is instrumental in understanding the user’s log-in process and the initial definition of the user’s shell environment.

Double Command-Line Scanning
Shell variables are often used on shell command lines as parts of UNIX or shell commands. Unfortunately, they can sometimes be misinterpreted: under certain conditions, a shell variable may be understood literally, so the variable $VARA from the previous example can be understood as "$VARA" instead of its value "VariableA." Just think of versatile and powerful UNIX commands (or, better said, UNIX utilities) like awk and sed, or other commands that have their own syntax somewhat different from the shell syntax. This makes a great difference and can make the use of shell variables very restricted.

The shell's response to this situation is the command eval. This command provides so-called "double command-line scanning": the shell variables are first processed and expanded, and the resulting line is then submitted for a second round of command-line processing. For a better understanding of this command, let us first see how the shell command interpreter processes a command line at all. This is presented in Figure 3.2 and explained in the following text.
  1. The command line is "tokenized," i.e., split into its constituents (words, keywords, I/O redirectors, and semicolons) according to the separating metacharacters: space, tab, newline, ;, (, ), <, >, |, and &.
  2. The first token is tested to see whether it is "a single-line unquoted keyword" (a keyword without quotes or a continuation character). Shell statements (if, while, until, ...) and functions are treated as "opening keywords" and set up internally; processing continues with the next token.
  3. The command is checked against the list of command aliases; any aliases are expanded and the line is reprocessed.
  4. Tilde expansion, i.e., the substitution of the user's home directory where applicable.
  5. The variable substitution for any expression with a leading $. This is also the second processing step for double-quoted tokens (the steps in between are skipped).
  6. The command substitution for any back-quoted expression of the form `expression` or $(expression). The expression is executed and replaced by the obtained result for additional processing.
  7. The evaluation of the arithmetic expressions of the form $((expression)). Remember that the double-quoted expressions are processed differently from others after this step.
  8. Any newly expanded text (resulting from the previous steps) is now "tokenized" again according to the shell's internal field separators (IFS).
  9. The wildcard expansion of *, ?, and [ ] pairs, and processing of regular expression operators.
  10. The search for the command in all predefined command directories (according to the shell $PATH or $path variable). This is also the second, and the only, step in processing single-quoted command-line tokens.
At this point everything is ready for the command-line execution. However, if the shell command eval was specified, another round of the command processing will be performed. This is known as double command-line scanning.

The format of the command is: eval args, where args includes the actual command itself and its arguments. For a better understanding of this command, see the following example. The user's shell is bash, but this has no specific impact on the example (it could be any other shell).
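A typical illustration of double command-line scanning (the variable name PIPECMD is an assumption for this sketch, not from the source): a variable holds text that itself contains shell syntax, and only eval makes that syntax effective.

```shell
#!/bin/sh
# PIPECMD holds text that itself contains shell syntax (a pipe).
PIPECMD='| wc -l'

# Single scanning: the pipe symbol appears only after variable
# substitution, too late to be recognized as an operator, so echo
# simply prints it literally: one two three | wc -l
echo one two three $PIPECMD

# Double scanning: eval expands the variable first, then rescans and
# executes the resulting line "echo one two three | wc -l",
# so wc -l counts one line of output.
eval echo one two three $PIPECMD
```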

Here Document
An extremely powerful feature of shell programming is the Here Document: a shell redirector of the form:
myprogram << !EOF
mycommandA
mycommandB
mycommandC
!EOF

This shell script command-line sequence starts myprogram and transfers further command-line control to it. The command lines that follow, until the terminating label !EOF, are submitted to and processed strictly by myprogram. The specified label can be any string, but the two labels must match literally; no leading or trailing blanks on the terminating line are allowed. The Here Document enables unattended execution of non-shell and non-UNIX commands within a shell script. It is frequently used to embed SQL, FTP, and other command sequences into the shell environment. Unfortunately, the Here Document does not support interactive procedures; each command line is simply submitted as soon as the previous one is done. Generally, the main disadvantage of shell programming is its inability to act interactively when run unattended; for this purpose, Expect or Perl patches are required. The Here Document makes shell script programming easier and more powerful.
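As a minimal runnable sketch, using /bin/sh itself in place of the generic myprogram:

```shell
#!/bin/sh
# Everything between the two matching labels is delivered to /bin/sh
# as its standard input, as if the lines had been typed interactively.
/bin/sh << !EOF
echo "first command"
echo "second command"
!EOF
```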

A Few Tips
To conclude this brief overview of certain shell programming topics, here are a few tips for using shell scripts:
  • A shell script inherits the caller's environment, usually the user's shell. However, there are no rules for the initial environment setting; everything defined outside the script is uncertain, including the search path for the commands used in the script. Some good advice follows:
Define the PATH variable in the script.
Or, use full-path command names.
  • It is very common for a shell script fully tested from the command line to fail when it is run as a cron job. The reason is simple: the cron environment is reduced to several default values, usually insufficient for successful script execution.
  • Always clean up everything that the shell script creates temporarily. Each file is owned by its creator, and leftover temporary files can be an obstacle for other invokers of the script.
  • Pay attention to the standard and error output. Shell scripts often run in the background, too.
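These tips can be folded into a small defensive preamble; the following sketch assumes illustrative file names:

```shell
#!/bin/sh
# Defensive preamble: do not rely on the caller's (or cron's) environment.
PATH=/usr/bin:/bin:/usr/sbin:/sbin
export PATH

# Create a uniquely named temporary file ($$ is this shell's PID) and
# guarantee its removal on normal exit as well as on hangup, interrupt,
# and terminate signals.
TMPFILE=/tmp/myscript.$$
trap 'rm -f "$TMPFILE"' 0 1 2 15

echo "some intermediate data" > "$TMPFILE"
cat "$TMPFILE"
```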

System Startup and Shutdown

Introductory Notes
UNIX systems run continuously under normal circumstances. Shutting down and powering off a UNIX system should be done rarely, usually only when a hardware upgrade is being performed, a system is being relocated, or occasionally when another action requiring a system shutdown is performed. In real life, system shutdown is more frequent, because unpredictable situations always occur.

Power-cycling a UNIX system is not the only way the system can be shut down. Rebooting is also a familiar task for any UNIX administrator; UNIX administrators know well how system rebooting can be healthy for overall system maintenance. Nevertheless, keeping the UNIX system running is the most visible task of a system administrator. If the system crashes, everyone will complain, your phone will ring constantly, and you will find yourself anxiously trying to fix the problem and bring the system back into production. Quickly you will learn how important the system you are in charge of really is, and how many users depend on it. Even more important, you will learn how crucial a smooth, fast, proper system startup can be.

Here we cover topics related to normal UNIX system startup and shutdown procedures. Invoking a system startup and shutdown is quite simple; the main requirement is to be the superuser on the system (an easy task for an administrator). On the other hand, making the system behave correctly, especially during startup, requires a great deal of knowledge and administrative skill. Proper system startup is supposed to customize and set the myriad of existing system configuration files that will control each portion of the UNIX system. Some of these files include system-related configuration data, but there are also site-added applications; the bottom line is that the system should be fully operational after any system startup.

Discussing the administration of a running UNIX system without knowing how that system came to be running seems strange; it is as though we were talking about administering a nonexistent UNIX system. This material therefore appears at the beginning by design; it focuses on global system startup and shutdown, and we will return to individual startup and shutdown issues later, wherever appropriate in discussing specific UNIX topics.

From an administrative standpoint, system shutdown is the simpler procedure; at the end of the procedure a system must terminate all running processes, dismount all filesystems, and stop any other system activity.
System shutdown works even if we never touch the default shutdown procedure — or perhaps it is better to say it mostly works, because the author of this text has witnessed a UNIX system that could not be shut down from the command line, and the only choice was to power-cycle the system.
Our administrative task is to provide a graceful system shutdown. Everything must be stopped in a regular way, or the administrator will have to use the brute force method of power-cycling.

System startup, on the other hand, must be done properly or the system will never come up. Obviously, more attention should be paid to system startup, and we will spend much more time discussing the startup procedure than the shutdown process.

System startup is often referred to as system booting. Although "booting" specifies only one phase in the overall system startup, the two terms are commonly interchanged; strictly speaking, system startup has a broader meaning than system booting. All UNIX systems must be shut down in a regular way before any further action can be taken. You should never simply power off a UNIX system the way you might a DOS-based PC; the shutdown procedure must be carried out, or disk data integrity can be compromised (a UNIX filesystem could be damaged). The corruption can range from a relatively benign loss of data to heavy filesystem damage, which in the worst-case scenario can leave the system unbootable.

The two major UNIX platforms, BSD and System V, have different startup and shutdown procedures, with the main differences, of course, occurring in startup. Among existing commercial UNIX flavors, the System V approach is more common; it provides more flexibility and some other administrative advantages. However, the BSD approach is somewhat easier to understand, and we will start our discussion with the BSD startup/shutdown procedure. Once the startup/shutdown concept is well understood, it will be easy to continue with the System V procedure.

System Startup
The system startup procedure is a continuous process that a UNIX system goes through, from its initial hardware-determined stage until the final production-ready stage. However, this unique system journey passes through several distinct phases, and each of these phases has its specific characteristics. The startup phases, listed in the order they occur, are:
  1. Bootstrap program execution
  2. Kernel execution
  3. rc system initialization
  4. Terminal line initialization
It is easier to understand the system startup procedure when the whole process is divided into several phases and each of the phases is analyzed separately, so this is the approach we will take. Although each of the listed phases is equally important for successful system startup, the system initialization phase requires the most administrative attention, so most of the following discussion will address this phase.

In each of the startup phases, the system learns enough to execute the next phase. Each phase contributes a bit to the overall system startup. At the very beginning, the system does not know very much; at the very end, the system is ready for multi-user operations.

The Bootstrap Program
The origin of the word boot (as in, "to boot the system") is bootstrapping, which is the process of bringing a computer system to life and making it ready for use. ("Bootstrapping" is actually the nerd word for starting up a computer.) The computer system itself is only
a collection of hardware resources (registers, arithmetic/logical unit, program counter, memories, etc.) capable of executing a sequence of instructions that make a program. The program, stored in the computer’s memory (any kind of memory: ROM, RAM, external
magnetic memory, etc.), defines the system’s activity at every moment, including its first steps during the system startup.

An initial program, the bootstrap program, must be stored in the fast non-volatile memory
directly accessible by a processor, or CPU (CPU stands for central processing unit and is another term for a processor). This portion of the computer memory is known as internal read-only memory (ROM). The execution of the bootstrap program is always automatically initiated when the system is powered-on or when a system hardware reset is applied. It is also initiated when the system is rebooted from the system console. Only the initial part of the bootstrap program, the part sufficient to bring the system into a workable state to deal with other memory types, must be stored in ROM. Once this level is achieved, the bootstrap program execution can be continued from another non-volatile media such as a hard disk, a floppy disk, a tape, or a CD-ROM, or even through the network from a boot-server (in the case of diskless workstations). For UNIX systems, regular system booting is commonly executed from a hard disk, while first-time UNIX OS installation is performed from a CD-ROM (not long ago, a tape was used). The system has to learn enough from the ROM to be able to access a disk to continue the bootstrap program, but it still assumes a simple flat data structure on the disk. A complex disk data organization such as the UNIX filesystem data structure is still too complicated for the system at this stage; more learning is needed to deal with a filesystem. That is why the rest of the bootstrap program is stored in a special part of the disk known as the boot partition (sometimes also known as the boot segment). The main characteristic of the boot partition is its easy access and flat data structure, so the system is able to continue with
the bootstrap program execution, and further learning. The ratio of the bootstrap program stored in ROM versus the disk boot partition has varied over time. In the early days of UNIX, when only low-capacity, expensive ROM was available, the first part of the bootstrap program was reduced to the bare minimum size. Today, systems include high-density ROM sufficient to store quite sophisticated bootstrap programs; this makes boot partitions less important, although they are still a part of every system startup. Once the bootstrap program is completely executed, the system is knowledgeable enough to continue with the kernel execution.

Traditionally UNIX presents the only OS running on underlying hardware; and traditionally this is a proprietary hardware for that UNIX flavor. This fact makes a booting process unique and quite straightforward. However, once PC hardware also became common in the UNIX arena, a more flexible booting, with UNIX as one of several choices, emerged as a preferable system characteristic. Linux is an example. On the Linux platform, the three most common booting mechanisms are:
  • To boot Linux from the floppy, and leave hard drive for other OSs
  • To use the Linux loader (LILO) or nowadays GRUB, the most common case
  • To run Loadlin, an MS-DOS program that boots Linux from within DOS
What is exceptional about LILO is the possibility of configuring the loader in different ways to match different needs; multiple-choice booting, including a non-UNIX startup, is also possible. The configured loader should then be installed in the boot sector of the first disk, known as the MBR (master boot record). When the system is started, the PC BIOS transfers control to the MBR and triggers the corresponding LILO boot. Linux provides easy LILO configuration through its /etc/lilo.conf configuration file, and the command lilo installs the result into the MBR.
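As a sketch only (the device names, kernel path, and labels are illustrative assumptions, not taken from the source), a dual-boot /etc/lilo.conf might look like:

```
boot=/dev/hda          # install the loader in the MBR of the first disk
prompt                 # display a boot menu
timeout=100            # wait 10 seconds, then boot the default entry
default=linux

image=/boot/vmlinuz    # a Linux kernel image
    label=linux
    root=/dev/hda1
    read-only

other=/dev/hda2        # a non-UNIX system on another partition
    label=dos
```

After editing the file, running the lilo command writes the configured loader into the MBR.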

The Kernel Execution
The bootstrap program is responsible for loading the UNIX kernel into the system memory. The kernel image, originally named unix under System V, or vmunix under BSD, is intentionally located in the root filesystem, because the root filesystem is the first filesystem the system mounts to access data. Mounting is a UNIX-specific procedure that makes data on the disk accessible. In the past, the kernel image was located in the root directory for easier access, but today, it usually resides in a separate subdirectory. We do not refer to “mounting” the kernel; we usually just say that the kernel image was loaded into the system memory and its execution was started.

The kernel manages all system hardware; all hardware drivers are part of the kernel, and the only OS access to the system hardware is through the kernel. The system hardware therefore becomes available upon the completion of this phase. Once control passes to the kernel, it prepares itself to run the system by initializing its internal tables and completing the hardware diagnostics that are part of the boot process; the level of diagnostics implemented varies from one UNIX flavor to another. At the very end, the kernel verifies the integrity of the root filesystem, remounts it, and starts three programs that create three basic processes. Two of them, named kernel processes, function wholly within the kernel in the kernel's privileged execution mode; they are actually portions of the kernel itself, only "dressed" like processes for scheduling reasons.
On BSD systems, the two processes are:
  1. Swapper (process #0), responsible for “swapping” — scheduling the transfer of whole processes between the main system memory and a mandatory swap partition on the primary disk when system resources are low
  2. Pagedaemon (process #2), responsible for supporting the memory-management system with respect to paging — the regular transfer of pages of data between the main system memory and the disk
On System V systems, the processes are named differently: sched for process #0, while process #2 is replaced with various memory handlers.
The third process created by the system is the init process (process #1), which performs all administrative tasks during the system startup and shutdown. The init process is an extremely important process that enables the creation of all subsequent processes (in UNIX a process can be created only by another parent process).
The init process has the PID = 1, and it is the ancestor of all subsequent UNIX processes and the direct parent of each user’s login shell.
In the case of diskless workstations, the procedure is slightly different. Obviously, the kernel cannot be read from a nonexistent root filesystem; it must instead be downloaded over the network. Further kernel activities are adapted to the diskless environment.

The kernel is quite verbose and it prints messages on the console that report on the current execution status, total memory used and free, and some other information. However, the information reported varies among different UNIX flavors.

The Overall System Initialization
The init process does the rest of the work needed to bring the system into its final stage:
  • Mounting the remaining local disk partitions
  • Performing some filesystem cleanup
  • Bringing on major UNIX subsystems (accounting, printing, etc.)
  • Setting the system’s name and time zone
  • Starting the network
  • Mounting remote filesystems
  • Enabling user logins

rc Initialization Scripts

Most of the initialization activities are specified and carried out by means of the system rc initialization scripts stored in the /etc directory and its subdirectories. The rc initialization scripts are usually named so that they include the acronym rc (as a prefix, a suffix, or as part of a full pathname).
rc stands for run-command, which basically explains the purpose of these scripts.
These mostly Bourne shell programs are organized differently on the BSD and System V platforms, although their purpose is the same. As with any other script, rc initialization scripts are readable, so we can manage them in a very comprehensible way. Besides that, rc scripts are sufficiently verbose during execution, which is a great help if the system hangs midway through the startup, or if there are any other related problems. The main administrative activities are related to this phase. Site-related system customization means editing the rc initialization scripts:
  • any system upgrade means upgrading (or adding) rc initialization scripts
  • any startup modification means modifying the rc initialization scripts.
The rest of this section exclusively addresses these issues. Afterward, a full picture of the necessary administration in this segment should be complete.

Terminal Line Initialization
The terminal line initialization is a part of the overall system initialization; however, the initialization technique implemented here is quite different from that of the rc system initialization, which is sufficient reason to handle this topic separately. UNIX is extremely cautious with terminal line initialization: terminal lines are “gates” to the outside world. Users access the system via terminal lines, and the essence of UNIX’s existence is to serve users. Once the initialization scripts have been executed, the system is fully operational, except for the fact that no one can log in to the system. In order to provide login via a particular terminal line, there must be a corresponding controlling process listening on it (usually the getty process, but ttymon on the Solaris platform). At the final initialization phase, init spawns getty processes on all indicated terminals, and the startup procedure is completed. Today, users typically log in over a network using pseudo-terminals; however, the getty program still does its job.
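On System V style systems, this spawning is typically driven by inittab entries with the respawn action, so that init restarts a terminal’s listener whenever it exits. A hypothetical fragment follows; the device names and the getty variant differ per flavor:

```
co:2345:respawn:/sbin/getty console 9600
t1:2345:respawn:/sbin/getty tty1 9600
```

The respawn action is what keeps a login prompt permanently available on each configured line.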

System States
Once the initialization activities are completed, the UNIX system enters the multi-user mode, and users may log in to the system. But init can also place the system in single-user mode instead of completing the initialization tasks required for multi-user mode.

The single-user mode corresponds to a functionally reduced UNIX system. In single-user mode, a UNIX system looks very much like a personal computer.
  • The single-user mode is primarily dedicated to administrative and maintenance activities that require complete control over the system.
  • The user has all superuser privileges. In some cases, the system will automatically enter single-user mode if there are problems in the boot process that the system cannot handle on its own (for example, filesystem problems that fsck cannot fix), so that the system administrator can resolve the problem. In that case init simply spawns a Bourne shell on the system’s console and waits for it to terminate before continuing with the rest of the startup sequence. Entering CTRL-D or the exit command at the shell prompt terminates the spawned single-user shell.
  • Once this is done, the system may continue into multi-user mode.
Single-user mode represents a minimal system startup with no daemons running, so many UNIX facilities are disabled. Only the root filesystem is mounted (in the most common case) and a restricted number of commands are available (commands residing in the root filesystem). Under normal circumstances, other filesystems can be mounted by hand to access other commands.

Single-user mode can be a security problem for a system, because full control over the system is granted.

  • On older UNIX systems, no password was required, but physical access to the system was required for single-user mode.
  • On some systems, a front panel lock with normal (secure) vs. maintenance (service) key positions enabled multi-user vs. single-user mode; the system protection was the key, and only authorized personnel could acquire the key.
  • Modern UNIX systems usually require a root password to enter single-user mode.
None of these approaches is perfect, and each has some disadvantage. A request for the root password can create difficulties under certain circumstances, for example if the root password has been forgotten. While a BSD-flavored system can be in one of three possible states (off, single-user mode, and multi-user mode), the System V platform explicitly defines a series of system states, called run-levels, each designated by a one-character name. System V run-levels are flavor-dependent; an example is listed in the table above. To display the current system run-level, the following command is available:

$ who -r
. run-level 3 Mar 14 11:14 3 0 S

The system was taken to run-level 3, from run-level S, via run-level 0, on March 14 at 11:14. The leading dot appears at the beginning of the line by default.
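The fields of this output can be picked apart with standard tools. A minimal sketch that extracts the run-level from a captured sample line (on a live System V system you would pipe the real who -r output instead):

```shell
#!/bin/sh
# Extract the current run-level from "who -r" style output.
# A captured sample line is used so the sketch is self-contained.
line='.        run-level 3  Mar 14 11:14    3    0  S'
runlevel=$(printf '%s\n' "$line" | awk '{print $3}')
echo "current run-level: $runlevel"
```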

On the System V platform, movement between run-levels is managed by init, and each run-level is controlled by its own set of initialization scripts.

The Outlook of a Startup Procedure
UNIX systems are configured to boot automatically when powered on. If this is not possible, the system enters some form of the “ROM monitor mode”: a restricted ROM-resident command interpreter that enables essential diagnostics, booting, and some other basic system activities. The ROM monitor mode is also the state that the system enters after being shut down; in that state, a system can be safely powered off. On some systems there is also a keystroke combination to enter this mode; for example, on Sun Microsystems systems, pressing STOP-A brings up the specific ROM monitor prompt “OK>.”

The ROM monitor always provides a boot command, specified as “b” or “boot,” among the other commands it offers. Options sufficient to control the system startup when problems are encountered (to boot the system from different media, into different modes, etc.) are also provided. The default boot medium is the hard disk. On old UNIX systems, manual booting from the ROM monitor was a two-stage procedure:

  1. The boot command first loaded a boot program with a stand-alone shell (actually a mini-operating system).
  2. A second command was then issued in the stand-alone shell to load the UNIX kernel.
This two-step procedure looked like this:

$$ unix

Different prompts mark the two steps in the boot procedure. The technology available in the past limited the bootstrap program’s possibilities, which led to a more complicated startup procedure. Today all UNIX flavors provide a relatively verbose system startup; a number of messages are directed to the console indicating the stage and status of the startup procedure. It is highly recommended that you monitor the system startup on the console; otherwise, some trouble messages can remain undetected, which leads to a high probability of later surprises. The startup sequences for the two system user modes are presented in Figures 4.1 and 4.2. The UNIX system named “atlas” is running Solaris 2.x; brief comments follow.

  1. The Sun logo and first five lines are printed from the bootstrap program. These lines list basic system configuration and identification data, as well as the kind of boot device.
  2. The somewhat cryptic description of the boot device indicates a SCSI disk.
  3. The kernel prints only two identification lines that include the system version and release.
  4. Other lines are printed from initialization scripts invoked by the program init.
  5. One of the lines indicates that the system was customized. The message that indicates the start of the HTTP service is not a part of a regular OS installation — obviously, this site has been customized to provide an Internet service.
  6. At the end, the login prompt is displayed upon the console initialization.
The startup procedure includes filesystem checking, one of the most important activities performed by the fsck utility. The filesystem verifications are different on BSD and System V platforms.

  • BSD checks all filesystems on every boot;
  • System V skips checking filesystems that were dismounted normally when the system last went down (the fsstat command is used for this purpose), which enables faster booting.
Filesystem checking can result in the display of many messages depending on the current filesystem status. If more serious filesystem corruption is encountered, the system is left in single-user mode, and manual filesystem checking and repair by the administrator may be required.

A single-user startup sequence is much shorter, and it includes the boot and kernel lines. The next two lines about the network interface configuration and host’s name are printed from corresponding initialization scripts involved in the system single-user startup. Finally, the console is activated and the user is informed of two possibilities:

  1. Enter the system in single-user mode by entering the root password
  2. Or continue with multi-user startup by entering [Ctrl-D]
If [Ctrl-D] is entered, the system continues with the multi-user startup, as in the previous case.

Initialization Scripts
Once the init process is born, the system startup is determined by a series of rc initialization scripts that define a detailed procedure to bring the system into multi-user mode. This is the most common case, although other system modes (run-levels) are also possible. These files control all custom-defined and site-dependent items (there are multiple rc initialization scripts), and they are executed sequentially. Generally, rc initialization scripts are Bourne shell script files, executable at any time and on any UNIX platform. (The Bourne shell is the default shell, and it is available at the very early system stage on every UNIX platform.) The rc initialization scripts do not differ from any other shell script, except in the time of their execution. (This, by the way, is why the prefix “rc” is used in their description, as well as in their names.) They can also be executed from the command line at any time, and administrators make full use of this opportunity: on System V, individual function-specific initialization scripts are often used to stop and start specific UNIX functions during regular system production. On modern UNIX platforms, Korn shell rc initialization scripts are sometimes also included (for example, on the HP-UX platform), which indicates the early availability of the Korn shell there.

Understanding rc initialization scripts is a vital part of system administration — this is the place for system customization. A system administrator must be familiar with these files, their locations and, in many cases, their contents. Only then is full control over the system startup possible, and quick corrective action can follow any problem encountered during system boot time. Each modification in the initialization scripts must be done very carefully with respect for the basic administrative rule: save original script files before making any changes. If this rule is not followed, various problems can ensue.

Despite the fact that rc initialization scripts on both UNIX platforms BSD and System V serve the same purpose, the mechanisms by which they are initiated and executed are quite different. These differences require great attention, knowledge, and skills from system administrators working in a heterogeneous environment, which is very common today. Today, the System V rc approach prevails — the System V organization of the rc initialization scripts offers more flexibility and other administrative advantages. We will discuss System V initialization in greater detail after a quick survey of the BSD-style initialization.

BSD Initialization

The BSD rc Scripts
Originally, the BSD initialization was controlled by only two rc initialization scripts: /etc/rc and /etc/rc.local. General system initialization was supported by the /etc/rc script, while the /etc/rc.local script referred to the local site, i.e., to issues that should be customized (a different script name might have been more appropriate, to avoid possible confusion with the “network vs. local” association).

  • During system booting to the multi-user mode, init executed the rc script,
  • which in turn executed the rc.local script.
  • If a single-user boot was performed, scripts were only partially executed; the remaining parts were executed when the single-user shell was exited.
With only two rc initialization scripts, one might believe that system maintenance was easy, but the reality was quite the opposite. The work required for system initialization remained the same regardless of how many rc scripts were involved, and huge rc script files were more difficult to manage and more vulnerable to corruption during modification. It could be very difficult to find an appropriate control sequence, items were often duplicated, and so on.

SunOS introduced additional script files: /etc/rc.boot and /etc/rc.single.

  • The program init first invokes the rc.boot script, and
  • from there rc.single (regardless of whether the system is booting to single- or multi-user mode);
  • then the /etc/rc and /etc/rc.local files follow.

BSD Initialization Sequence

For a clearer picture, the block diagram of the SunOS execution sequence is presented in Figure 4.3 (it is assumed the system is booting from the local disk). The SunOS organization made a clear distinction between single-user and multi-user modes; it was immediately easier to follow any problems that developed during system booting. To make system customization easier, SunOS provided a special interactive script named /usr/etc/install/run_configure that was invoked only once, the very first time the system was started after OS installation. Through the provided dialogue, the required parameters, such as system name, time zone, date, time, and network data, were entered. The system administrator had to answer a number of questions, and the new system and network data were saved for future use. The dialogue was performed via the system console. Upon successful completion, the program was never invoked again; subsequent modifications could be made directly in the rc scripts.
In the single-user mode, the only way to communicate with the system is via the console; other terminals are not initialized at all. SunOS assumes that anyone who has physical access to the console is an administrator, because from the console it is easy to gain full control over the system. There is no additional system protection. All rc files live in the /etc directory; this is an example from SunOS 4.1.3:

$ ls -l /etc | grep rc
-rw-r--r--     1 root 2993 Jan 20 1996 rc
-rw-r--r--     1 root 5476 Jun 23 1996 rc.boot
-rw-r--r--     1 root  352 Jan 20 1996 rc.ip
-rw-r--r--     1 root 6169 Aug 3 1997  rc.local
-rw-r--r--     1 root 5911 Jan 20 1996 rc.local.orig
-rw-r--r--     1 root 2172 Jan 20 1996 rc.single

We can easily recognize all of the listed files; the file rc.local was modified according to the local (site) requirements, and the original file was saved. An exception is the file rc.ip, which is used to start up diskless systems. All of the listed files are excellent examples of what shell scripts should look like; they were written by extremely skillful programmers, and it is a good idea to read them to learn the art of shell programming. However, that is beyond the scope of this text.

The description of the BSD system startup should be sufficient to explain how a UNIX system is brought into an operational stage. To conclude this discussion, a brief additional report related to this topic is presented, taken directly from the manual pages for the rc files on the SunOS platform. Nevertheless, there are some discrepancies between the actual initialization scripts and this report, even though the described scripts and manual pages belong to the very same system. This is not so unusual, and a UNIX administrator must be prepared for such surprises: the supplied online documentation simply does not always fully track all system changes and upgrades.

$ man rcfiles
rc, rc.boot, rc.local — command scripts for auto-reboot and daemons
rc and rc.boot are command scripts that are invoked by init(8) to perform filesystem
housekeeping and to start system daemons. rc.local is a script for commands that are
pertinent only to a specific site or client machine.
rc.boot sets the machine name and, if on SunOS 4.1.1 Rev B or later, invokes ifconfig,
which uses RARP to obtain the machine’s IP address from the NIS network. Then a
“whoami” bootparams request is used to retrieve the system’s host-name, NIS domain
name, and default router. The ifconfig and hostconfig programs set the system’s host-
name, IP address, NIS domain name, and default router in the kernel.
If coming up multi-user, rc.boot runs fsck(8) with the -p option. This “preens” the disks of
minor inconsistencies resulting from the last system shutdown and checks for serious incon-
sistencies caused by hardware or software failure. If fsck(8) detects a serious disk problem, it
returns an error and init(8) brings the system up in single-user mode. When coming up
single-user, when init(8) is invoked by fastboot(8), or when it is passed the -b flag from
boot(8S), functions performed in the rc.local file, including this disk check, are skipped.
Next, rc runs. If the system came up single-user, rc runs when the single-user shell
terminates (see init(8)). It mounts 4.2 filesystems and spawns a shell for /etc/rc.local,
which mounts NFS filesystems, runs sysIDtool (if on SunOS 4.1.1 Rev B or later) to
set the system’s configuration information into local configuration files, and starts local
daemons. After rc.local returns, rc starts standard daemons, preserves editor files, clears
/tmp, starts system accounting (if applicable), starts the network (where applicable),
and if enabled, runs savecore(8) to preserve the core image after a crash.

System V Initialization
System V organizes the initialization procedure in a more flexible, but also more complex, way, using up to three levels of initialization files. During a system startup, when init takes control from the kernel, it scans its configuration file /etc/inittab to learn what to do next. We should recall that System V can have multiple run-levels. The file /etc/inittab defines init’s action whenever the system enters a new level; the commands to execute at each run-level are specified in the corresponding inittab entries. Usually, the entries are initialization script files named rcn (where “n” is the run-level number); the script files themselves are located in the directory /etc, or sometimes in /sbin (on the HP-UX platform). The various rcn scripts in turn invoke other scripts that reside in the corresponding subdirectories rcn.d (again, “n” represents the specified run-level). A simplified version of the System V rebooting procedure is illustrated in Figure 4.4; the rebooting procedure

  1. first shuts down a system (the run-level 0) and then
  2. brings a system into a normal operating state (in this case the run-level 2).

The Configuration File /etc/inittab

We will start with init’s configuration file /etc/inittab; here is an example:

$ cat /etc/inittab (from Red Hat Linux, partly presented)
# inittab This file describes how the INIT process should set up
# the system in a certain run-level.
# Default run-level. The run-levels used by RHS are:
# 0 — halt (Do NOT set initdefault to this)
# 1 — Single user mode
# 2 — Multi-user, without NFS (The same as 3, if you do not have networking)
# 3 — Full multi-user mode
# 4 — unused
# 5 — X11
# 6 — reboot (Do NOT set initdefault to this)
id:2:initdefault:
# System initialization
si::sysinit:/etc/rc.d/rc.sysinit
l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6
# Things to run in every run-level.

Each entry in the /etc/inittab file is of the form:

cc:states:action:process

with the following definitions of the individual fields:

  • cc   Two-character case-sensitive label identifying the entry (some newer implementations allow up to 14 characters)
  • states   A list of the run-levels to which the entry applies; if blank, the entry applies to all run-levels
  • action   One of the following keywords:
      wait   Start the process and wait for it to finish before going on to the next entry for this run-level
      respawn   Start the process and automatically restart it when it dies
      once   Start the process if it is not already running; do not wait for it
      boot   Only execute the entry at boot time; do not wait for it
      bootwait   Only execute the entry at boot time and wait for it to finish
      initdefault   Specify the default run-level for system reboot
      sysinit   Used to initialize the console
      off   Kill the process if it is running
  • process   The command to execute
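Taken together, a concrete entry from the listing above can be split into its four fields with a short shell sketch:

```shell
#!/bin/sh
# Split an inittab entry into its four colon-separated fields.
# The sample entry is taken from the Red Hat excerpt above.
entry='l3:3:wait:/etc/rc.d/rc 3'
IFS=: read -r cc states action process <<EOF
$entry
EOF
echo "label=$cc states=$states action=$action process=$process"
```

Note that read assigns everything after the third colon to the last variable, so a process field containing spaces (here, the run-level argument) stays intact.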
The system scans the inittab entries from the top down, checks whether they belong to the current run-level, and executes them sequentially, respecting the contents of the entry fields. Let us analyze the previous example.

  1. The first entry, named “id,” is not an executable one; this entry (with the action “initdefault”) specifies the default run-level (here, run-level 2) to be used when the run-level is not explicitly specified to init itself.
  2. The following entry, “si,” marked as “sysinit,” must be executed first to make the console and some other initial items operational. The specified initialization script /etc/rc.d/rc.sysinit performs many of the “housecleaning” jobs that prepare the system for the run-level-specific scripts that come afterward.
  3. The run-level scripts for the different run-levels are specified by the subsequent inittab entries labeled l0 to l6, for run-levels 0 to 6; each of them invokes the same rc initialization script, /etc/rc.d/rc, with an argument that specifies the run-level (0 to 6). This script in turn invokes the specific “stop” and “start” scripts needed for that run-level’s initialization. This part of the /etc/inittab file is crucial to our discussion; the inittab entries not presented here relate to other required general initialization tasks such as power supply control, terminal line initialization, etc.
Linux locates the rc initialization scripts in a separate directory, /etc/rc.d, and its subdirectories, as we see in the following example:

$ ls -l /etc/rc.d
total 18
drwxr-xr-x     2 root root 1024 May 13 12:24 init.d
-rwxr-xr-x     1 root root 1871 Oct 15 1998  rc
-rwxr-xr-x     1 root root  693 Oct 15 1998  rc.local
-rwxr-xr-x     1 root root 7165 Oct 15 1998  rc.sysinit
drwxr-xr-x     2 root root 1024 May 13 12:24 rc0.d
drwxr-xr-x     2 root root 1024 May 13 12:24 rc1.d
drwxr-xr-x     2 root root 1024 May 13 12:24 rc2.d
drwxr-xr-x     2 root root 1024 May 13 12:24 rc3.d
drwxr-xr-x     2 root root 1024 May 13 12:24 rc4.d
drwxr-xr-x     2 root root 1024 May 13 12:24 rc5.d
drwxr-xr-x     2 root root 1024 May 13 12:24 rc6.d

Besides the scripts rc, rc.sysinit, and rc.local, which accomplish specific tasks, the scripts needed for particular run-levels are located in the corresponding subdirectories rc0.d to rc6.d. The subdirectory init.d is a “depot” directory for all scripts, and it will be explained later. The described startup procedure is almost identical on other System V platforms; the existing differences mostly concern the naming of the initialization scripts.
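The way the rc script walks an rcN.d subdirectory can be sketched in a few lines. This is a simplified, hypothetical rendering of the mechanism, not the actual Red Hat script: kill scripts (K*) run first with the argument “stop,” then start scripts (S*) with “start,” each group in lexical order of its numeric prefix. Dummy scripts in a scratch directory stand in for real ones:

```shell
#!/bin/sh
# Simplified sketch of how an rcN.d directory is processed:
# K* scripts run with "stop" first, then S* scripts with "start",
# each group in lexical (numeric-prefix) order.
run_level_dir() {
    dir=$1
    for f in "$dir"/K*; do
        [ -x "$f" ] && "$f" stop
    done
    for f in "$dir"/S*; do
        [ -x "$f" ] && "$f" start
    done
}

# Demonstrate with dummy scripts in a scratch directory:
d=$(mktemp -d)
for s in K20oldsvc S10network S80printer; do
    printf '#!/bin/sh\necho "%s $1"\n' "$s" > "$d/$s"
    chmod +x "$d/$s"
done
run_level_dir "$d"
rm -rf "$d"
```

The numeric prefixes exist precisely so that lexical order yields the intended stop/start sequence; renumbering a link is how an administrator reorders services within a run-level.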
Here is another example:

To be continued...


UNIX administration : a comprehensive sourcebook for effective systems and network management by Bozidar Levi (2002)
ISBN 0-8493-1351-1
