
- Contents
- Preface to second edition
- 1 Introduction
- 1.2 Applying technology in an environment
- 1.3 The human role in systems
- 1.4 Ethical issues
- 1.7 Common practice and good practice
- 1.8 Bugs and emergent phenomena
- 1.10 Knowledge is a jigsaw puzzle
- 1.11 To the student
- 1.12 Some road-maps
- 2 System components
- 2.2 Handling hardware
- 2.3 Operating systems
- 2.4 Filesystems
- 2.5 Processes and job control
- 2.6 Networks
- 2.7 IPv4 networks
- 2.8 Address space in IPv4
- 2.9 IPv6 networks
- 3 Networked communities
- 3.1 Communities and enterprises
- 3.2 Policy blueprints
- 3.4 User behavior: socio-anthropology
- 3.5 Clients, servers and delegation
- 3.6 Host identities and name services
- 3.8 Local network orientation and analysis
- 4 Host management
- 4.1 Global view, local action
- 4.2 Physical considerations of server room
- 4.3 Computer startup and shutdown
- 4.5 Installing a Unix disk
- 4.6 Installation of the operating system
- 4.7 Software installation
- 4.8 Kernel customization
- 5 User management
- 5.1 Issues
- 5.2 User registration
- 5.3 Account policy
- 5.4 Login environment
- 5.5 User support services
- 5.6 Controlling user resources
- 5.7 Online user services
- 5.9 Ethical conduct of administrators and users
- 5.10 Computer usage policy
- 6 Models of network and system administration
- 6.5 Creating infrastructure
- 6.7 Competition, immunity and convergence
- 6.8 Policy and configuration automation
- 7.2 Methods: controlling causes and symptoms
- 7.4 Declarative languages
- 7.6 Common assumptions: clock synchronization
- 7.7 Human–computer job scheduling
- 7.9 Preventative host maintenance
- 7.10 SNMP tools
- 7.11 Cfengine
- 8 Diagnostics, fault and change management
- 8.1 Fault tolerance and propagation
- 8.2 Networks and small worlds
- 8.3 Causality and dependency
- 8.4 Defining the system
- 8.5 Faults
- 8.6 Cause trees
- 8.7 Probabilistic fault trees
- 8.9 Game-theoretical strategy selection
- 8.10 Monitoring
- 8.12 Principles of quality assurance
- 9 Application-level services
- 9.1 Application-level services
- 9.2 Proxies and agents
- 9.3 Installing a new service
- 9.4 Summoning daemons
- 9.5 Setting up the DNS nameservice
- 9.7 E-mail configuration
- 9.8 OpenLDAP directory service
- 9.10 Samba
- 9.11 The printer service
- 9.12 Java web and enterprise services
- 10 Network-level services
- 10.1 The Internet
- 10.2 A recap of networking concepts
- 10.3 Getting traffic to its destination
- 10.4 Alternative network transport technologies
- 10.5 Alternative network connection technologies
- 10.6 IP routing and forwarding
- 10.7 Multi-Protocol Label Switching (MPLS)
- 10.8 Quality of Service
- 10.9 Competition or cooperation for service?
- 10.10 Service Level Agreements
- 11 Principles of security
- 11.1 Four independent issues
- 11.2 Physical security
- 11.3 Trust relationships
- 11.7 Preventing and minimizing failure modes
- 12 Security implementation
- 12.2 The recovery plan
- 12.3 Data integrity and protection
- 12.5 Analyzing network security
- 12.6 VPNs: secure shell and FreeS/WAN
- 12.7 Role-based security and capabilities
- 12.8 WWW security
- 12.9 IPSec – secure IP
- 12.10 Ordered access control and policy conflicts
- 12.11 IP filtering for firewalls
- 12.12 Firewalls
- 12.13 Intrusion detection and forensics
- 13 Analytical system administration
- 13.1 Science vs technology
- 13.2 Studying complex systems
- 13.3 The purpose of observation
- 13.5 Evaluating a hierarchical system
- 13.6 Deterministic and stochastic behavior
- 13.7 Observational errors
- 13.8 Strategic analyses
- 13.9 Summary
- 14 Summary and outlook
- 14.3 Pervasive computing
- B.1 Make
- B.2 Perl
- Bibliography
- Index

Chapter 3
Networked communities
System administration is not just about machines and individuals; it is about communities. There is the local community of users on multi-user machines; then there is the local area network community of machines at a site. Finally, there is the global community of all machines and networks in the world.
We cannot learn anything about a community of networked computer systems without knowing where all the machines are, both physically and in the network, what their purposes are, and how they relate to one another. Normally we do not start out by building a network of computers from nothing; rather, we inherit an existing network, serviceable or not. The first step is thus to acquaint ourselves with the system at hand.
The aim of this chapter is to learn how to navigate network systems using standard tools, and place each piece of the puzzle into the context of the whole.
3.1 Communities and enterprises
The basic principle of communities is:
Principle 5 (Communities). What one member of a cooperative community does affects every other member and vice versa. Each member of the community therefore has a responsibility to consider the well-being of the other members of the community.
When this principle is ignored, it leads to conflict. One attempts to preserve the stability of a community by making rules, laws or policies. The main difference between these is only our opinion of their severity: rules and laws do not exist because there are fundamental rights and wrongs, they exist because there is a need to summarize the consensus of opinion in a community group. A social rule thus has two purposes:
• To provide a widely accepted set of conventions that simplify decisions, avoiding the need to think through things from first principles every time.
• To document the will of the community for reference.

Rules can never cover every eventuality. They are convenient approximations to reality that summarize common situations. An idealist might hope that rules would never be used as a substitute for thought; however, this is just how they are used in practice. Rules simplify the judgment process for common usage, avoiding constant re-evaluation (and perhaps constant change).
We can apply this central axiom to the user community of a multiuser host:
Corollary to principle (Multiuser communities). A multiuser computer system does not belong to any one user. All users must share the common resources of the system. What one user does affects all other users and vice versa. Each user has a responsibility to consider the effect of his/her actions on all the other users.
and also for the world-wide network community:
Corollary to principle (Network communities). A computer that is plugged into the network is no longer just our own. It is part of a society of machines which shares resources and communicates with the whole. What that machine does affects other machines. What other machines do affects that machine.
The ethical issues associated with connection to the network are not trivial, just as it is not trivial to be a user in a multiuser system, or a member of a civil community. Administrators are, in practice, responsible to the rest of the Internet for their organization’s conduct, a responsibility they discharge by ensuring conformance with policy. This great responsibility should be borne wisely.
3.2 Policy blueprints
By placing a human–computer system into an environment that it has no direct control over, we open it up to many risks and random influences. If we hope to maintain a predictable system, we need a way to relate to and make sense of these external factors. Moreover, we need to know how to recognize such a system when we see it: what should it look like and how should it behave? The tool for accomplishing this is policy.
Definition 2 (Policy). Policy is a statement of aims and wishes that is codified, as far as possible, into a formal blueprint for infrastructure and a schema of responses (contingencies) for possible events.
A policy’s aim is to maintain order in the face of the chaos that might be unleashed upon it – either from the environment that it does not control, or from a lack of control of its own component parts. Any system can spiral out of control if it is not held in check. By translating hopes and wishes into concrete rules and regimens, we build a model of what a predictable system should look like, comprising:
• A blueprint of infrastructure.
• Production targets (resource availability).
• Restriction of behavior (limiting authority, access).
• Stimulus–response checklists (maintenance).
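Definition 2 can be made concrete by writing the blueprint and the contingency schema down as data. The sketch below is a hypothetical illustration in Python; every name and response in it is invented for the example (a real tool such as cfengine, discussed later in the book, does this declaratively and properly).

```python
# A policy codified as data: a blueprint of the desired state, and a
# stimulus-response schema for events. All names here are invented
# for illustration; they do not belong to any real tool.

blueprint = {
    "tmp_max_age_days": 7,                 # restriction of behavior
    "home_quota_mb": 500,                  # production target
    "required_daemons": ["sshd", "cron"],  # blueprint of infrastructure
}

responses = {
    "disk_usage_high": "tidy /tmp of files older than tmp_max_age_days",
    "daemon_missing": "restart the daemon and log the incident",
    "quota_exceeded": "warn the user by mail",
}

def react(event):
    """Return the policy's prescribed response to an observed event."""
    return responses.get(event, "no contingency defined: escalate to a human")

print(react("daemon_missing"))   # restart the daemon and log the incident
print(react("power_failure"))    # no contingency defined: escalate to a human
```

Note that the schema deliberately includes a fall-back: a policy limits unpredictability, but cannot anticipate every event.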

A policy determines only an approximate state for a human–computer system – not a state in the sense of a static or frozen configuration, but rather a dynamical equilibrium or point of balance. Human–computer systems are not deterministic, but the aim of policy is to limit the unpredictable part of their behavior to the level of background noise.
3.3 System uniformity
The opportunity to standardize parts of a system is an enticing prospect that can potentially lead to great simplification; but that is not the full story. Given the chance to choose the hardware and software at a site, one can choose a balance between two extreme strategies: to standardize as far as possible, or to vary as much as possible. Curiously, it is not necessarily true that standardization will always increase predictability. That would be true in a static system – but in a real-life, dynamical system we have to live with the background noise caused by the parts that we do not control.
Strategically, trying ‘every which way’, i.e. every possible variation on a theme, can pay off in terms of productivity. Moreover, a varied system is less vulnerable to a single type of failure. Thus, if we look at the predictability of productivity, a certain level of variation can be an advantage. However, we must find an appropriate balance between these two principles:
Principle 6 (Uniformity). A uniform configuration minimizes the number of differences and exceptions one has to take into account later, and increases the static predictability of the system. This applies to hardware and software alike.
Principle 7 (Variety). A variety of configurations avoids ‘putting all eggs in one basket’. If some components are poor, then at least not all will be poor. A strategy of variation is a way of minimizing possible loss.
It is wise for system administrators to spend time picking out reliable hardware and software. The more different kinds of system we have, the more difficult the problem of installing and maintaining them, but if we are uncertain of what is best, we might choose to apply a random sample in order to average out a potential loss. Ideally perhaps, one should spend a little time researching the previous experiences of others in order to find a ‘best choice’ and then standardize to a large degree.
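A toy calculation makes the ‘eggs in one basket’ argument of Principle 7 concrete. The probability figures below are invented for illustration, not measurements:

```python
# Uniformity versus variety as an expected-loss calculation.
# The figures are invented for illustration only.
p = 0.1      # chance that a given supplier's component type turns out poor
n = 100      # hosts at the site
k = 4        # independent suppliers under the varied strategy

# The expected number of poor hosts is the same under both strategies:
expected_uniform = p * n                 # one supplier for everything
expected_varied = k * (p * (n / k))      # n/k hosts from each of k suppliers

# But the chance of the worst case -- every host poor -- differs greatly:
all_poor_uniform = p                     # one bad choice dooms the lot
all_poor_varied = p ** k                 # all k choices must be bad at once

print(expected_uniform, expected_varied)   # both come to about 10 hosts
print(all_poor_uniform, all_poor_varied)   # 0.1 versus roughly one in ten thousand
```

The average loss is identical; what variation buys is a much smaller probability of total loss – exactly the sense in which Principle 7 ‘minimizes possible loss’.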
PC hardware is often a melange of random parts from different manufacturers. Much work can be saved by standardizing graphics and network interfaces, disk sizes, mice and any other devices that need to be configured. This means not only that hosts will be easier to configure and maintain, but also that it will be easier to buy extra parts or cannibalize systems for parts later. On the other hand, automated agents like cfengine can make maintaining a variety of options manageable.
With software, the same principle applies: a uniform software base is easier to install and maintain than one in which special software needs to be configured
in special ways. Fewer methods are available for handling the differences between systems; most administration practices are based on standardization. However, dependence on one software package could be risky for an organization. There is clearly a complex discussion around these issues.
3.4 User behavior: socio-anthropology
Most branches of computer science deal primarily with software systems and algorithms. System administration is made more difficult by the fact that it deals with communities and is therefore strongly affected by what human beings do. In short, a large part of system administration can be characterized as sociology or anthropology.
A newly installed machine does not usually require attention until it is first used, but as soon as a user starts running programs and storing data, the reliability and efficiency of the system are tested. This is where the challenge of system administration lies.
The load on computers and on networks is a social phenomenon: it peaks in response to patterns of human behavior. For example, at universities and colleges network traffic usually peaks during lunch breaks, when students rush to the terminal rooms to surf on the web or to read E-mail. In industry the reverse can be true, as workers flee the slavery of their computers for a breath of fresh air (or polluted air). In order to understand the behavior of the network, the load placed on servers and the availability of resources, we have to take into account the users’ patterns of behavior (see figure 3.1).
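A profile like the one in figure 3.1 can be computed by folding timestamped events onto a 168-hour weekly cycle. The sketch below is a hypothetical illustration: the timestamps are invented, and in practice they would be parsed from mail or web server logs.

```python
# Folding timestamped events onto a weekly cycle, as in figure 3.1.
# The sample timestamps are invented for illustration.
from collections import defaultdict
from datetime import datetime

timestamps = [
    datetime(2003, 9, 1, 12, 5),    # a Monday, during the lunch break
    datetime(2003, 9, 1, 12, 40),   # same Monday lunch hour
    datetime(2003, 9, 3, 11, 10),   # midweek
    datetime(2003, 9, 7, 15, 0),    # a quiet Sunday
]

def weekly_profile(stamps):
    """Count events per hour-of-week (0 = Monday 00:00, 167 = Sunday 23:00)."""
    counts = defaultdict(int)
    for t in stamps:
        counts[t.weekday() * 24 + t.hour] += 1
    return counts

profile = weekly_profile(timestamps)
print(profile[12])    # two messages fell in the Monday 12:00 bucket
```

Averaging such counts over many weeks smooths out the noise and exposes the social rhythm: daily working-hour peaks, a midweek maximum, and quiet weekends.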
3.5 Clients, servers and delegation
At the heart of all cooperation in a community is a system of centralization and delegation. No program or entity can do everything alone, nor should it be expected to. It makes sense for certain groups to specialize in performing certain jobs. That is the function of a society and good management.
Principle 8 (Delegation I). Leave experts to do their jobs. Assigning responsibility for a task to a body which specializes in that task is a more efficient use of resources.
If we need to find out telephone numbers, we invent the directory enquiry service: we give a special body a specific job. They do the phone-number research (once and for everyone) and have the responsibility for dealing out the information on request. If we need a medical service, we train doctors in the specialized knowledge and trust them with the responsibility. That is much more efficient than expecting every individual to have to research phone numbers by themselves, or to study medicine personally. The advantage with a service is that one avoids repeating work unnecessarily and one creates special agents with an aptitude for their task. In database theory, this process is called normalization of the system.
The principle of specialization also applies in system administration. Indeed, in recent years the number of client-server systems has grown enormously, because of the possibilities offered by networking. Not only can we give a special daemon on one host a special job, but we can let that daemon do the job for every other host on the network as well. As long as the load placed on the network does not lead to a bottleneck, this is a very efficient centralization of resources. Clearly, the client-server model is an extended way of sharing resources. In that sense it is like a distributed generalization of the kernel.1

Figure 3.1: E-mail traffic at Oslo College measured over the course of many weeks. The plot shows the weekly average from Monday to Sunday (hours 0–168). Over each 24-hour period there is a daily peak showing users’ working hours, and during the week there is a peak around midweek, with little activity during the weekends.
The client-server nomenclature has been confused by history. A server is not a host, but a program or process which runs on a host. A client is any process which requires the services of a server. In Unix-like systems, servers are called daemons. In Windows they are just called services. Unfortunately, it is common to refer to the host on which a server process runs as being a server. This causes all sorts of confusion.
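The distinction is easy to demonstrate. In the sketch below – a hypothetical illustration, not any real daemon – the server and the client are simply two processes (threads standing in for them here) running on the same host, communicating over a local socket:

```python
# Minimal illustration that a 'server' is a process, not a host:
# server and client both run on this one machine. Hypothetical
# example for illustration only.
import socket
import threading

def serve_once(listener):
    """Accept one connection and perform the 'service'."""
    conn, _ = listener.accept()
    request = conn.recv(1024)
    conn.sendall(b"hello, " + request)
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))    # any free port on the local host
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=serve_once, args=(listener,))
server.start()

client = socket.socket()
client.connect(("127.0.0.1", port))
client.sendall(b"client")
reply = client.recv(1024).decode()
client.close()
server.join()
listener.close()

print(reply)    # hello, client
```

Nothing about the server role requires a dedicated machine; calling a host ‘the server’ is shorthand for ‘the host on which the server process happens to run’.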
The name ‘server’ was usurped, early on, for a very specific client-server relationship. A server is often regarded as a large machine which performs some difficult and intensive task for the clients (an array of workstations). This prejudice comes from the early days when many PC-workstations were chained together in a network to a single PC which acted as file server and printer server, sharing a disk and printer with all of the machines. The reason for this architecture, at the time, was that the operating system of that epoch, MS-DOS, was not capable of
1In reality, there are many levels at which the client-server model applies. For example, many system calls can be regarded as client-server interactions, where the client is any program and the server is the kernel.