AFS Server Processes and the Cache Manager

As mentioned in Servers and Clients, AFS file server machines run a number of processes, each with a specialized function. One of the main responsibilities of a system administrator is to make sure that processes are running correctly as much of the time as possible, using the administrative services that the server processes provide.

The following list briefly describes the function of each server process and the Cache Manager; the following sections then discuss the important features in more detail.

The File Server, the most fundamental of the servers, delivers data files from the file server machine to local workstations as requested, and stores the files again when the user saves any changes to the files.

The Basic OverSeer Server (BOS Server) ensures that the other server processes on its server machine are running correctly as much of the time as possible, since a server is useful only if it is available. The BOS Server relieves system administrators of much of the responsibility for overseeing system operations.

The third-party Kerberos Server replaces the old Authentication Server and helps ensure that communications on the network are secure. It verifies user identities at login and provides the facilities through which participants in transactions prove their identities to one another (mutually authenticate).

The Protection Server helps users control who has access to their files and directories. Users can grant access to several other users at once by putting them all in a group entry in the Protection Database maintained by the Protection Server.

The Volume Server performs all types of volume manipulation. It helps the administrator move volumes from one server machine to another to balance the workload among the various machines.

The Volume Location Server (VL Server) maintains the Volume Location Database (VLDB), in which it records the location of volumes as they move from file server machine to file server machine. This service is the key to transparent file access for users.

The Update Server distributes new versions of AFS server process software and configuration information to all file server machines. It is crucial to stable system performance that all server machines run the same software.

The Backup Server maintains the Backup Database, in which it stores information related to the Backup System. It enables the administrator to back up data from volumes to tape. The data can then be restored from tape in the event that it is lost from the file system.

The Salvager is not a server in the sense that others are. It runs only after the File Server or Volume Server fails; it repairs any inconsistencies caused by the failure. The system administrator can invoke it directly if necessary.

The Network Time Protocol Daemon (NTPD) is not an AFS server process per se, but plays a vital role nonetheless. It synchronizes the internal clock on a file server machine with those on other machines. Synchronized clocks are particularly important for correct functioning of the AFS distributed database technology (known as Ubik); see Configuring the Cell for Proper Ubik Operation. The NTPD is usually provided with the operating system.

The Cache Manager is the one component in this list that resides on AFS client machines rather than on file server machines. It is not a process per se, but rather a part of the kernel on AFS client machines that communicates with AFS server processes. Its main responsibilities are to retrieve files for application programs running on the client and to maintain the files in the cache.

The File Server

The File Server is the most fundamental of the AFS server processes and runs on each file server machine. It provides the same services across the network that the UNIX file system provides on the local disk:

  • Delivering programs and data files to client workstations as requested and storing them again when the client workstation finishes with them.

  • Maintaining the hierarchical directory structure that users create to organize their files.

  • Handling requests for copying, moving, creating, and deleting files and directories.

  • Keeping track of status information about each file and directory (including its size and latest modification time).

  • Making sure that users are authorized to perform the actions they request on particular files or directories.

  • Creating symbolic and hard links between files.

  • Granting advisory locks (corresponding to UNIX locks) on request.

The Basic OverSeer Server

The Basic OverSeer Server (BOS Server) reduces the demands on system administrators by constantly monitoring the processes running on its file server machine. It can restart failed processes automatically and provides a convenient interface for administrative tasks.

The BOS Server runs on every file server machine. Its primary function is to minimize system outages. It also:

  • Constantly monitors the other server processes (on the local machine) to make sure they are running correctly.

  • Automatically restarts failed processes, without contacting a human operator. When restarting multiple server processes simultaneously, the BOS Server takes interdependencies into account and initiates restarts in the correct order.

  • Accepts requests from the system administrator. Common reasons to contact the BOS Server are to verify the status of server processes on file server machines, install and start new processes, stop processes either temporarily or permanently, and restart dead processes manually.

  • Helps system administrators to manage system configuration information. The BOS Server automates the process of adding and changing server encryption keys, which are important in mutual authentication. The BOS Server also provides a simple interface for modifying two files that contain information about privileged users and certain special file server machines. For more details about these configuration files, see Common Configuration Files in the /usr/afs/etc Directory.
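
The dependency-aware restart ordering mentioned in the list above can be pictured as a topological sort. The following Python sketch is illustrative only: the dependency map, the choice of process names, and the restart_order function are assumptions made for this example and do not describe the BOS Server's actual configuration or implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each process lists the processes that
# must already be running before it is restarted. These edges are
# assumptions for illustration, not real BOS Server configuration.
DEPENDS_ON = {
    "vlserver": set(),
    "ptserver": set(),
    "fileserver": {"vlserver", "ptserver"},
    "volserver": {"fileserver"},
}


def restart_order(failed, depends_on=DEPENDS_ON):
    """Return the failed processes in an order that respects dependencies."""
    sorter = TopologicalSorter(depends_on)
    # static_order() yields each process only after all of its prerequisites.
    return [name for name in sorter.static_order() if name in failed]
```

Under these assumed dependencies, restart_order({"volserver", "fileserver", "vlserver"}) places vlserver before fileserver and fileserver before volserver.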

The Kerberos Server

The Kerberos Server performs two main functions related to network security:

  • Verifying the identity of users as they log into the system by requiring that they provide a password. The Kerberos Server grants the user a ticket, which is converted into a token to prove to AFS server processes that the user has authenticated. For more on tokens, see Complex Mutual Authentication.

  • Providing the means through which server and client processes prove their identities to each other (mutually authenticate). This helps to create a secure environment in which to send cross-network messages.

The Kerberos Server is a required service, provided by third-party Kerberos software that supports version 5 of the Kerberos protocol. Kerberos server software is included with some operating systems or can be acquired separately. MIT Kerberos, Heimdal, and Microsoft Active Directory are known to work with OpenAFS as a Kerberos Server. (Most Kerberos commands begin with the letter k.) Kerberos was originally developed by the Massachusetts Institute of Technology's Project Athena.

The Kerberos Server also maintains the Authentication Database, in which it stores user passwords converted into encryption key form as well as the AFS server encryption key. To learn more about the procedures AFS uses to verify user identity and to perform mutual authentication, see A More Detailed Look at Mutual Authentication.

Note

The Authentication Server (kaserver), which uses Kerberos 4, is obsolete and has been replaced by the Kerberos Server. All references to the Kerberos Server in this guide refer to a Kerberos 5 server.

The Protection Server

The Protection Server is the key to AFS's refinement of the normal UNIX methods for protecting files and directories from unauthorized use. The refinements include the following:

  • Defining seven access permissions rather than the standard UNIX file system's three. In conjunction with the UNIX mode bits associated with each file and directory element, AFS associates an access control list (ACL) with each directory. The ACL specifies which users have which of the seven specific permissions for the directory and all the files it contains. For a definition of AFS's seven access permissions and how users can set them on access control lists, see Managing Access Control Lists.

  • Enabling users to grant permissions to numerous individual users--a different combination to each individual if desired. UNIX protection distinguishes among only three users or groups: the owner of the file, members of a single specified group, and everyone who can access the local file system.

  • Enabling users to define their own groups of users, recorded in the Protection Database maintained by the Protection Server. The groups then appear on directories' access control lists as though they were individuals, which enables the granting of permissions to many users simultaneously.

  • Enabling system administrators to create groups containing client machine IP addresses to permit access when it originates from the specified client machines. These types of groups are useful when it is necessary to adhere to machine-based licensing restrictions.

The Protection Server's main duty is to help the File Server determine if a user is authorized to access a file in the requested manner. The Protection Server creates a list of all the groups to which the user belongs. The File Server then compares this list to the ACL associated with the file's parent directory. A user thus acquires access both as an individual and as a member of any groups.
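
The authorization check described above, in which a user acquires access both as an individual and through group membership, can be sketched in Python as follows. This is a simplified model under stated assumptions: the function names and the dictionary representation of an ACL are hypothetical, and the sketch ignores negative ACL entries. The permission letters are the usual one-letter abbreviations of the seven AFS permissions.

```python
# The seven AFS access permissions, by one-letter abbreviation:
# read, lookup, insert, delete, write, lock (k), administer.
ALL_RIGHTS = set("rlidwka")


def effective_rights(user, groups, acl):
    """Union the rights granted to the user and to each of the user's groups.

    `acl` is a hypothetical representation of a directory's ACL: a dict
    mapping a user or group name to a string of permission letters.
    """
    rights = set()
    for name in [user, *groups]:
        rights |= set(acl.get(name, ""))
    return rights & ALL_RIGHTS


def is_authorized(user, groups, acl, needed):
    """True if every permission letter in `needed` is granted."""
    return set(needed) <= effective_rights(user, groups, acl)
```

For example, with the hypothetical ACL {"pat": "rl", "staff": "rlidwk"}, the user pat can write only when the group list supplied for pat includes staff.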

The Protection Server also maps usernames (the name typed at the login prompt) to AFS user ID numbers (AFS UIDs). These UIDs are functionally equivalent to UNIX UIDs, but operate in the domain of AFS rather than in the UNIX file system on a machine's local disk. This conversion service is essential because the tokens derived from the tickets that the Kerberos Server grants to authenticated users are stamped with usernames (to comply with Kerberos standards). The AFS server processes identify users by AFS UID, not by username. Before they can understand whom the token represents, they need the Protection Server to translate the username into an AFS UID. For further discussion of tokens, see A More Detailed Look at Mutual Authentication.

The Volume Server

The Volume Server provides the interface through which you create, delete, move, and replicate volumes, as well as prepare them for archiving to tape or other media (backing up). The advantages gained by storing files in volumes are explained in Volumes. Creating and deleting volumes is necessary when adding users to and removing them from the system; volume moves are done for load balancing; and replication enables volume placement on multiple file server machines (for more on replication, see Replication).

The Volume Location (VL) Server

The VL Server maintains a complete list of volume locations in the Volume Location Database (VLDB). When the Cache Manager (see The Cache Manager) begins to fill a file request from an application program, it first contacts the VL Server in order to learn which file server machine currently houses the volume containing the file. The Cache Manager then requests the file from the File Server process running on that file server machine.

The VLDB and VL Server make it possible for AFS to take advantage of the increased system availability gained by using multiple file server machines, because the Cache Manager knows where to find a particular file. Indeed, in a certain sense the VL Server is the keystone of the entire file system--when the information in the VLDB is inaccessible, the Cache Manager cannot retrieve files, even if the File Server processes are working properly. A list of the information stored in the VLDB about each volume is provided in Volume Information in the VLDB.
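
The two-step lookup described above (location first, then the file itself) can be modeled with a short Python sketch. The data structures and names here are hypothetical stand-ins chosen for illustration; they do not reflect the real VLDB format or the RPC interfaces involved.

```python
class VolumeLookupError(Exception):
    """Raised when a volume's location cannot be determined."""


def fetch_file(volume, path, vldb, file_servers):
    """Ask the VLDB where the volume lives, then fetch from that machine.

    `vldb` maps volume name -> file server machine name; passing None
    models an inaccessible VLDB. `file_servers` maps machine name ->
    {(volume, path): data}. Both structures are assumptions.
    """
    if vldb is None or volume not in vldb:
        # Without location information the file cannot be retrieved,
        # even if every File Server process is healthy.
        raise VolumeLookupError(volume)
    machine = vldb[volume]
    return file_servers[machine][(volume, path)]
```

Note that in this model the file data is present on the file server machine throughout; only the missing location information prevents the read.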

The Update Server

The Update Server is an optional process that helps guarantee that all file server machines are running the same version of a server process. System performance can be inconsistent if some machines are running one version of the BOS Server (for example) and other machines are running another version.

To ensure that all machines run the same version of a process, install new software on a single file server machine of each system type, called the binary distribution machine for that type. The binary distribution machine runs the server portion of the Update Server, whereas all the other machines of that type run the client portion of the Update Server. The client portions check frequently with the server portion to see if they are running the right version of every process; if not, the client portion retrieves the right version from the binary distribution machine and installs it locally. The system administrator does not need to remember to install new software individually on all the file server machines: the Update Server does it automatically. For more on binary distribution machines, see Binary Distribution Machines.
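
The client-side check described above amounts to comparing a local version table against the one on the binary distribution machine and pulling whatever differs. The Python sketch below illustrates that logic only; the function names and the use of plain strings as version identifiers are assumptions, and it does not model how the Update Server actually compares or transfers files.

```python
def files_to_update(local_versions, distribution_versions):
    """Return names whose local version differs from the distribution machine's.

    Both arguments map a binary's name to a version identifier (the
    form of that identifier is an assumption of this sketch).
    """
    return sorted(
        name
        for name, version in distribution_versions.items()
        if local_versions.get(name) != version
    )


def sync(local_versions, distribution_versions):
    """Bring the local version table in line and report what changed."""
    changed = files_to_update(local_versions, distribution_versions)
    for name in changed:
        # Stands in for retrieving and installing the binary locally.
        local_versions[name] = distribution_versions[name]
    return changed
```

A second call to sync with the same arguments reports no changes, which mirrors the idea that frequent checks are cheap when the machine is already up to date.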

The Update Server also distributes configuration files that all file server machines need to store on their local disks (for a description of the contents and purpose of these files, see Common Configuration Files in the /usr/afs/etc Directory). As with server process software, the need for consistent system performance demands that all the machines have the same version of these files. The system administrator needs to make changes to these files on one machine only, the cell's system control machine, which runs a server portion of the Update Server. All other machines in the cell run a client portion that accesses the correct versions of these configuration files from the system control machine. Cells running the international edition of AFS do not use a system control machine to distribute configuration files. For more information, see The System Control Machine.

The Backup Server

The Backup Server maintains the information in the Backup Database. The Backup Server and the Backup Database enable administrators to back up data from AFS volumes to tape and restore it from tape to the file system if necessary. The server and database together are referred to as the Backup System.

Administrators initially configure the Backup System by defining sets of volumes to be dumped together and the schedule by which the sets are to be dumped. They also install the system's tape drives and define the drives' Tape Coordinators, which are the processes that control the tape drives.

Once the Backup System is configured, user and system data can be dumped from volumes to tape or disk. If data is ever lost from the system (for example, through a system or disk failure), administrators can restore it from tape. If tapes are periodically archived, or saved, data can also be restored to its state at a specific time. Additionally, because Backup System data is difficult to reproduce, the Backup Database itself can be backed up to tape and restored if it ever becomes corrupted. For more information on configuring and using the Backup System, see Configuring the AFS Backup System and Backing Up and Restoring AFS Data.

The Salvager

The Salvager differs from other AFS server processes in that it runs only at selected times. The BOS Server invokes the Salvager when the File Server, Volume Server, or both fail. The Salvager attempts to repair disk corruption that can result from a failure.

As a system administrator, you can also invoke the Salvager as necessary, even if the File Server or Volume Server has not failed. See Salvaging Volumes.

The Network Time Protocol Daemon

The Network Time Protocol Daemon (NTPD) is not an AFS server process per se, but plays an important role. It helps guarantee that all of the file server machines and client machines agree on the time. The NTPD on all file server machines learns the correct time from a parent NTPD source, which may be located inside or outside the cell.

Keeping clocks synchronized is particularly important to the correct operation of AFS's distributed database technology, which coordinates the copies of the Backup, Protection, and Volume Location Databases; see Replicating the OpenAFS Administrative Databases. Client machines may also refer to these clocks for the correct time; therefore, it is less confusing if all file server machines have the same time. For more technical detail about the NTPD, see the NTP web site or the documentation for your operating system.

Clock Skew Impact

A client machine that presents valid credentials to an OpenAFS cell can still fail to authenticate when the clocks of the client machine, the Kerberos server, and the file server machines are not in sync.
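
A clock comparison of the kind implied above can be sketched in Python. This is a minimal illustration, not the check that Kerberos or OpenAFS actually performs: the within_skew function is hypothetical, and the five-minute tolerance is an assumption (a common Kerberos 5 default; the real limit depends on how the Kerberos realm is configured).

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance: five minutes is a common Kerberos 5 default, but
# the actual limit depends on the realm's configuration.
MAX_SKEW = timedelta(minutes=5)


def within_skew(client_time, server_time, max_skew=MAX_SKEW):
    """True if the two clocks differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew
```

The comparison is symmetric, so a client clock that runs fast is treated the same as one that runs slow.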

Legacy runntp

It is no longer recommended to run runntp, the legacy NTPD process that is part of the OpenAFS suite. Instead, run the NTPD software that comes with your operating system or that is available from www.ntp.org.

The Cache Manager

As already mentioned in Caching and Callbacks, the Cache Manager is the one component in this section that resides on client machines rather than on file server machines. It is not technically a stand-alone process, but rather a set of extensions or modifications in the client machine's kernel that enable communication with the server processes running on server machines. Its main duty is to translate file requests (made by application programs on client machines) into remote procedure calls (RPCs) to the File Server. (The Cache Manager first contacts the VL Server to find out which File Server currently houses the volume that contains a requested file, as mentioned in The Volume Location (VL) Server.) When the Cache Manager receives the requested file, it caches it before passing data on to the application program.

The Cache Manager also tracks the state of files in its cache compared to the version at the File Server by storing the callbacks sent by the File Server. When the File Server breaks a callback, indicating that a file or volume changed, the Cache Manager requests a copy of the new version before providing more data to application programs.
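
The callback behavior described above can be modeled with a toy cache in Python. The class and method names are hypothetical, and the sketch tracks callbacks per file only; it is a conceptual illustration rather than a description of the Cache Manager's kernel implementation.

```python
class CallbackCache:
    """Toy model of a cache that trusts an entry only while a callback is held."""

    def __init__(self, fetch):
        self._fetch = fetch      # function: path -> data, stands in for the File Server RPC
        self._data = {}          # cached file contents
        self._callbacks = set()  # paths that currently have an unbroken callback

    def read(self, path):
        if path not in self._callbacks:
            # No valid callback: fetch a fresh copy and record the new callback.
            self._data[path] = self._fetch(path)
            self._callbacks.add(path)
        return self._data[path]

    def break_callback(self, path):
        """Called when the File Server reports that the file changed."""
        self._callbacks.discard(path)
```

In this model, repeated reads of an unchanged file are served from the cache without contacting the server, and a read that follows break_callback fetches the new version first.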