Introduction to process management

Processes

A process is the activity on a system caused by a running program.

UNIX is a multitasking system, which means it has facilities for controlling and tracking multiple jobs or processes at the same time, and for ensuring they get their appropriate share of system resources such as CPU time. It is also a multi-user operating system, which means that it can simultaneously manage files and processes belonging to more than one user on the same system. Security features in the operating system prevent these processes from interfering with each other. The kernel gives the impression of keeping multiple processes active simultaneously by switching between them more rapidly than the eye can see.

Each process has certain information associated with it including:

Displaying process information: the ps(1) command

Reminder: to display the manual page for ps(1) use the command

man 1 ps

The ps command is used to obtain information about processes on the system. Options to this command include:

Some of the column headings available include:

top(1) is used to display similar information, but in a continuously updated display. ps just displays a one-off snapshot.

Signalling and terminating processes using kill(1)

kill(1) is used to send a signal to a process. Signals are identified by small integers, starting at 1, with constant names defined in signal.h . kill -l lists these numbers and names:

kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     17) SIGCHLD
(rest of output not shown)

signal(7) documents these signals and the default behaviour of programs which don't take any special measures to handle them. Programs which take special measures to handle signals do so using the system call documented in signal(2).

Sending a SIGHUP to a background server process, e.g. one of the sendmail mail routing daemons, is often used to get it to reread its configuration files. In this example, having updated the sendmail configuration, we use ps to find the PIDs of the 2 processes providing the sendmail service:

-bash-2.05b$ ps -fe | grep sendmail | grep -v grep
root 2143  12:06 sendmail: accepting connections
mail 2163  12:06 sendmail: Queue runner@01:00:00

This shows that our incoming and outgoing sendmail PIDs are 2143 and 2163. The command:

kill -s SIGHUP 2143 2163

sends the required signal to these 2 processes to make sure they use the updated configuration. kill -1 2143 2163 is a shorter equivalent using the signal number. On Linux, killall -1 sendmail sends the same signal to every process named sendmail, without the PIDs having to be looked up first.

The SIGKILL signal ( kill -9 ) is intended to provide a sure kill: unlike most signals, it can't be caught, blocked or ignored by the process receiving it. Sending it generally results in the process being terminated, provided that the sender is root (the administrator) or has the same UID as the owner of the process being killed.

Creating processes using fork(2)

The fork(2) system call is used to copy one process into two separate processes. These are referred to as the parent, which continues to execute using the same PID and PPID (Parent Process ID), and the child, to which a new PID is allocated and which takes on the PID of the parent as its own PPID. The PID of the child is returned by fork to the parent process, and fork returns 0 to the child. If no child could be created, fork returns -1 to the calling process.

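#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void){
    pid_t child_pid;
    printf("before fork\n");
    child_pid=fork();
    if(child_pid<0){ /* fork() failed: no child was created */
        perror("fork");
        return 1;
    }
    if(child_pid>0) /* fork() returned the child's PID */
        printf("I'm the parent, my child's PID is: %d\n",(int)child_pid);
    else /* fork() returned 0 */
        printf("I'm the child\n");
    printf("after fork\n"); /* executed by both processes */
    return 0;
}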

Running this program resulted in the following output:

[rich@copsewood c]$ ./fork
before fork
I'm the parent, my child's PID is: 3078
after fork
[rich@copsewood c]$ I'm the child
after fork

As we would expect, there is a single execution path through the code before the fork() and 2 execution paths afterwards. On this run, the parent finished and the shell prompted before the child finished. In this example the parent and child ran asynchronously, both sending their standard output to the same console.

Synchronous and asynchronous process execution

In some cases, for example if the child process is a server or "daemon" (a process expected to run all the time in the background to deliver services such as mail forwarding), the parent process does not wait for the child to finish. In other cases, e.g. when running an interactive command where it would be poor design for the output of parent and child to be mixed up in the same output stream, the parent process, e.g. a shell program, normally waits for the child to exit before continuing.

N.B. If you run a shell command with an ampersand as its last argument, e.g. sleep 60 & , the parent shell doesn't wait for this child process to finish.

[rich@copsewood c]$ sleep 60 &
[1] 3121
[rich@Harefield c]$ ls
fork.c     fork*     getenv.c   PROJECT.C
[rich@Harefield c]$
[1]+  Done                    sleep 60

In this example the shell job number 1 was allocated to the sleep process. Bash has various builtin commands which are used to control background jobs, using their job number as an alternative identifier to their process ID.

wait1.c: making the parent process wait for the child

The parent can be synchronised with the child using the wait and waitpid system calls described in wait(2). In wait1.c, the parent waits for the child to finish before reporting the PID of the child. (Use of a NULL status address tells wait(2) not to bother recording child status data.)

/* wait1.c */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void){
    pid_t child_pid;
    int *status=NULL; /* NULL: child status is not recorded */
    if(fork()){
        /* wait for child, getting PID */
        child_pid=wait(status);
        printf("I'm the parent.\n");
        printf("My child's PID was: %d\n",(int)child_pid);
    } else {
        printf("I'm the child.\n");
    }
    return 0;
}

Output of wait1.c:

I'm the child.
I'm the parent.
My child's PID was: 3248

wait2.c: getting information about child status

In wait2.c more information about the child is reported to the parent by supplying a non-NULL status address. The status information is then interpreted using macro functions provided by the sys/wait.h header, see wait(2) for details. In this example the child reads an integer interactively from its standard input and uses it as its return code, and this value is communicated to the parent process. (Only the low 8 bits of the return code are passed back, so the value entered should be in the range 0 to 255.) Whether the child returned normally, and the child's return code, are obtained by the parent using the WIFEXITED and WEXITSTATUS macro functions.

/* wait2.c */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void){
    int retcode=0; /* 0 returned by parent */
    int status,child_rc; /* child status and return code */
    pid_t child_pid;
    if(fork()){
        /* wait for child, getting status and PID */
        child_pid=wait(&status);
        if(WIFEXITED(status)){ /* child returned or called exit() */
          child_rc=WEXITSTATUS(status);
          printf("I'm the parent.\n");
          printf("My child's PID was: %d\n",(int)child_pid);
          printf("My child's return code was: %d\n",child_rc);
        } else {
          printf("My child didn't exit normally\n");
        }
    } else {
        printf("I'm the child.\n");
        printf("enter my return code:\n");
        scanf("%d",&retcode); /* retcode stays 0 if no number is read */
    }
    return retcode;
}

Output of wait2.c

I'm the child.
enter my return code:
4
I'm the parent.
My child's PID was: 3252
My child's return code was: 4

Letting the child run its own program

So far, the parent and child processes have both run the same program, taking different branches through the same source code. The exec family of system calls and functions is used to replace the program image of the child with another program image, which inherits attributes of the child process such as its PID, environment and working directory. In Linux, this behaviour is provided by the execve(2) system call, for which the exec(3) family of library functions (execl, execlp, execle, execv, execvp) act as front ends. The system call and front-end functions all do the same job. They differ in whether the directories in the PATH environment variable are searched for the program to be executed, whether the program arguments are presented as separate string pointers or as an array of these, and whether a replacement environment can be supplied.

Function calls to members of the exec() family never return if the call is successful. The child process, having become a different program, terminates when this program finishes.

In the example below, execl() is used to execute the external form of the ls -l /etc command. The program path and its argument list items are provided as string pointers to '\0' terminated strings. (Double quoted strings in 'C' always end with a '\0' .) The final argument to execl() is the NULL pointer, which marks the end of the argument list. Why specify the program name /bin/ls twice? The first occurrence is the path of the program file to be executed, and the second becomes argv[0] of the new program. By convention a program is given the path by which it was executed as its first parameter, argv[0], and other arguments are supplied through argv[1], argv[2] etc.

/* exec1.c */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void){
    pid_t child_pid;
    int *status=NULL;
    if(fork()){
        child_pid=wait(status);
        printf("I'm the parent.\n");
        printf("My child's PID was: %d\n",(int)child_pid);
    } else { /* child becomes the /bin/ls program */
       execl("/bin/ls","/bin/ls","-l","/etc",(char *)NULL);
       /* only reached if execl() could not start /bin/ls */
       printf("call to execl must have failed\n");
       return 1;
    }
    return 0;
}

Output of exec1.c

[rich@Harefield c]$ ./exec1
total 1992
drwxr-xr-x    3 root     root         4096 Mar 28  2003 acpi
(long listing of contents of /etc cut)
-rw-r--r--    1 root     root         1841 Feb 26  2003 zshrc
I'm the parent.
My child's PID was: 3406
[rich@copsewood c]$