4.4 Lab 4: Multiprogrammed Kernel

Lab 4: Multiprogrammed Kernel

Up until now, all of your test programs have executed entirely within the Nachos kernel. In a real operating system, the kernel not only uses its procedures internally, but also executes user programs and allows them to invoke kernel routines via system calls . An executing user program is a process . In this lab you will modify Nachos to support multiple processes, using system calls to request services from the kernel.

Since your kernel does not trust user programs to execute safely, the kernel and the (simulated) hardware will work together to protect the system from damage by malicious or buggy user programs. To this end, you will implement simple versions of key mechanisms found in real operating system kernels: virtual addressing, protected system calls and kernel exception handling, and preemptive timeslicing. Virtual addressing prevents user processes from accessing kernel data structures or the memory of other programs; your kernel will use process page tables to safely allow multiple processes to reside in memory at the same time. With protected system calls and exceptions, all entries into the kernel funnel through a single kernel routine, ExceptionHandler ; you will ``bullet-proof'' this routine so that buggy or malicious user programs cannot cause the kernel to crash or behave inappropriately. Finally, your kernel will use preemptive scheduling to share the simulated CPU fairly among the active user processes, so that no process can take over the system. All of these protection mechanisms require cooperation between the hardware and the operating system kernel software. Your implementation will be based on "hardware" support in the Nachos MIPS simulator, which resembles a real MIPS processor.

The key to Lab 4 is to implement the system calls for process management: the Exec , Exit and Join system calls. New processes are created with Exec : once running as a process, a user program can invoke the Exec system call to create new "child" processes executing other user programs -- or more instantiations of the same program. When the program is finished, it may destroy the containing process by calling Exit . A parent process may call Join to wait for a child process to complete.

If all processes are created by other processes, then who creates the first user process? The operating system kernel creates this process itself as part of its initialization sequence. This is bootstrapping . You can "boot" the Nachos kernel by running nachos with the -x option ( x for "execute"), giving the name of an initial program to run as the initial process. The Nachos release implements the -x option by calling StartProcess in progtest.c to handcraft the initial process and execute the initial program within it. The initial process may then create other processes, which may create other processes...and so on.

How to Get Started

This assignment is difficult, but most of the basic infrastructure is already in place. In particular: (1) the thread system and timer device already support preemptive timeslicing of multiple threads, (2) the thread context switch code already saves and restores MIPS machine registers and the process page table, and (3) the Nachos distribution ( StartProcess ) includes skeletal code to set up a new user process context, load it from an executable file, and start a thread running in it. As with all the programming assignments this semester, you can complete Lab 4 with a few hundred lines of code. The difficult part is figuring out how to build on the Nachos code you already have -- and debugging the result if you don't get it right the first time.

Most of the new files to look at are in the nachos/code/userprog directory. Starting with this lab, you will build your nachos executables in the userprog subdirectory instead of in threads : be sure that you build and run the "right" nachos . Most of the heavy lifting for Labs 4 and 5 is rooted in userprog/exception.cc and addrspace.cc . The Nachos system call interface is defined in Nachos System Call Interface and in userprog/syscall.h . The StartProcess code in progtest.cc is useful as a starting point for implementing Exec . Also, be sure to read The Nachos MIPS Simulator and the header file machine/machine.h that defines your kernel's interface to the simulated machine.

First, you will need basic facilities to load processes into the memory of the simulated machine. Spend a few minutes studying the AddrSpace class, and look at how the StartProcess procedure uses the AddrSpace class methods to create a new process, initialize its memory from an executable file, and start the calling thread running user code in the new process context. The current code for AddrSpace and StartProcess works OK, but it assumes that there is only one program/process running at a time (started with StartProcess from main via the nachos -x option), and that all of the machine's memory is allocated to that process. Your first job is to generalize this code to implement the Exec system call for the general case in which multiple processes are active simultaneously:

Implement a memory manager module to allow your kernel to allocate page frames of the simulated machine's memory for specific processes, and to keep track of which frames are free and which are in use. You may find your Table class from Lab 3 useful here.
Modify AddrSpace to allow multiple processes to be resident in the machine memory at the same time. The default AddrSpace constructor code assumes that all of the machine memory is free, and it loads the new process contiguously starting at page frame 0. You must modify this scheme to use your memory manager to allocate page frames for the new process, and load the process code and data into those allocated page frames, which are not typically contiguous. For now it is acceptable to fail and return an error (0) from Exec if there is not enough free machine memory to load the executable file. Note : to cleanly handle these failures, you will need to move the AddrSpace loading code out of the constructor and into a new AddrSpace method that can return an error. It is poor programming practice to put code that can fail into a class constructor, as the Nachos designers have done in this release.
If necessary, update StartProcess to use your new AddrSpace interface so that you do not break the nachos -x option.
Modify AddrSpace to call the memory manager to release the pages allocated to a process when the process is destroyed. Make sure your AddrSpace code also releases any frames allocated to the process in the case where it discovers that it does not have enough memory to load the entire process.

Next, use these new facilities to implement the Exec and Exit system calls as defined in Nachos System Call Interface and in userprog/syscall.h . If an executing user processes requests a system call (by executing a trap instruction) the machine will transfer control to your kernel by calling ExceptionHandler in exception.cc . Your kernel code must extract the system call identifier and the arguments from the machine registers, decode them, and call internal procedures that implement the system call. Here are some issues to attend to:

When an Exec call returns, your kernel should have created a new process and started a new thread executing within it to run the specified program. However, you do not need to concern yourself with setting up OpenFileIds until the next assignment. For now, you will be able to run user programs, but they will not be able to read any input or write any output.
For Exec , you must copy the filename argument from user memory into kernel memory safely, so that a malicious or buggy user process cannot crash your kernel or violate security. The filename string address ( char* ) passed into the kernel as an argument is a process virtual address; in order for the kernel to access the filename it must locate the characters in the kernel address space (i.e., in the machine's physical "main memory" array) by examining the page table for the process. In particular, your kernel must handle the case where the filename string crosses user page boundaries and resides in noncontiguous physical memory. You must also detect an illegal string address or a string that runs off the end of the user's address space without a terminating null character. Your kernel should handle these cases by returning an error ( SpaceId 0) from the Exec system call.

You may impose a reasonable limit on the maximum size of a file name. Also, use of Machine:ReadMem and Machine:WriteMem is not forbidden as the comment in machine.h implies.

Exec must return a unique process identifier ( SpaceId ), which can be used as an argument to Join , as discussed in Process Management . Your kernel will need to keep a table of active processes. You may find your Table class from Lab 3 useful here.
Handling of the executable file header and sections is a common source of errors. An additional sample code fragment can be found in the aux subdirectory of the Duke Nachos repository. This code is not guaranteed to work for you, but looking at it may help you to avoid some common errors which can be quite difficult to debug.
You may find it convenient to implement process exit as an internal kernel procedure called by the Exit system call handler, rather than calling the lower-level procedures directly from ExceptionHandler . This will make it easier to kill a process from inside the kernel (e.g., if the process has some kind of fatal error), by calling the internal exit primitive from another kernel procedure (e.g., ExceptionHandler ) in the target process context. In general, this kind of careful internal decomposition will save you from reinventing and redebugging wheels, and it is always good practice.

Next, implement the Join system call and other aspects of process management, extending your implementation of Exec and Exit as necessary.

Be sure you handle the argument and result of the Join system call correctly. The kernel Join primitive must validate any SpaceId passed to it by a user process. It must also validate that the calling process has privilege to Join ; in this case, the caller must be the parent of the target process. Finally, your Join implementation must correctly return the exit status code of the target process.
To implement Join correctly and efficiently, you will need to keep a list of all children of each process. This list should be maintained in temporal order, so that you can always determine the most recently created child process. This will be necessary when you implement pipes in Lab 5.
Synchronization between Join and Exit is tricky. Be sure you handle the case where the joinee exits before the joiner executes the Join . Your kernel should also clean up any unneeded process state if Join is never called on some process.
Try to devise the simplest possible synchronization scheme for the code and data structures that manage process relationships and Exit/Join , even if your scheme is inefficient. One possibility might be to use broadcasts on a single condition variable shared by all processes in the system.

Last but not least, it is time to complete the bullet-proofing of your kernel. Implement the Nachos kernel code to handle user program exceptions that are not system calls. The machine (or simulator) raises an exception whenever it is unable to execute the next user instruction, e.g., because of an attempt to reference an illegal address, a privileged or illegal instruction or operand, or an arithmetic underflow or overflow condition. The kernel's role is to handle these exceptions in a reasonable way, i.e., by printing an error message and killing the process rather than crashing the whole system. Note : an ASSERT that crashes Nachos is a reasonable response to a bug within your Nachos kernel, but it not an acceptable response to a user program exception.

Extra credit . Implement the Yield and Fork system calls to allow Nachos user programs to use multiple threads, as defined in Threads .

You should test your code by exercising the new system calls from user processes. To test your kernel, you will create some simple user programs as described in Creating Test Programs for Nachos Kernels . Your Nachos kernel will execute user programs on a simulated MIPS machine as discussed in The Nachos MIPS Simulator .

Write a test program(s) to create a tree of processes M levels deep by calling Exec N times from each parent process, and join on all children before exiting. Since Exec as defined provides no way to pass arguments into the new process, you may hard-code M and N into your test programs as constants, and you may use multiple versions of the programs with different constants encoded within them.
Create a modified version of the tree test program in which each parent process exits without joining on its children. The purpose of this program is to (1) test with larger numbers of processes, and (2) test your kernel's code for cleaning up process state in the no-join case.
Since your kernel allows multiple processes to run at once, you have been careful to employ synchronization where needed inside your kernel. Run your tree test programs with timeslicing enabled (using the Nachos -rs option) to increase the likelihood of exposing any synchronization flaws.