Project 1: User-level thread library

Fall 2020 notes

This lab is two one-week labs: p1t and p1s. You register once, for p1t "Thread Library", and then submit the same repo for two different deadlines. The AG tests are the same, so you will pass only a subset of the tests in your p1t submissions. The name "p1" is for historical reasons.

Code. Use your Linux docker. The code is in C++. It is wise to avoid new (C++11) features. AG support for these features is uncertain.

Spec. Read the lab spec carefully, all the way through. You must follow the spec very closely to pass the AG tests. The spec describes all requirements for p1t, p1s, and your thread library test suite to be graded with p1s.

p1t. For p1t, you are expected to pass specific AG tests for creating and scheduling threads. The functionality is similar to the Week 3 activity on user-mode context switching with swapcontext. Some of that code may be useful.

p1s. For p1s, you add support for Mesa monitors to your thread library and complete your test suite.

Problem with the AG350 repos. There is a problem with the ag350 repos and binary files. Please right-click-download thread.o and libinterrupt.a to replace the ones in your AG-assigned pool. If your software then builds correctly, the fix has worked. Go ahead and check the modified versions of the files into your repo when committing.

Overview

This project leads you through implementing your own version of the thread library.

Thread library interface

This section describes the interface to the thread library for this project.


int thread_libinit(thread_startfunc_t func, void *arg)

thread_libinit initializes the thread library. A user program should call thread_libinit exactly once (before calling any other thread functions). thread_libinit creates and runs the first thread. This first thread is initialized to call the function pointed to by func with the single argument arg. Note that a successful call to thread_libinit will not return to the calling function. Instead, control transfers to func, and the function that calls thread_libinit will never execute again.


int thread_create(thread_startfunc_t func, void *arg)

thread_create is used to create a new thread. When the newly created thread starts, it will call the function pointed to by func and pass it the single argument arg.


int thread_yield(void)

thread_yield causes the current thread to yield the CPU to the next runnable thread. It has no effect if there are no other runnable threads. thread_yield is used to test the thread library. A normal concurrent program should not depend on thread_yield; nor should a normal concurrent program produce incorrect answers if thread_yield calls are inserted arbitrarily.

int thread_lock(unsigned int lock)
int thread_unlock(unsigned int lock)
int thread_wait(unsigned int lock, unsigned int cond)
int thread_signal(unsigned int lock, unsigned int cond)
int thread_broadcast(unsigned int lock, unsigned int cond)

thread_lock, thread_unlock, thread_wait, thread_signal, and thread_broadcast implement Mesa monitors in your thread library. Mesa monitors are presented in lecture.

A lock is identified by an unsigned integer (0 - 0xffffffff). Each lock has a set of condition variables associated with it (numbered 0 - 0xffffffff), so a condition variable is identified uniquely by the tuple (lock number, cond number). Programs can use arbitrary numbers for locks and condition variables (i.e., they need not be numbered from 0 - n).
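Because lock and condition numbers are arbitrary, a lookup table keyed by the (lock number, cond number) tuple is one natural fit. The following is only a sketch under stated assumptions: the names CondKey, WaitQueue, cond_waiters, and waiters_for are hypothetical and not part of thread.h, and plain int ids stand in for whatever per-thread record your library keeps.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <utility>

// Hypothetical names for illustration only; none of these appear in thread.h.
typedef std::pair<unsigned int, unsigned int> CondKey;  // (lock, cond)
typedef std::deque<int> WaitQueue;                       // placeholder thread ids, FIFO

// In thread.cc this table would be declared static.
std::map<CondKey, WaitQueue> cond_waiters;

// Return the wait queue for (lock, cond), creating an empty one on first use.
// Any unsigned values work; they need not be numbered 0..n.
WaitQueue &waiters_for(unsigned int lock, unsigned int cond) {
    WaitQueue &q = cond_waiters[std::make_pair(lock, cond)];
    // Sanity check: the entry must exist after the lookup above.
    assert(cond_waiters.count(std::make_pair(lock, cond)) == 1);
    return q;
}
```

With this layout, condition 0 of lock 7 and condition 0 of lock 8 are distinct entries, which matches the tuple rule above.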

Each of these functions returns -1 on failure. Each of these functions returns 0 on success, except for thread_libinit, which does not return at all on success.

Here is the file "thread.h". DO NOT MODIFY OR RENAME IT. thread.h will be included by programs that use the thread library, and should also be included by your library implementation.

-------------------------------------------------------------------------------
/*
 * thread.h -- public interface to thread library
 *
 * This file should be included in both the thread library and application
 * programs that use the thread library.
 */
#ifndef _THREAD_H
#define _THREAD_H

#define STACK_SIZE 262144	/* size of each thread's stack */

typedef void (*thread_startfunc_t) (void *);

extern int thread_libinit(thread_startfunc_t func, void *arg);
extern int thread_create(thread_startfunc_t func, void *arg);
extern int thread_yield(void);
extern int thread_lock(unsigned int lock);
extern int thread_unlock(unsigned int lock);
extern int thread_wait(unsigned int lock, unsigned int cond);
extern int thread_signal(unsigned int lock, unsigned int cond);
extern int thread_broadcast(unsigned int lock, unsigned int cond);

/*
 * start_preemptions() can be used in testing to configure the generation
 * of interrupts (which in turn lead to preemptions).
 *
 * The sync and async parameters allow several styles of preemptions:
 *
 *     1. async = true: generate asynchronous preemptions every 10 ms using
 *        SIGALRM.  These are non-deterministic.
 *
 *     2. sync = true: generate synchronous, pseudo-random preemptions before
 *        interrupt_disable and after interrupt_enable.  You can generate
 *        different (but deterministic) preemption patterns by changing
 *        random_seed.
 *
 * start_preemptions() should be called (at most once) in the application
 * function started by thread_libinit().  Make sure this is after the thread
 * system is done being initialized.
 *
 * If start_preemptions() is not called, no interrupts will be generated.
 *
 * The code for start_preemptions is in interrupt.cc, but the declaration
 * is in thread.h because it's part of the public thread interface.
 */
extern void start_preemptions(bool async, bool sync, int random_seed);

#endif /* _THREAD_H */
-------------------------------------------------------------------------------

start_preemptions() is part of the interrupt library we provide (libinterrupt.a), but its declaration is included as part of the interface that application programs include when using the thread library. Application programs can call start_preemptions() to configure whether (and how) interrupts are generated during the program. As discussed in class, these interrupts can preempt a running thread and start the next ready thread (by calling thread_yield()). If you want to test a program in the presence of these preemptions, have the application program call start_preemptions() once in the beginning of the function started by thread_libinit().

Thread Library

In this part, you will write a library to support multiple threads within a single Linux process. Your library will support all the thread functions described in thread.h.

Creating and swapping threads

You will be implementing your thread library on x86 PCs running the Linux operating system. Linux provides some library calls (getcontext, makecontext, swapcontext) to help implement user-level thread libraries. You will need to read the manual pages for these calls. As a summary, here's how to use these calls to create a new thread:

    #include <ucontext.h>

    /*
     * Initialize a context structure by copying the current thread's context.
     */
    getcontext(ucontext_ptr);		// ucontext_ptr has type (ucontext_t *)

    /*
     * Direct the new thread to use a different stack.  Your thread library
     * should allocate STACK_SIZE bytes for each thread's stack.
     */
    char *stack = new char [STACK_SIZE];
    ucontext_ptr->uc_stack.ss_sp = stack;
    ucontext_ptr->uc_stack.ss_size = STACK_SIZE;
    ucontext_ptr->uc_stack.ss_flags = 0;
    ucontext_ptr->uc_link = NULL;

    /*
     * Direct the new thread to start by calling start(arg1, arg2).
     */
    makecontext(ucontext_ptr, (void (*)()) start, 2, arg1, arg2);

Use swapcontext to save the context of the current thread and switch to the context of another thread. Read the Linux manual pages for more details.
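To make the switch concrete, here is a small stand-alone sketch (not a piece of the thread library) in which the caller switches to a freshly made context and that context switches straight back. The names swap_trace, swap_child_body, and run_swap_demo are made up for this illustration, and DEMO_STACK_SIZE simply repeats the STACK_SIZE value from thread.h so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <ucontext.h>

// Hypothetical demo names; DEMO_STACK_SIZE mirrors STACK_SIZE in thread.h.
static const size_t DEMO_STACK_SIZE = 262144;

ucontext_t swap_main_ctx;            // where the caller's context is saved
ucontext_t *swap_child_ctx = NULL;   // heap-allocated, never copied
int swap_trace[3];                   // records the order in which code ran
int swap_trace_len = 0;

// Runs on the child's own stack, then switches back to the caller.
void swap_child_body() {
    swap_trace[swap_trace_len++] = 2;
    swapcontext(swap_child_ctx, &swap_main_ctx);
    assert(false);  // never resumed in this demo
}

// Returns 0 on success, -1 if a context call fails.
int run_swap_demo() {
    swap_trace_len = 0;
    swap_child_ctx = new ucontext_t;
    if (getcontext(swap_child_ctx) != 0) return -1;

    char *stack = new char[DEMO_STACK_SIZE];
    swap_child_ctx->uc_stack.ss_sp = stack;
    swap_child_ctx->uc_stack.ss_size = DEMO_STACK_SIZE;
    swap_child_ctx->uc_stack.ss_flags = 0;
    swap_child_ctx->uc_link = NULL;
    makecontext(swap_child_ctx, swap_child_body, 0);

    swap_trace[swap_trace_len++] = 1;
    // Save the caller's context and start running the child.
    if (swapcontext(&swap_main_ctx, swap_child_ctx) != 0) return -1;
    swap_trace[swap_trace_len++] = 3;  // resumed after the child switched back

    // Safe to free here: nothing is running on the child's stack any more.
    delete[] stack;
    delete swap_child_ctx;
    swap_child_ctx = NULL;
    return 0;
}
```

After run_swap_demo() returns 0, swap_trace holds 1, 2, 3: the caller ran, then the child, then the caller again from the point just after its swapcontext call.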

Deleting a thread and exiting the program

A thread finishes when it returns from the function that was specified in thread_create. Remember to de-allocate the memory used for the thread's stack space and context (do this AFTER the thread is really done using it).
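One possible way to respect the "AFTER" rule is to let a finished thread record itself and have the next thread to run do the actual freeing, since that thread is on a different stack. This is a sketch of that idea only, not the required design; DemoThread, finished_thread, mark_finished, and reap_finished are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-thread record for illustration; a real one would also
// hold the thread's ucontext.
struct DemoThread {
    char *stack;
};

// At most one finished thread is awaiting cleanup in this sketch.
DemoThread *finished_thread = NULL;

// Called by a thread whose start function has returned. It is still running
// on its own stack here, so it only records itself for later cleanup.
void mark_finished(DemoThread *self) {
    assert(self != NULL);
    assert(finished_thread == NULL);  // previous one must have been reaped
    finished_thread = self;
}

// Called by the next thread to run, which is on a different stack.
// Returns true if a finished thread was freed.
bool reap_finished() {
    if (finished_thread == NULL) {
        return false;
    }
    delete[] finished_thread->stack;
    delete finished_thread;
    finished_thread = NULL;
    return true;
}
```

In this arrangement the library would call reap_finished() right after each context switch completes.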

When there are no runnable threads in the system (e.g. all threads have finished, or all threads are deadlocked), your thread library should execute the following code:

    cout << "Thread library exiting.\n";
    exit(0);

Ensuring atomicity

To ensure atomicity of multiple operations, your thread library will enable and disable interrupts. Since this is a user-level thread library, it can't manipulate the hardware interrupt mask. Instead, we provide a library (libinterrupt.a) that simulates software interrupts. Here is the file "interrupt.h", which describes the interface to the interrupt library that your thread library will use. DO NOT MODIFY IT OR RENAME IT. interrupt.h will be included by your thread library (#include "interrupt.h"), but will NOT be included in application programs that use the thread library.

-------------------------------------------------------------------------------
/*
 * interrupt.h -- interface to manipulate simulated hardware interrupts.
 *
 * This file should be included in the thread library, but NOT in the
 * application program that uses the thread library.
 */
#ifndef _INTERRUPT_H
#define _INTERRUPT_H

/*
 * interrupt_disable() and interrupt_enable() simulate the hardware's interrupt
 * mask.  These functions provide a way to make sections of the thread library
 * code atomic.
 *
 * assert_interrupts_disabled() and assert_interrupts_enabled() can be used
 * as error checks inside the thread library.  They will assert (i.e. abort
 * the program and dump core) if the condition they test for is not met.
 *
 * These functions/macros should only be called in the thread library code.
 * They should NOT be used by the application program that uses the thread
 * library; application code should use locks to make sections of the code
 * atomic.
 */
extern void interrupt_disable(void);
extern void interrupt_enable(void);
extern "C" {extern int test_set_interrupt(void);}

#define assert_interrupts_disabled()\
		assert_interrupts_private(__FILE__, __LINE__, true)
#define assert_interrupts_enabled()\
		assert_interrupts_private(__FILE__, __LINE__, false)

/*
 * assert_interrupts_private is a private function for the interrupt library.
 * Your thread library should not call it directly.
 */
extern void assert_interrupts_private(char *, int, bool);

#endif /* _INTERRUPT_H */
-------------------------------------------------------------------------------

Note that interrupts should be disabled only when executing in your thread library's code. The code outside your thread library should never execute with interrupts disabled. E.g. the body of a monitor must run with interrupts enabled and use locks to implement mutual exclusion.
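A common slip is to return early from a library function with interrupts still disabled. The sketch below shows the shape of a library entry point that re-enables interrupts on every return path. So that it compiles by itself, it uses stand-in functions (demo_interrupt_disable, demo_interrupt_enable) in place of the real interrupt_disable() and interrupt_enable(); those stand-ins, and every other demo_ name, are assumptions for this illustration and not part of libinterrupt.a.

```cpp
#include <cassert>

// Stand-ins for the real interrupt library, so the sketch is self-contained.
bool demo_interrupts_enabled = true;

void demo_interrupt_disable() {
    assert(demo_interrupts_enabled);   // disabling twice is a library bug
    demo_interrupts_enabled = false;
}

void demo_interrupt_enable() {
    assert(!demo_interrupts_enabled);  // enabling twice is a library bug
    demo_interrupts_enabled = true;
}

// Hypothetical library state.
bool demo_initialized = false;
int demo_ready_count = 0;

// Shape of a library entry point: disable on entry, enable before EVERY
// return, so user code never runs with interrupts disabled.
int demo_library_call() {
    demo_interrupt_disable();
    if (!demo_initialized) {
        demo_interrupt_enable();       // error path
        return -1;
    }
    ++demo_ready_count;                // critical section: touch shared state
    demo_interrupt_enable();           // normal path
    return 0;
}
```

In the real library, assert_interrupts_disabled() and assert_interrupts_enabled() from interrupt.h play the role of the asserts inside the stand-ins.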

Scheduling order

This section describes the specific scheduling order that your thread library must follow. Remember that a correct concurrent program must work for all legal thread interleavings. This restricted scheduling order applies to this specific thread implementation.

All scheduling queues should be FIFO. This includes the ready queue, the queue of threads waiting for a monitor lock, and the queue of threads waiting for a signal. Locks should be acquired by threads in the order in which the locks are requested (by thread_lock() or in thread_wait()).

When a thread calls thread_create, the caller does not yield the CPU. The newly created thread is put on the ready queue but is not executed right away.

When a thread calls thread_unlock, the caller does not yield the CPU. The woken thread is put on the ready queue but is not executed right away.

When a thread calls thread_signal or thread_broadcast, the caller does not yield the CPU. The woken thread is put on the ready queue but is not executed right away. The woken thread requests the lock when it next runs.
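The create and yield rules above can be checked against a tiny model that tracks only thread ids and performs no real context switching. This is a simulation sketch with hypothetical names (sim_ready, sim_current, sim_create, sim_yield), meant for reasoning about expected order, not as library code.

```cpp
#include <cassert>
#include <deque>

// Model state: ids only, no contexts. Thread 0 is the first thread.
std::deque<int> sim_ready;   // FIFO ready queue
int sim_current = 0;         // id of the running thread
int sim_next_id = 1;

// thread_create: the new thread goes to the back of the ready queue and the
// caller keeps the CPU.
int sim_create() {
    int id = sim_next_id++;
    sim_ready.push_back(id);
    return id;
}

// thread_yield: no effect if nothing else is runnable; otherwise the caller
// goes to the back of the queue and the front thread runs next.
void sim_yield() {
    if (sim_ready.empty()) {
        return;
    }
    sim_ready.push_back(sim_current);
    sim_current = sim_ready.front();
    sim_ready.pop_front();
    assert(!sim_ready.empty());  // the previous thread is still queued
}
```

For example, if thread 0 creates threads 1 and 2 and every thread then yields in turn, the running thread goes 0, 1, 2, 0.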

Error handling

Operating system code should be robust. There are three sources of errors that OS code should handle. The first and most common source of errors is misbehaving user programs. Your thread library must detect when a user program misuses thread functions (e.g., calling another thread function before thread_libinit, calling thread_libinit more than once, misusing monitors, a thread that tries to acquire a lock it already has or release a lock it doesn't have, etc.). A second source of errors is the resources that the OS uses, such as hardware devices. Your thread library must detect if one of the lower-level functions it calls returns an error (e.g., C++'s new operator throws an exception because the system is out of memory). For these first two sources of errors, the thread function should detect the error and return -1 to the user program (it should not print any error messages). User programs can then detect the error and retry or exit.

A third source of error is when the OS code itself (in this case, your thread library) has a bug. During development (which includes this entire semester), the best behavior in this case is for the OS to detect the bug quickly and assert (this is called a "panic" in kernel parlance). You should use assertion statements copiously in your thread library to check for bugs in your code. These error checks are essential in debugging concurrent programs, because they help flag error conditions early.
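The two styles of handling can sit side by side in one function. In the sketch below, an out-of-memory failure from new (the second source of errors) becomes a -1 return, while a condition that can only arise from a bug in the library itself trips an assert. The helper name alloc_stack is hypothetical, and this is only one way to structure the check.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: allocate a thread stack of the given size.
// Returns 0 and stores the stack in *out on success.
// Returns -1 and stores NULL in *out if memory is exhausted.
int alloc_stack(size_t size, char **out) {
    // A NULL out pointer could only come from library code, so it is a bug
    // in the library: assert rather than return an error.
    assert(out != NULL);
    try {
        *out = new char[size];
    } catch (const std::bad_alloc &) {
        // Resource failure: report it to the caller quietly, with no output.
        *out = NULL;
        return -1;
    }
    return 0;
}
```

A caller such as thread_create would then undo any partial setup and pass the -1 on to the user program.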

We will not provide you with an exhaustive list of errors that you should catch. OS programmers must have a healthy (?) sense of paranoia to make their system robust, so part of this assignment is thinking of and handling lots of errors. Unfortunately, there will be some errors that are not possible to handle, because the thread library shares the address space with the user program and can thus be corrupted by the user program.

There are certain behaviors that are arguably errors or not. Here is a list of questionable behaviors that should NOT be considered errors: signaling without holding the lock (this is explicitly NOT an error in Mesa monitors); deadlock (however, trying to acquire a lock by a thread that already has the lock IS an error); a thread that exits while still holding a lock (the thread should keep the lock). Ask on the newsgroup if you're unsure whether you should consider a certain behavior an error.

Hint: Autograder test cases 16 and 17 check how well your thread library handles errors.

Managing ucontext structs

Do not use ucontext structs that are created by copying another ucontext struct. Instead, create ucontext structs through getcontext/makecontext, and manage them by passing or storing pointers to ucontext structs, or by passing/storing pointers to structs that contain a ucontext struct (or by passing/storing pointers to structs that contain a pointer to a ucontext struct, but this is overkill). That way the original ucontext struct need never be copied.

Why is it a bad idea to copy a ucontext struct? The answer is that you don't know what's in a ucontext struct. Byte-for-byte copying (e.g., using memcpy) can lead to errors unless you know what's in the struct you're copying. In the case of a ucontext struct, it happens to contain a pointer to itself (viz. to one of its data members). If you copy a ucontext using memcpy, you will copy the value of this pointer, and the NEW copy will point to the OLD copy's data member. If you later deallocate the old copy (e.g., if it was a local variable), then the new copy will point to garbage. Copying structs is also a bad idea for performance (the ucontext struct is 348 bytes on Linux/x86).

Unfortunately, it is rather easy to accidentally copy ucontext structs. Some of the common ways are:

passing a ucontext by value into a function
copying the ucontext struct into an STL queue
declaring a local ucontext variable (almost always a bad idea, since it practically forces you to copy it)

You should probably be using "new" to allocate ucontext structs (or the struct containing a ucontext struct). If you use STL to allocate a ucontext struct, make sure that STL class doesn't move its objects around in memory. E.g., using vector to allocate ucontext structs is a bad idea, because vectors will move memory around when they resize.
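As a sketch of the pointer-based approach, the queue below stores pointers to heap-allocated records that each contain a ucontext struct, so pushing and popping never copies a context. The names DemoTcb, tcb_queue, and make_tcb are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <ucontext.h>

// Hypothetical per-thread record that contains (not points to) a ucontext.
struct DemoTcb {
    ucontext_t context;
    char *stack;
};

// The queue holds pointers, so the STL copies only the pointer value.
std::queue<DemoTcb *> tcb_queue;

// Allocate a record with new, initialize its context in place, and enqueue
// a pointer to it. The ucontext struct itself never moves or gets copied.
DemoTcb *make_tcb() {
    DemoTcb *t = new DemoTcb;
    t->stack = NULL;
    int rc = getcontext(&t->context);
    assert(rc == 0);
    (void) rc;  // avoid an unused-variable warning when asserts are disabled
    tcb_queue.push(t);
    return t;
}
```

The address of a record's context is the same before and after it passes through the queue, which is exactly what a by-value queue of ucontext structs would not guarantee.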

Example program

Here is a short program that uses the above thread library, along with the output generated by the program. Make sure you understand how the CPU is switching between two threads (both in function loop). "i" is on the stack and so is private to each thread. "g" is a global variable and so is shared between the two threads.

-------------------------------------------------------------------------------
#include <stdlib.h>
#include <iostream>
#include "thread.h"
#include <assert.h>

using namespace std;

int g=0;

void loop(void *a) {
    char *id;
    int i;

    id = (char *) a;
    cout <<"loop called with id " << (char *) id << endl;

    for (i=0; i<5; i++, g++) {
        cout << id << ":\t" << i << "\t" << g << endl;
        if (thread_yield()) {
            cout << "thread_yield failed\n";
            exit(1);
        }
    }
}

void parent(void *a) {
    int arg;
    arg = (long int) a;

    cout << "parent called with arg " << arg << endl;
    if (thread_create((thread_startfunc_t) loop, (void *) "child thread")) {
        cout << "thread_create failed\n";
        exit(1);
    }

    loop( (void *) "parent thread");
}

int main() {
    if (thread_libinit( (thread_startfunc_t) parent, (void *) 100)) {
        cout << "thread_libinit failed\n";
        exit(1);
    }
}
-------------------------------------------------------------------------------
parent called with arg 100
loop called with id parent thread
parent thread:	0	0
loop called with id child thread
child thread:	0	0
parent thread:	1	1
child thread:	1	2
parent thread:	2	3
child thread:	2	4
parent thread:	3	5
child thread:	3	6
parent thread:	4	7
child thread:	4	8
Thread library exiting.
-------------------------------------------------------------------------------

Other tips

Start by implementing thread_libinit, thread_create, and thread_yield. Don't worry at first about disabling and enabling interrupts. After you get that system working, implement the monitor functions. Finally, add calls to interrupt_disable() and interrupt_enable() to ensure your library works with arbitrary yield points. A correct concurrent program must work for any legal schedule. In other words, we should be able to insert a call to thread_yield at any point in your code where interrupts are enabled.

Test cases

An integral (and graded) part of writing your thread library will be to write a suite of test cases to validate any thread library. This is common practice in the real world--software companies maintain a suite of test cases for their programs and use this suite to check the program's correctness after a change. Writing a comprehensive suite of test cases will deepen your understanding of how to use and implement threads, and it will help you a lot as you debug your thread library.

Each test case for the thread library is a short C++ program that uses functions in the thread library (e.g., the example program above). These programs take no inputs: no arguments and no input files. Test cases should exit(0) when run with a correct thread library (normally this happens when your test case's last runnable thread ends or blocks). If you submit your deli simulation as a test case, remember to specify all inputs (number of requesters, buffers, and the list of requests) statically in the program. This shouldn't be too inconvenient because the list of requests should be short to make a good test case (i.e. one that you can trace through what should happen).

If your test case finds that the thread library has some bug, it should report it in its output on stdout (cout, printf). It does not matter exactly what the test case outputs: you win the points iff it generates different output on a buggy thread library than it does on a correct thread library. Your test cases may also generate output on stderr, but the auto-grader ignores stderr. Output files are not permitted: output only to stdout and (optionally) stderr.

Your test cases should NOT call start_preemptions(), because we are not evaluating how thoroughly your test suite exercises the interrupt_enable() and interrupt_disable() calls.

Your test suite may contain up to 20 test cases. Each test case may generate at most 10 KB of output and must take less than 60 seconds to run. These limits are much larger than needed for full credit.

Project logistics

Write your thread library in C++ on Linux. The public functions in thread.h are declared "extern", but all other functions and global variables in your thread library should be declared "static" to prevent naming conflicts with programs that link with your thread library.

Compile an application source file (app.cc) into an executable (app) with a thread library (thread.cc) as follows:

g++ -o app thread.cc app.cc libinterrupt.a -ldl -no-pie

Use g++ (/usr/bin/g++) to compile your programs. You may use any functions included in the standard C++ library, including (and especially) the STL. Do not use any libraries other than the standard C++ library.

Your thread library must be in a single file and must be named "thread.cc".

Your repo has copies of thread.h, thread.o, interrupt.h, and libinterrupt.a. (Note: In Fall 2020, the AG corrupts binary files in your repos, so you must fetch thread.o and libinterrupt.a elsewhere as directed.)

Grading

The results from the auto-grader will not be very illuminating; they won't tell you where your problem is or give you the test programs. The best way to debug your code is to generate your own test cases, figure out the correct answers, and compare your program's output to the correct answers. This is also one of the best ways to learn the concepts in the project.

Your test cases are graded according to coverage, i.e., how thoroughly your suite tests a thread library. The auto-grader first runs a test case with a correct thread library and collects its output on stdout. It discards the test case if it causes a compile or run-time error with a correct thread library. The auto-grader then runs the test case with a set of buggy thread libraries. A test case exposes a buggy thread library if it generates output (on stdout) that differs from its output (on stdout) for a correct thread library. The test suite is graded based on how many of the buggy thread libraries are exposed by at least one test case.

Remember that your test cases should NOT call start_preemptions(), because we are not evaluating how thoroughly your test suite exercises the interrupt_enable() and interrupt_disable() calls. The buggy thread libraries will not have problems with interrupt_disable/enable.

Because you are writing concurrent programs, the auto-grader may return non-deterministic results. In particular, test cases 20-24 for the thread library use asynchronous preemption, which may cause non-deterministic results.

Because your programs are auto-graded, you must be careful to follow the exact rules in the project description:

1) (thread library) The only output your thread library should print is the final output line "Thread library exiting.". Other than this line, the only output should be that generated by the program using your thread library.

2) (thread library) Your thread library should consist of a single file named "thread.cc".

3) Do not modify source code included in this handout (thread.h, interrupt.h).

Turning in your project

Here are the files you should submit for each project part:

1) Thread library (project-part p1t)
A C++ source file for your thread library called "thread.cc". It should support creating and running threads, and yield, which is used for preemption.
2) Synchronization (project-part p1s)
Extensions to your "thread.cc" to implement Mesa monitors.
3) (Due with p1s) A suite of test cases (each test case is a C++ program in a separate file). The name of each test case should be of the form "testX.cc", where X is a number less than 100.

The auto-grader will pull these files from the master branch of your repository.