Project 2 – CPU Scheduling

Due: TBA

Assignment background: GWRR Scheduling

The Linux scheduler repeatedly switches between all the running tasks on the system, attempting to give a fair amount of CPU time to each task. Fair-share scheduling is a strategy that shares the CPU among sets of processes according to the groups that own those processes. For example, suppose that Mary, Jane, and Sam belong to different groups and are logged in to a machine that uses fair-share scheduling. Suppose Mary has one runnable process, Jane has three, and Sam has six. Fair-share scheduling views the ten runnable processes in three groups, and each group receives one-third of the CPU cycles to allocate to the processes that it owns. So Mary's single process would get about 33% of the available CPU cycles, each of Jane's three processes would get roughly 11% of the available CPU cycles, and each of Sam's six processes would get about 5.5% of the CPU cycles. Suppose Jim belonged to Jane's group and he logs in to the machine and created two more processes. Then Mary's single process would still get its 33% of the CPU and each of Sam's six processes would still get about 5.5% of the CPU. But Jim and Jane together would have five processes that in total receive one-third of the CPU, so that each process would get about 6.7% of the CPU.

We introduce a new scheduling policy to support fair-share scheduling. We call this policy GWRR, for group weighted round-robin scheduling. GWRR should use fair-share scheduling based on each process's UNIX group identification. At each invocation of the scheduler, we will use a hierarchical scheme to choose which task to run: first, a group is chosen, then, a task within that group's set is chosen. If we allow each group's chosen task the same amount of time on the CPU, each group should be represented equally.

Since we are making two choices in each scheduling decision, we need two algorithms: one to choose the group, and one to choose one of that group's tasks. Let's start by assuming we will use a Round-Robin scheme to decide which group to choose. We keep a queue of groups, and whichever group was chosen last time we scheduled, we choose the next group in the queue for this schedule. We then choose a task within this group's set of tasks (more on this later), and let it run for a fixed amount of time (the time quantum). Now every group gets an equal amount of CPU time.

Now let's say that some groups are more equal than others. Imagine that we associate an integer, which we'll call a weight, with each group, and when a user is chosen, we let the group's chosen tasks run for W time quanta (instead of a single time quantum), where W is the group's weight. In this way we can specify that some groups get more CPU time than others, and also how much more. A group with a weight of 3 will get 50% more CPU time than a group with weight 2. This is called proportional sharing. More specifically, this implementation of proportional sharing is called Weighted Round-Robin.

We still haven't specified how a task is to be chosen once we choose a group. For simplicity, use a simple round-robin (RR) scheduling algorithm to do that. The intragroup RR scheduler should use the same default timeslice as the Linux scheduler uses for a task with the default nice value. Otherwise, the RR scheduler should work the same as GWRR except that it schedules tasks, not groups, and there are no different task weights.

Assignment description

In this assignment, you are asked to implement a new task scheduler GWRR as described above:

Most of the code of interest for this assignment is in kernel/sched.c and include/linux/sched.h. These are probably not the only files you will need to look at, but they're a good start. While there is a fair amount of code in these files, a key goal of this assignment is for you to understand how to abstract the scheduler code so that you learn in detail the parts of the scheduler that are crucial for this assignment and ignore the parts that are not.
Implement group weights. Add two system calls, getgroupweight() and setgroupweight(), with the following prototypes:

int getgroupweight(int gid);

intsetgroupweight(int gid, int weight);

getgroupweight() should return the weight of the specified group, or -1 on error. The default group weight should be 10. setgroupweight() should give the specified group the specified weight, and return 0, or -1 on error.

Your scheduler should operate alongside the existing Linux scheduler. Therefore, you should add a new scheduling policy, SCHED_GWRR. Only tasks whose policy is set to SCHED_GWRR (normally done via the system call sched_setscheduler()) should be considered for selection by your new scheduler. Your scheduler and the existing scheduler should interoperate as follows:

If there are any running tasks whose policy is SCHED_RR or SCHED_FIFO (these are real-time tasks), one of them should be chosen before any other, according to the existing Linux scheduler rules.
If there are any running tasks whose policy is SCHED_OTHER (the default scheduler), one of them should be chosen before any SCHED_GWRR (your scheduler) tasks, according to the default scheduler's rules.
If all running tasks have a policy of SCHED_GWRR, your scheduler should be used to choose a task, according to the rules outlined.

In other words, there should be a strict priority relationship where SCHED_RR and SCHED_FIFO are first, SCHED_OTHER is second, and SCHED_GWRR is last.

Linux uses an array data structure for keeping track of tasks for scheduling. Tasks in this array are indexed by priority. You may find this data structure useful for implementing GWRR. However, your scheduler should be designed to work with any number of groups and should not assume only a fixed number of groups.
The user_struct structure in include/linux/sched.h is used for tracking users. This structure may be helpful in creating a useful way to keep track of user groups.

Create five users Mary, Jane, Sam, Jim, and Pat. Assign Mary and Jane to the same group, Sam and Jim to the same group, and Pat to a separate group. Assign weights of 30, 10, and 10 to the respective groups. Create a simple test program that measures its own CPU usage as it runs and and use it to show that your GWRR scheduler behaves properly. Prepare the tests before the demo and document your results in a writeup submitted with your code and taken to the demo.

Help on Linux administration:
Groups can be added using the command: "groupadd group_name" and users can be added using the command: "useradd user_name".

What to Turn In

You should turn in the following:

The names of all of the Linux kernel source files that you modified in order to add your new scheduler, and a written description of what you did to them and why you needed to do it (i.e. why was it necessary to modify this particular file).
The complete source code of the routine that implements the new scheduler in the kernel (i.e., just the new code you wrote, not the source code that was already in the kernel that got control to your new routine).
The source code of your test program.
A transcript showing you using your new scheduler with your test program.
A brief write-up (about a page) with the answers to the following questions.

Describe how you found the information needed to complete this project. Did it have the information you needed? Did you consult with any humans? If so, what did you try first and who did you consult with?

Explain the calling sequence that makes your scheduler work. First, a user program calls <.....>. Then, <.....> calls <.....>. ... and so on. You can explain this using either text or a rough (less than 15 minutes) diagram.

Do not underestimate the importance of the write-up. Your project grade depends significantly on how well you understood what you were doing, and the write-up is the best way for you to demonstrate that understanding.

Submission instructions: We will be using the submit program.

Get all of your files (including the writeup) together on the CS machines. You may want to create one tar file of all the code.
Run submit_cps210 lab2 <filename₁> <filename₂> ... <filename_n>

Disclaimer: This assignment is adapted from a project by Dr. Kai Shen at University of Rochester.