Data-intensive Computing Systems: Assignments

Grading Rules for Assignments

  1. You will receive a zero on any summary or report in which you copy large parts of text verbatim from any source. You are expected to write your own text and cite ALL sources from which you have taken ideas or facts.
  2. If you receive a zero on two assignments, your total grade for all assignments will be set to zero.
  3. There is no page limit for summaries and reports.
  4. Submit only PDF documents for assignments where you are asked to submit a report, a presentation, or to solve problems. Other formats will be rejected outright without grading.
  5. The submission instructions are at the bottom of the page.

| Assignment | Due Date | Solution |
|---|---|---|
| Assignment 1: Programming (Introduction to MapReduce) | Sept. 14, 5:00 PM | |
| Assignment 2: Submit initial summary on scalable batch analytics systems (examples: Pig, Hive, HadoopDB, TeraData, Vertica, etc.) | Sept. 21, 5:00 PM | |
| Assignment 3: Submit initial summary on scalable real-time analytics systems (examples: HBase, Cassandra / Brisk, Storm, S4, Muppet, etc.) | Sept. 28, 5:00 PM | |
| Assignment 4: Problems | Sept. 28, 5:00 PM | |
| Assignment 5: Submit initial summary on scalable graph and matrix processing systems (examples: Pregel, Hama, SystemML, SciDB, Spark, GraphLab, etc.) | Oct. 5, 5:00 PM | |
| Assignment 6: Problems | Oct. 5, 5:00 PM | |
| Assignment 7: Submit final report on scalable batch analytics systems (examples: Pig, Hive, HadoopDB, TeraData, Vertica, etc.) | Oct. 19, 5:00 PM | |
| Assignment 8: Submit report on scalable real-time analytics systems (examples: HBase, Cassandra / Brisk, Storm, S4, Muppet, etc.) | Oct. 26, 5:00 PM | |
| Project 2 | Nov. 2, 5:00 PM | |
| Assignment 9: Problems | Nov. 9, 5:00 PM | |
| Assignment 10: Submit report on scalable graph and matrix processing systems (examples: Pregel, Hama, SystemML, SciDB, Spark, GraphLab, etc.) | Nov. 16, 5:00 PM | |
| Exercise 11: Problems | Not graded | |

Submission Instructions for Assignments

The assignments will be submitted via Sakai.

Submit only PDF documents for assignments where you are asked to submit a report, a presentation, or to solve problems. Other formats will be rejected outright without grading.

For any programming assignment, create and submit a .zip or .tar.gz archive that contains the following directory structure:

  1. Create a directory for the assignment and name it assignmentN, where N is the assignment number. For example, assignment1, assignment2, etc.
  2. For each part of the assignment, create a subdirectory under the main assignment directory, and put the source code and README for that part there. Name the subdirectory parta for Part A, partb for Part B, etc. For example, the source code and README for Part A of Assignment 1 go in the directory: assignment1/parta
  3. For each part, write a README file that gives overall high-level documentation of the code and instructions to run the code. We will use the README to understand the code as well as to run it. Your grade will be zero if we cannot understand or run the code.
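
The directory layout described above can be sketched with a short Python script. This is only an illustration, not an official submission tool: the function names `build_submission` and `list_archive`, the placeholder README text, and the archive file name `assignmentN.tar.gz` are assumptions (the instructions specify the directory names but not the name of the archive itself).

```python
import tarfile
import tempfile
from pathlib import Path


def build_submission(workdir: Path, number: int, parts: list[str]) -> Path:
    """Create assignmentN/partX/README skeletons and pack them into a .tar.gz.

    The archive name assignmentN.tar.gz is an assumption; the course page
    only fixes the directory names inside the archive.
    """
    top = workdir / f"assignment{number}"
    for letter in parts:
        # One subdirectory per part: parta, partb, ...
        part_dir = top / f"part{letter.lower()}"
        part_dir.mkdir(parents=True, exist_ok=True)
        readme = part_dir / "README"
        if not readme.exists():
            # Placeholder text; replace with real documentation and run steps.
            readme.write_text("Overview of the code and instructions to run it.\n")
    archive = workdir / f"assignment{number}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths relative, so the archive root is assignmentN/.
        tar.add(top, arcname=top.name)
    return archive


def list_archive(archive: Path) -> list[str]:
    """Return the sorted member names of a .tar.gz archive."""
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = build_submission(Path(tmp), 1, ["a", "b"])
        for name in list_archive(created):
            print(name)
```

Listing the archive before uploading is a quick way to confirm that every part directory contains a README and that the paths start with assignmentN/ rather than an absolute path. Source files would be copied into each part directory before the archive is created.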