
Data-intensive Computing Systems: Course Schedule

The course schedule will be posted here.
Week | Date | Topic | Lecture slides and reference
1 | 08-27 | Introduction and overview | Notes 1: pptx, pdf
1 | 08-29 | Introduction to MapReduce and Hadoop | Chapter 2 in Tom White's book; Notes 2: ppt, pdf
2 | 09-03 | Algorithms in MapReduce | Notes 3: pdf
2 | 09-05 | Guest Lecture by Jie Li: Starfish: A Self-Tuning System for Big Data Analytics (Slides: pptx, pdf; Extra reading: How Hadoop Works) | Notes 4: ppt, pdf; Exercise 1
3 | 09-10 | Google FileSystem and MapReduce | GFS, MapReduce
3 | 09-12 | Google Bigtable | Bigtable
4 | 09-17 | Overview of query processing, Query rewrites | Notes 5: ppt, pdf
4 | 09-19 | Pipelining (iterators) and Materialization, Costing query plans | Notes 5: ppt, pdf
5 | 09-24 | Introduction to Pig Latin | Notes 6: ppt, pdf; Reading 1 on Pig
5 | 09-26 | Processing Pig Latin queries | Notes 6: ppt, pdf; Reading 2 on Pig; Exercise 2
6 | 10-01 | Block-based data storage | Notes 7: ppt, pdf
6 | 10-03 | Index-based access | Notes 8: ppt, pdf
7 | 10-08 | Index-based access (contd.) | Notes 9: ppt, pdf
7 | 10-10 | Project proposals |
8 | 10-15 | No class (Fall Break) |
8 | 10-17 | Sort processing | Notes 10: ppt, pdf
9 | 10-22 | Buffer day |
9 | 10-24 | Midterm |
10 | 10-29 | Introduction to Join processing | Notes 10: ppt, pdf
10 | 10-31 | Sort-merge joins, Block and Index nested-loop joins, Hash joins | Notes 10: ppt, pdf
11 | 11-05 | Cost-based Query Optimization | Notes 11: ppt, pdf
11 | 11-07 | Failure recovery | Notes 12: ppt, pdf
12 | 11-12 | Logging and Checkpointing | Notes 12: ppt, pdf
12 | 11-14 | Concurrency control and Serializability | Notes 13: ppt, pdf; Exercises
13 | 11-19 | Concurrency control, locking | Notes 14: ppt, pdf; Longer version: ppt, pdf
13 | 11-21 | Thanksgiving break |
14 | 11-26 | Readings |
14 | 11-28 | Readings |