The course schedule will be posted here.
Week | Date | Topic | Lecture slides and reference |
1 | 08-27 | Introduction and overview | Notes 1: pptx, pdf |
| 08-29 | Introduction to MapReduce and Hadoop | Chapter 2 in Tom White's book; Notes 2: ppt, pdf |
2 | 09-03 | Algorithms in MapReduce | Notes 3: pdf |
| 09-05 | Guest Lecture by Jie Li, "Starfish: A Self-Tuning System for Big Data Analytics" | Slides: pptx, pdf |
| | Extra reading: How Hadoop Works | Notes 4: ppt, pdf; Exercise 1 |
3 | 09-10 | Google FileSystem and MapReduce | GFS, MapReduce |
| 09-12 | Google Bigtable | Bigtable |
4 | 09-17 | Overview of query processing, Query rewrites | Notes 5: ppt, pdf |
| 09-19 | Pipelining (iterators) and Materialization, Costing query plans | Notes 5: ppt, pdf |
5 | 09-24 | Introduction to Pig Latin | Notes 6: ppt, pdf, Reading 1 on Pig |
| 09-26 | Processing Pig Latin queries | Notes 6: ppt, pdf, Reading 2 on Pig, Exercise 2 |
6 | 10-01 | Block-based data storage | Notes 7: ppt, pdf |
| 10-03 | Index-based access | Notes 8: ppt, pdf |
7 | 10-08 | Index-based access (contd.) | Notes 9: ppt, pdf |
| 10-10 | Project proposals | |
8 | 10-15 | No class (Fall Break) | |
| 10-17 | Sort processing | Notes 10: ppt, pdf |
9 | 10-22 | Buffer day | |
| 10-24 | Midterm | |
10 | 10-29 | Introduction to Join processing | Notes 10: ppt, pdf |
| 10-31 | Sort-merge joins, Block and Index nested-loop joins, Hash joins | Notes 10: ppt, pdf |
11 | 11-05 | Cost-based Query Optimization | Notes 11: ppt, pdf |
| 11-07 | Failure recovery | Notes 12: ppt, pdf |
12 | 11-12 | Logging and Checkpointing | Notes 12: ppt, pdf |
| 11-14 | Concurrency control and Serializability | Notes 13: ppt, pdf, Exercises |
13 | 11-19 | Concurrency control, locking | Notes 14: ppt, pdf; Longer version: ppt, pdf |
| 11-21 | Thanksgiving break | |
14 | 11-26 | Readings | |
| 11-28 | Readings | |