The course schedule will be posted here.
Week | Date | Topic | Lecture slides and reference |
1 | 08-29 | Introduction and overview | Notes 1: pptx, pdf |
| 08-31 | Introduction to MapReduce and Hadoop | Chapter 2 in Tom White's book |
2 | 09-05 | Working with MapReduce | Notes 2: pdf |
| 09-07 | Working with MapReduce (contd.) | Notes 3: ppt, pdf |
3 | 09-12 | How Hadoop Works | Notes 4: ppt, pdf |
| 09-14 | How Hadoop Works (contd.) | Notes 4: ppt, pdf, Exercise 1 |
4 | 09-19 | Overview of query processing | Notes 5: ppt, pdf |
| 09-21 | Query rewrites, Pipelining (iterators) and Materialization | Notes 5: ppt, pdf |
5 | 09-26 | Guest Lecture by Herodotos Herodotou: "Starfish: A Self-Tuning System for Big Data Analytics" | Slides: pptx, pdf |
| 09-28 | Costing query plans, Introduction to Pig Latin | Notes 6: ppt, pdf, Reading 1 on Pig |
6 | 10-03 | Processing Pig Latin queries | Notes 6: ppt, pdf |
| 10-05 | Processing Pig Latin queries | Notes 6: ppt, pdf, Reading 2 on Pig |
7 | 10-10 | No class (Fall Break) | |
| 10-12 | Processing Pig Latin queries | Reading 2 on Pig |
8 | 10-17 | Block-based data storage | Notes 7: ppt, pdf |
| 10-19 | Index-based access | Notes 8: ppt, pdf |
9 | 10-24 | Index-based access (contd.) | Notes 9: ppt, pdf |
| 10-26 | Midterm | |
10 | 10-31 | No class | |
| 11-02 | Sort processing | Notes 10: ppt, pdf |
11 | 11-07 | Introduction to Join processing | Notes 10: ppt, pdf |
| 11-09 | Sort-merge joins, Block and Index nested-loop joins, Hash joins | Notes 10: ppt, pdf |
12 | 11-14 | Cost-based Query Optimization | Notes 11: ppt, pdf |
| 11-14 | Talk by Jeffrey Krone | Slides: ppt, pdf |
| 11-15 | Talk by Alan Gates | Slides: pptx, pdf |
| 11-16 | Failure recovery, Logging | Notes 12: ppt, pdf |
13 | 11-21 | Checkpointing, Concurrency control, and Serializability | Notes 13: ppt, pdf, Exercises |
| 11-23 | Thanksgiving break | |
14 | 11-28 | Concurrency control, locking | Notes 14: ppt, pdf |
| 11-30 | Discussion on readings | |