Week | Date | Topic | Slides and reference* |
1 | 01-08 | Introduction and overview | Notes 1 |
2 | 01-13 | Introduction to frequent itemsets | Notes 2 |
| 01-15 | Apriori algorithm | Notes 2 |
3 | 01-20 | Extensions to Apriori | Notes 3 |
| 01-22 | Maximal and closed itemsets | Notes 4 |
4 | 01-27 | Toivonen's algorithm, FP-trees | Notes 4 |
| 01-29 | FP-growth, rule generation | Notes 4 |
5 | 02-03 | Constraint-based mining; Programming Project 1 announced | Notes 5 |
| 02-05 | Introduction to data warehousing | Notes 6 |
6 | 02-10 | Data cubes | Notes 6 |
| 02-12 | Indexes in data warehouses | Notes 7 |
7 | 02-17 | MOLAP, multi-dimensional arrays, compression | Notes 8; Cube computation paper |
| 02-19 | Midterm | |
8 | 02-24 | Clustering large datasets (overview) | Notes 9; Lecture notes |
| 02-26 | Materialized views in data cubes | HRU paper |
9 | 03-03 | Multi-way MOLAP algorithm for cube computation | Cube computation paper |
| 03-05 | Cube computation (contd.) | Notes 8 |
10 | 03-10 | Spring break | |
| 03-12 | Spring break | |
11 | 03-17 | Clustering large datasets (distance metrics, k-means) | Notes 9; Lecture notes |
| 03-19 | Clustering large datasets (BFR algorithm) | Notes 9; Lecture notes |
12 | 03-24 | Wrap-up of clustering large datasets | Notes 9; Lecture notes |
| 03-26 | Google's initial system architecture (overview) | Google paper |
13 | 03-31 | No class | |
| 04-02 | Google's initial system architecture (PageRank) | Google paper |
14 | 04-07 | Google's initial system architecture (system and search) | Notes 10 |
| 04-09 | The PageRank Citation Ranking (the theory behind PageRank) | PageRank paper |
15 | 04-14 | The PageRank Citation Ranking (computation, crawling) | PageRank paper |
| 04-16 | Internet advertising and click-fraud detection | Tuzhilin paper; Notes 11 |
16 | 04-21 | Wrap-up | |
16 | 04-27 | Final exam, 2:00-5:00 PM | |