See this document for the project schedule
and guidelines.
Guidelines for the project proposal:
The project proposal is due on Oct 24, by midnight. The proposal will be graded
and should include (i) a description of the problem, (ii) the motivation
for the problem (e.g., why is the problem interesting, why is
it challenging, who will benefit from a solution to the problem,
etc.), (iii) your initial ideas on how to attack the
problem, and (iv) a brief discussion of previous work related to this problem.
There is no page limit for the proposal.
Further Readings
Here are some further readings for each of the project topics. To
access some of the following links (e.g., papers in the ACM
digital library), you need to be on the Duke Network.
Flash Memory in Database Systems
Query Optimization in Database Systems
-
Guy Lohman's talk on Self-Managing DB2
with an overview of their recent work on query optimization.
-
The Picasso project and a
related paper.
-
As we discussed in class, the goals of query optimization have changed
over the years. Here is a paper
on robust query optimization.
-
The following paper is the first technical paper on the LEO system
that Volker Markl talked about.
Michael Stillger, Guy M. Lohman, Volker Markl, Mokhtar Kandil: LEO -
DB2's LEarning Optimizer. Available
here.
A new and improved version of this paper is available
here.
-
A less technical but more forward-looking paper on the LEO
project appeared in the
IBM Systems Journal. Available
here.
Adaptive Query Processing in Database Systems
-
A recent paper on changing query plans if a problem is detected when
a query is running:
Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman,
Hamid Pirahesh: Robust Query Processing through Progressive
Optimization. Available
here.
-
An attempt by Shivnath and colleagues
to correct some problems with the above approach:
Proactive Re-optimization.
Query Execution in Database Systems
-
A paper on Interaction-Aware Query Processing and Scheduling.
-
A paper on query suspension and resumption.
-
A paper on estimating time to completion of a query plan.
Data Stream Systems
-
Two recent projects on building data stream management systems:
STREAM
and Aurora.
Here are two overview papers: from STREAM and
from Aurora.
-
Adaptive query processing in a data stream management system. Shivnath's
slides on adaptive query processing and an
overview paper.
-
Work on load shedding, which copes gracefully with high stream arrival rates
by reducing the accuracy of query results:
paper 1,
paper 2,
paper 3.
Configuration of Database Systems: Physical Design (e.g., Indexes and
Materialized Views)
Configuration of Database Systems: Resources and Configuration Parameters
-
A paper
from IBM on automated configuration of application servers.
-
A paper on our project at Duke on
Active and Accelerated
Learning of Cost Models for Optimizing Scientific Applications, with
extensions to web services, database servers, storage servers, etc.
-
IBM DB2's Configuration Advisor.
Databases + Information Retrieval (DB+IR)
-
A paper on Google's system architecture. The paper is outdated, but the
basic principles remain.
-
Some papers from IBM on the DB+IR problem:
paper 1,
paper 2.
Self-Healing Database Systems
-
A paper from Oracle on quick identification of
performance problems.
-
Work from IBM on automated scheduling of statistics updates for DB2:
paper 1,
paper 2 (a non-technical article).
-
A paper from IBM on identifying distinct symptoms for
different causes of DB2 failures.
Project Resources
Here are installation instructions for DB2
on the Duke CS research cluster.
Some useful information on running DB2 on the Duke CS research clusters is available from the CPS116 web site.