CPS 296.1 (Spring 2012):
Project in Computational Journalism

Schedule and Readings

You are responsible for completing the reading assignments posted in the schedule below before class. For papers marked with rev, reviews are due by 10pm on the day before the class in which the papers will be discussed. Each review should discuss the following:

  • At least three important things that the paper says;
  • At least two interesting things that you found in the paper (e.g., a non-obvious pitfall, an uncanny insight, a neat trick that could be used elsewhere);
  • At least one thing that you did not like about the paper.

There is no specific requirement on the length of your reviews. A good, insightful review can be as brief as 400 words. William G. Griswold has written some useful tips on reading papers.

Post your reviews electronically using the Sakai class forum (you will find the link in the navigation panel on the left). Please post your review for a paper as a reply to my post requesting reviews for it. Your review will be visible to everybody else (and vice versa). Please resist the temptation to look at your classmates' reviews before reading the paper yourself; you will learn much more by formulating your own opinions.

Papers marked with opt are optional reading.

Note that many URLs below point to publishers' sites, and you will only be able to access them from a Duke IP address. If you plan to read the papers elsewhere, make sure you download them first while connected to the Duke network.

Week Date Topic
1 01-13 Class cancelled (instructor is sick)
2 01-18 Introduction
01-20 rev "Computational Journalism." Cohen, Hamilton, and Turner. Communications of the ACM, 54(10), 2011. [HTML]
"Computational Journalism: A Call to Arms to Database Researchers." Cohen, Li, Yu, and Yang. CIDR 2011. [PDF]
3 01-25 rev "News and Information as Digital Media Come of Age." Berkman Center for Internet and Society at Harvard University, 2008. [PDF]
"Shared Values, Clashing Goals." Cohen. ACM Crossroads, December 2011. [HTML]
01-27 "NRCC Overstates Dems' Voting Record with Pelosi." factcheck.org. [HTML]
"ACC Sets the Standard Among BCS Conferences in Latest US News & World Report 'Best Colleges' Rankings." theacc.com. [HTML]
4 02-01 Guest lecture by Sarah Cohen
02-03 Survey of other projects:
ProConPro.org and Project Vote Smart (Rohit)
PANDA, DocumentCloud, and MemeTracker (Nikhil)
OpenCalais, Time Flow, and Vox Civitas (Andrea)
FactCheck.org and OpenSecrets.org (Will)
Overview and LocalWiki (Yunjia)
5 02-08 Survey of other projects:
Mechanical Turk, Tribevine, and projects by Luis von Ahn (Wuzhou)
OpenCongress and Watchdog.net (David)
OpenBlock and ScraperWiki (Mohammad)
Poligraft and Ushahidi (Rozemary)
Boosting for Crowdsourcing (Yuqian)
02-10 Guest lecture by Hye-chung Kum:
"Providing the Web of Social Science Knowledge for the Future: A Network of Social Science Data Collaboratories." Karen S. Cook, Gary King, and David Laitin. NSF-SBE White Paper, 2010. [HTML]
6 02-15 Collaborative query formulation
02-17 Project discussion
7 02-22 Project discussion
02-24 rev "Human Computation: A Survey and Taxonomy of a Growing Field." Quinn and Bederson. CHI 2011. [LINK]
rev "Analyzing the Amazon Mechanical Turk Marketplace." Ipeirotis. ACM Crossroads, December 2010. [HTML]
8 02-29 "Information Extraction." Sarawagi. FnT Databases, 1(3), 2008. [PDF]
03-02 Same as above
9 03-07 Spring recess
03-09 Spring recess
10 03-14 rev "CrowdDB: Answering Queries with Crowdsourcing." Franklin, Kossmann, Kraska, Ramesh, and Xin. SIGMOD 2011. [LINK]
"Natural Language Interface to Relational Databases via Crowdsourcing." Alexe and Myers. UCSC Class Report, 2010. [PDF]
03-16 rev "Strategies for Crowdsourcing Social Data Analysis." Willett, Heer, and Agrawala. CHI 2012. [LINK]
"Highlighting Disputed Claims on the Web." Ennals, Trushkowsky, and Agosta. WWW 2010. [LINK]
opt "What is Disputed on the Web?" Ennals, Byler, Agosta, and Rosario. WICOW 2010. [LINK]
11 03-21 rev "Making Database Systems Usable." Jagadish, Chapman, Elkiss, Jayapandian, Li, Nandi, and Yu. SIGMOD 2007. [LINK]
"Why Not?" Chapman and Jagadish. SIGMOD 2009. [LINK]
03-23 "How to ConQueR Why-Not Questions." Tran and Chan. SIGMOD 2010. [LINK]
"Causality in Databases." Meliou, Gatterbauer, Halpern, Koch, Moore, and Suciu. IEEE Data Engineering Bulletin 33(3), 2010. [PDF]
opt "Answering Why-not Questions on Top-k Queries." He and Lo. ICDE 2012. [PDF]
12 03-28 rev "Fighting Spam on Social Web Sites: A Survey of Approaches and Future Challenges." Heymann, Koutrika, and Garcia-Molina. IEEE Internet Computing, 11(6), 2007. [LINK]
"Social Information Processing in News Aggregation." Lerman. IEEE Internet Computing, 11(6), 2007. [LINK]
opt Additional reading: "Social Design Patterns for Reputation Systems: An Interview with Yahoo's Bryce Glass." [LINK]
03-30 rev "Query Recommendations for Interactive Database Exploration." Chatzopoulou, Eirinaki, and Polyzotis. SSDBM 2009. [LINK]
"Collaborative Querying using the Query Graph Visualizer." Goh, Fu, and Foo. Online Information Review, 29(3), 2005. [LINK]
opt "SnipSuggest: Context-Aware Autocompletion for SQL." Khoussainova, Kwon, Balazinska, and Suciu. VLDB 2011. [PDF]
13 04-04 Project update
04-06 Project update
14 04-11 rev "Constructing a Generic Natural Language Interface for an XML Database." Li, Yang, and Jagadish. EDBT 2006. [LINK]
"Explaining Structured Queries in Natural Language." Koutrika, Simitsis, and Ioannidis. ICDE 2010. [LINK]
opt "DaNaLIX: A Domain-Adaptive Natural Language Interface for Querying XML." Li, Chaudhuri, Yang, Singh, and Jagadish. SIGMOD 2007. [LINK]
04-13 "Natural Language Interfaces to Databases: An Introduction." Androutsopoulos, Ritchie, and Thanisch. Natural Language Engineering, 1(1), 1995. [PDF]
15 04-18 "Anomaly Detection: A Survey." Chandola, Banerjee, and Kumar. ACM Computing Surveys 41(3), 2009. [LINK]
04-20 (class meets despite graduate reading period)
16 04-25 Reading period
04-27 Reading period
17 05-01 Final (no exam): 7-10pm
Last updated Tue Apr 03 14:00:43 EDT 2012