Systems
-
[Abhishek Dubey]
HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Avi Silberschatz, Alex Rasin. VLDB 2009.
Project page
-
[Abhishek Dubey]
Tim Kaldewey, Eugene J. Shekita, Sandeep Tata: Clydesdale: structured data processing on MapReduce. EDBT 2012
pdf
-
[Hao Ran Liu]
Vinayak R. Borkar, Michael J. Carey, Raman Grover, Nicola Onose, Rares Vernica: Hyracks: A flexible and extensible foundation for data-intensive computing. ICDE 2011
Project page
-
[Yuxuan Dai]
Yingyi Bu, Vinayak R. Borkar, Michael J. Carey, Joshua Rosen, Neoklis Polyzotis, Tyson Condie, Markus Weimer, Raghu Ramakrishnan: Scaling Datalog for Machine Learning on Big Data. CoRR abs/1203.0160 (2012)
HTML
-
[Yi Ding]
Tenzing: A SQL Implementation On the MapReduce Framework.
Biswapesh Chattopadhyay, Liang Lin, Weiran Liu, Sagar Mittal, Prathyusha Aragonda, Vera Lychagina, Younghee Kwon, Michael Wong. VLDB 2011
HTML,
An open-source
system inspired by Tenzing
-
[Hui Dong]
Shark and Spark projects. Project page
-
[Lanceton Mark Dsouza]
M3R: Increased performance for in-memory Hadoop jobs,
by Avraham Shinnar (IBM Research), David Cunningham (IBM Research),
Benjamin Herta (IBM Research), Vijay Saraswat (IBM Research), VLDB 2012
pdf,
video
-
[Hao Guo]
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
Project page
-
[Wei He]
Stratosphere system for information management
Project page
-
[Shenghao Li]
Sailfish: A Framework For Large Scale Data Processing
Project page
-
[Xia Li]
YARN: Hadoop NextGen MapReduce
Project page
-
[Pengfei Ma]
Mesos: Dynamic Resource Sharing for Clusters
Project page
Row vs. Column Storage
-
[Mayuresh Kunjir]
Avrilia Floratou, Jignesh M. Patel, Eugene J. Shekita, Sandeep Tata: Column-Oriented Storage Techniques for MapReduce. VLDB 2011
pdf
-
[Mayuresh Kunjir]
Alekh Jindal, Jorge-Arnulfo Quiane-Ruiz, Jens Dittrich
Trojan Data Layouts: Right Shoes for a Running Elephant.
SOCC 2011
pdf,
Project page
Indexing for MapReduce
-
[Austin Alexander]
Jens Dittrich, Jorge-Arnulfo Quiane-Ruiz, Stefan Richter, Stefan Schuh, Alekh Jindal, Jorg Schad
Only Aggressive Elephants are Fast Elephants
VLDB 2012
pdf,
Project page
-
[Hanxiao Mao]
Jens Dittrich, Jorge-Arnulfo Quiane-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, and Jorg Schad
Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)
VLDB 2010
pdf,
Project page
Query Processing
-
[Harsha Ravi]
H. Herodotou and S. Babu.
Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs.
VLDB 2011
pdf,
Project page
-
[Le Qi]
Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Jun Rao, Eugene J. Shekita, Yuanyuan Tian. A comparison of join algorithms for log processing in MapReduce. SIGMOD 2010
pdf
-
[Yao Rong]
Query Optimization for Massively Parallel Data Processing.
Sai Wu, Feng Li, Sharad Mehrotra,
Beng Chin Ooi. SOCC 2011
pdf,
Project page
-
[Jiawei Shi]
ReStore: Reusing Results of MapReduce Jobs.
Iman Elghandour, Ashraf Aboulnaga, VLDB 2012
HTML
-
[Yuvraj Singh]
Iterative processing extensions to MapReduce.
-
[Shiyuan Wang]
YSmart: An SQL-to-MapReduce Translator
Project page
-
[Tianxu Wang]
Statistics for data stored in parallel systems
Data co-location, Compression, and Serialization
-
[Yinan Xie]
Data serialization formats
-
Mohamed Y. Eltabakh, Yuanyuan Tian, Fatma Ozcan, Rainer Gemulla, Aljoscha Krettek, John McPherson: CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop. VLDB 2011
pdf
-
[Hang Yin]
Data compression
Adaptive Execution
-
[Ran Zhang]
-
Rares Vernica, Andrey Balmin, Kevin S. Beyer, Vuk Ercegovac: Adaptive MapReduce using situation-aware mappers. EDBT 2012
pdf
-
Re-optimizing Data-Parallel Computing.
Sameer Agarwal, Srikanth Kandula, Nicolas Bruno,
Ming-Chuan Wu, Ion Stoica, Jingren Zhou. NSDI 2012
pdf,
slides and video
Hadoop Schedulers
-
[Donghe Zhao]
Overview,
FIFO scheduler,
Fair share scheduler,
Capacity Scheduler,
Dynamic proportional sharing
MapReduce on Multicore and GPU
-
[Yifei Ding]
Phoenix and extensions
-
[Yifei Ding]
Metis
-
[Xi He]
Mars: A MapReduce Framework on Graphics Processors
Hadoop-Database Connectors
-
[Chenbo Zhu]
-
Oracle Big Data Connectors
-
Fatma Ozcan, David Hoa, Kevin S. Beyer, Andrey Balmin, Chuan Jie Liu, Yu Li: Emerging trends in the enterprise data analytics: connecting Hadoop and DB2 warehouse. SIGMOD 2011
HTML
-
Sqoop
-
HiHo
Improving on HDFS
-
Real-time Processing
-
[Yuzhang Han]
Twitter's Storm
-
[Yuzhang Han]
WalmartLabs' Muppet
-