Course Information

  • Class Meeting: Wednesdays & Fridays, 3:30 – 4:45 PM ET
  • Instructor: Kristin Stephens-Martinez
  • Graduate TAs: Han Gong, Chang Xu, Yunzhou (David) Liu
  • Undergraduate TAs: Trailokaya Bajgain, Yume Choi, Neel Gajjar, Enoch Kuan, Jabari Kwesi, Alexandra Medow, Leah Okamura, Bryan Pan, Sona Suryadevara, Shari Tian, Angela Wang
  • Course Box Folder: All course material will be made available in the course Box folder
  • Course Forum: Ed Discussion (accessible via Sakai)
  • Course Grading: Gradescope (accessible via Sakai)
  • Zoom Link: Accessible via Sakai
  • Course Videos/Streaming: Panopto folder

Course Description

Data is the new currency. In every walk of life, people leave digital traces, which are stored and analyzed at both individual and population levels, by businesses for improving products and services, by governments for policy-making and national security, and by scientists for advancing the frontiers of human knowledge.

This course serves as an introduction to various aspects of working with data (acquisition, integration, querying, analysis, and visualization) and to data of different types, from unstructured text to structured databases. Through lectures and hands-on labs, the course covers both fundamental concepts and computational tools for working with data and applies them to real datasets in a capstone team project.

This course is open to students from both inside and outside computer science. Dealing with data requires more than just computer programming: What do we know about the processes underlying the data? What are the interesting questions to ask about data? What practical impacts can arise from the data? What constitutes ethical use? Therefore, we also welcome students with analytical backgrounds (e.g., statistics, math) or knowledge in fields that would benefit from data analysis (e.g., social and life sciences, public policy).

Prerequisites

This course requires basic knowledge of programming (the equivalent of CompSci 101) and statistics. Additionally, each student should have taken at least one of the following (or their equivalent):

  • a 200-level (or above) computer science course;
  • a 100-level (or above) statistics course;
  • a 200-level (or above) math course.

If the prerequisites are not met, students must obtain the consent of the instructor to enroll.

If you have no programming background and want an introductory programming experience that focuses on data science, you should consider taking CompSci 116: Foundations of Data Science.