Course Syllabus

 

Date Lecture Topic Relevant Resources
January 6 Course overview and plan (Minin) lecture1_introduction.pdf
January 8 Dusting off your databases (Li) lecture02-li (pptx)
January 13 Data wrangling concepts and issues lecture03-li (pptx)
January 15 Wrangling with Pandas and Dataframes I lecuture04-li (ipynb), Files
January 20
January 22 Wrangling with Pandas and Dataframes II lecuture05-li (ipynb), Files (ditto)
January 27 Data analytics using GUI-based workflows lecture06-li
January 29 Postgres, Twitter, and Tweepy lecture07-li (ipynb)
February 3 Exploratory data analysis and data visualization I lecture8_dataviz.pdf
February 5 Exploratory data analysis and data visualization II lecture9_dataviz.pdf
February 10 Clustering clustering_demo-1.ipynb, iris.csv
February 12 Clustering and PCA housing_data.csv, pca_demo.ipynb, ISLR_unsupervized_learning.pdf
February 17 no class
February 19 Supervised learning and regression regression_demo.ipynb, ISLR_regression_classification.pdf
February 24 Resampling methods ISLR_resampling.pdf
February 26 Project idea meetings
March 2 Project planning meetings
March 4 Project planning meetings
March 9 Oral project proposal meetings
March 11 Oral project proposal presentations

Assignments, Projects, and Grading

Winter Grading Criteria (for 170A)

Homework: 40%
Project proposal: 50%
Class participation: 10%

Late homework will not be graded. Please submit whatever you have completed by the homework deadline.

A single grade will be assigned at the end of Spring quarter for this class, with 50% weight on the Winter grade and 50% on the Spring grade.
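
As a rough illustration of how these weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the percentages are applied as a simple weighted average; the function names and the sample scores are hypothetical and are not part of any official grading procedure.

```python
# Winter (170A) weights as listed in the grading criteria above.
WINTER_WEIGHTS = {
    "homework": 0.40,
    "project_proposal": 0.50,
    "participation": 0.10,
}


def winter_grade(scores):
    """Weighted average of the Winter components (each assumed to be 0-100)."""
    return sum(WINTER_WEIGHTS[name] * scores[name] for name in WINTER_WEIGHTS)


def final_grade(winter, spring):
    """Single end-of-Spring grade: 50% Winter grade, 50% Spring grade."""
    return 0.5 * winter + 0.5 * spring


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    sample = {"homework": 90, "project_proposal": 80, "participation": 100}
    winter = winter_grade(sample)
    print(f"Winter grade: {winter:.1f}")
    print(f"Final grade with a Spring grade of 90: {final_grade(winter, 90):.1f}")
```

Under these assumptions, the project proposal alone accounts for half of the Winter grade, and therefore a quarter of the final grade.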

Homework and Class Participation

The first quarter will involve a mix of lectures and homework assignments intended to dust off, sharpen, or introduce the skills, tools, and techniques that you will need to successfully execute your course project. Since you are now seniors, and this is your Data Science grand finale, individual initiative and engagement will be expected of all students. The homework assignments may be "looser" than what you are used to: you will have to seek out some of the information needed to complete them and make your own choices about how to attack some of the challenges. In other words, spoon-feeding will be kept to a minimum. The lectures will aim for interactivity, and class participation will be encouraged (and in fact expected).

Academic Honesty Policy

Students will be expected to adhere to the UCI and ICS Academic Honesty policies (see http://www.editor.uci.edu/catalogue/appx/appx.2.htm#academic and http://www.ics.uci.edu/ugrad/policies/index.php#academic_honesty for details). Any student found to be involved in cheating, or in aiding others in doing so, will be academically prosecuted to the maximum extent possible, which means that you could fail this course in its entirety. (Ask around - it's happened.) Just say no to cheating!


Software Platform(s)

This course will make use of the Python ecosystem, including the Python language, various Python packages/tools for data analysis and machine learning, Jupyter notebooks, and open source databases (PostgreSQL). For convenience and package completeness, students are advised to download the most recent Anaconda distribution of Python and friends (https://www.anaconda.com/download/) and the most recent EDB distribution of PostgreSQL (https://www.enterprisedb.com/downloads/postgres-postgresql-downloads).
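
Once Anaconda is installed, a quick check like the following can confirm that the main packages are importable. This is only a sketch: pandas and tweepy appear in the lecture schedule, but the rest of the package list (numpy, sklearn, matplotlib, and psycopg2 as a PostgreSQL driver) is an assumption about what the course will use, and some of these may need to be installed separately.

```python
import importlib.util

# Hypothetical package list. pandas and tweepy are named in the schedule;
# the other entries are assumptions and may differ from what the course uses.
PACKAGES = ["pandas", "tweepy", "numpy", "sklearn", "matplotlib", "psycopg2"]


def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        status = "found" if found else "MISSING"
        print(f"{name:12s} {status}")
```

Any package reported as MISSING can usually be added with conda or pip from within the Anaconda environment.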
