Syllabus

Course Overview

This course exposes non-CS students to basic data science and data processing concepts and provides them with toolkits for conducting machine learning-based analysis. The course is built around Texera, a powerful GUI-based data analysis platform.

Adopting workflows in data science represents a new trend toward optimizing project efficiency and enhancing collaboration. These structured processes streamline data analysis and modeling, ensuring consistency and reproducibility in managing complex tasks. By embracing workflows, data scientists can tackle the challenges of data processing and model development more effectively, marking a significant shift towards more systematic and reliable data science practices. 

The course is intended for two purposes:

  1. It introduces students to modern data analysis techniques including data cleaning, data wrangling, data visualization, and machine learning,
  2. It teaches students how to use these tools and technologies to conduct real-world analysis projects. 

[Figure: Texera system screenshot]


Course Information

Lecture time: M/W 11:00 am - 12:20 pm in SST 220A, in-person lecture only.

Staff | Office Hours | Email
Instructor: Yicong Huang | Fri 4-5pm, ICS 458A | yicongh1 AT uci.edu
Assistant: Shengquan Ni | Tue 1-2pm, ICS 458A | shengqun AT uci.edu
Guest Lecturer: Chen Li | N/A | chenli AT ics.uci.edu


Lectures

Lecture ID | Day | Date | Topic | Resources
1 | Mon | 04/01/2024 | Course Introduction, Syllabus, Introduction to data science | Lecture 1 Introduction.pdf
2 | Wed | 04/03/2024 | Relational Data Model, Data Schema | Lecture 2 Data Model, Schema.pdf
3 | Mon | 04/08/2024 | Data Cleaning, Sources, Filter, Projection, Limit, Distinct, Set Operators | Lecture 3 Data Cleaning, Simple Transformations.pdf
4 | Wed | 04/10/2024 | Exploratory Analysis, Aggregate, Entity-Relation Data Model, Join, Cartesian Product, Sort | Lecture 4 Aggregate, Join, Sort.pdf
5 | Mon | 04/15/2024 | Exploratory Analysis and Data Visualizations, Quiz 1 | Lecture 5 Data Visualizations.pdf
6 | Wed | 04/17/2024 | Python and Python UDF Operator | Lecture 6 Python and Python UDF.pdf
7 | Mon | 04/22/2024 | Python UDF Tuple API | Lecture 7 Python UDF Tuple API.pdf
8 | Wed | 04/24/2024 | Python UDF Tuple API, Timeseries Analysis | Lecture 8 Timeseries.pdf
9 | Mon | 04/29/2024 | Image Processing | Lecture 9 Image Processing.pdf
10 | Wed | 05/01/2024 | Image Processing (cont.), Visualization, Quiz 2 | Lecture 10 Image Processing (cont.).pdf
11 | Mon | 05/06/2024 | Machine Learning Intro, Supervised Learning, Regression, Linear Regression | Lecture 11 Machine Learning Intro, Supervised Learning, Regression.pdf
12 | Wed | 05/08/2024 | Classification, Logistic Regression, KNN, Decision Tree, Random Forest | Lecture 12 Classification.pdf
13 | Mon | 05/13/2024 | Unsupervised Learning, Clustering, Python UDF Table API, Customized Training | Lecture 13 Unsupervised Learning, Clustering, Table API, Customized Training.pdf
14 | Wed | 05/15/2024 | Natural Language Processing, Sentiment Analysis, Quiz 3 | Lecture 14 Natural Language Processing, Sentiment Analysis.pdf
15 | Mon | 05/20/2024 | Workshop, Case Study | Lecture 15 Capstone Project.pdf
16 | Wed | 05/22/2024 | Workshop | -
- | Mon | 05/27/2024 | Memorial Day. No Lecture | -
17 | Wed | 05/29/2024 | Workshop | -
18 | Mon | 06/03/2024 | Workshop, Quiz 4 | -
19 | Wed | 06/05/2024 | Capstone Project Showcase | -


Assignments

Assignment | Topic | Days | Due Date | Weight (of the entire course)
1 | Tweet Analysis | 13 | Apr 14, 8 PM | 10%
2 | Timeseries and Image Data Analysis | 13 | Apr 28, 8 PM (Task 3 due on May 5, 8 PM) | 10%
3 | Data Analysis with ML | 11 | May 19, 8 PM | 10%

Capstone Project

Milestone | Days | Due Date | Weight (of the entire course)
Project Proposal | 5 | 05/23 | 5%
Project Capstone Showcase | 17 | 06/05 | 35%

Quizzes

We will not have midterms or finals. Instead, we will do in-class quizzes during lectures. We will announce quizzes ahead of time.

Quiz | Date | Weight (of the entire course; the lowest one will be dropped)
1 | 04/15 | 10%
2 | 05/01 | 10%
3 | 05/15 | 10%
4 | 06/03 | 10%

Survey

You will participate in a survey evaluating the experience of using workflows for data science projects. The participation will account for 3% of your course grade, regardless of your feedback.


Texera Course Service

In this course, we will use the Texera system extensively to illustrate concepts, carry out assignments and projects, and facilitate communication.

 

Online Texera Service

Our team maintains a Texera live service at https://texera-ics80.ics.uci.edu. Please log in with your UCI email using the Google Login feature.

The service will be available to you every day from 9 am to 9 pm. We will conduct maintenance outside this window. (Note: this means your execution may be force-killed after 9 pm.) You can check the service dashboard for the latest status of the service.

The Texera staff maintains this service and it is shared across the entire course, so we urge you to use it responsibly. Do not overwhelm it with large datasets or run tasks that take a long time (i.e., more than 1 hour). Your activities on the platform will be monitored. Activities including but not limited to crawling, cyber-attacking, and crypto-mining are strictly forbidden.

Texera Get Started Guide.

Account Activation Notice: After logging in with your UCI account, you might find that you are unable to access the dashboard. This is because your account requires manual activation by our team. Please allow for a delay of up to one day for this activation process to be completed.

 

Online Discussion Forum

Texera embeds a discussion forum that you can access within Texera (a backup independent link is also available). The forum is designed to get you help quickly and efficiently from classmates, the TA(s), and the instructor. Rather than emailing questions to the teaching staff, please post them on the forum. Optionally, you can make a private post. Please ensure that the content you post is meaningful and adheres to ethical standards.


Grading Breakdown

Assignments: 30%
Quizzes: 30%
Capstone Project: 40%
Participation in Survey (Optional): extra 3%
Participation in EEE Class Evaluation (Optional): extra 1%

If you disagree with the grading of any graded project or exam, you can discuss it with us within one week after it is returned. After that, all grades will be finalized.
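As an illustration of how the weights listed in this syllabus combine, here is a minimal Python sketch. It is not an official grade calculator: the function name `course_grade` and the convention of passing each score as a fraction between 0 and 1 are assumptions made for this example, and the survey and class evaluation are treated as extra credit, as listed in the Grading Breakdown.

```python
def course_grade(assignments, quizzes, proposal, showcase,
                 survey=False, evaluation=False):
    """Combine scores (each a fraction in [0, 1]) into a course percentage.

    Hypothetical helper for illustration only; the weights come from this syllabus.
    """
    if len(assignments) != 3 or len(quizzes) != 4:
        raise ValueError("expected 3 assignment scores and 4 quiz scores")

    # Three assignments, 10% each (30% total).
    assignment_pts = sum(10 * s for s in assignments)

    # Four quizzes, 10% each, with the lowest one dropped (30% total).
    kept_quizzes = sorted(quizzes)[1:]
    quiz_pts = sum(10 * s for s in kept_quizzes)

    # Capstone: 5% proposal plus 35% showcase (40% total).
    capstone_pts = 5 * proposal + 35 * showcase

    # Optional extra credit: 3% for the survey, 1% for the class evaluation.
    extra_pts = (3 if survey else 0) + (1 if evaluation else 0)

    return assignment_pts + quiz_pts + capstone_pts + extra_pts


if __name__ == "__main__":
    # Full marks everywhere except one missed quiz, which is dropped.
    print(course_grade([1, 1, 1], [1, 1, 1, 0], proposal=1, showcase=1))
```

Because the lowest quiz is dropped, a student who misses one quiz but earns full marks on everything else still reaches the full 100% before extra credit.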


Prerequisites

This course is designed for individuals who may have little to no experience in coding, so there are no strict CS course prerequisites. The second half of the course will include an introduction to the programming language Python and its rich ecosystem for AI and ML. To get the most out of the course materials, we recommend taking ICS 31 and ICS 32, which provide a basic introduction to programming.


Textbooks

Many online tutorials.


Working in Teams

  • You will work individually on Assignments and Quizzes.

  • You are expected to work in a team of two on the Capstone Project. Group work will be graded on a per-group basis.


Policy on Academic Honesty

  • All students will be expected to adhere to the UCI and ICS Academic Honesty policies (see https://conduct.uci.edu/students/academic-integrity/index.php for details). Any student found to be involved in cheating or aiding others in doing so will be academically prosecuted to the maximum extent possible: that means you will fail this course. Just say no to cheating!