Course Syllabus

CS273A: Introduction to Machine Learning

Prof. Alexander Ihler


How can a machine learn from experience to become better at a given task? How can we automatically extract knowledge or make sense of massive quantities of data? These are the fundamental questions of machine learning. Machine learning and data mining algorithms use techniques from statistics, optimization, and computer science to create automated systems that can sift through large volumes of data at high speed and make predictions or decisions without human intervention.

Machine learning as a field is now incredibly pervasive, with applications ranging from the web (search, advertisements, and suggestions) to national security, and from analyzing biochemical interactions to modeling traffic and emissions to astrophysics. Perhaps most famously, the $1M Netflix Prize stirred up interest in learning algorithms among professionals, students, and hobbyists alike; now, websites like Kaggle host regular open competitions on many companies' data.

This class will familiarize you with a broad cross-section of models and algorithms for machine learning, and prepare you for research or industry application of machine learning techniques.


Background

We will assume basic familiarity with the concepts of probability and linear algebra. Some programming will be required; we will primarily use Python, using the libraries "numpy" and "matplotlib", as well as course code.


Textbook and Reading

There is no required textbook for the class. However, useful books on the subject for supplementary reading include Murphy, "Machine Learning: A Probabilistic Perspective"; Duda, Hart, and Stork, "Pattern Classification"; and Hastie, Tibshirani, and Friedman, "The Elements of Statistical Learning".


Python

This year, we will be using Python for most of the programming in the course. I strongly suggest installing the "full SciPy stack", which includes NumPy, Matplotlib, SciPy, and the IPython notebook for interactive work and visualization; see http://www.scipy.org/install.html for installation instructions.
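
Once installed, a quick sanity check is to import each package and print its version. This is just an illustrative sketch, not part of the official course materials; the version numbers you see will differ from machine to machine:

    # Quick check that the SciPy stack is installed and importable.
    # (Illustrative sketch only; your version numbers will differ.)
    from __future__ import print_function  # keeps the output clean under Python 2.7

    import numpy
    import scipy
    import matplotlib

    print("numpy:", numpy.__version__)
    print("scipy:", scipy.__version__)
    print("matplotlib:", matplotlib.__version__)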

Here is a simple introduction to NumPy and plotting for the course; you can also find complete documentation for these libraries, as well as many other tutorials and guides, online.
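
To give a rough sense of the style of NumPy and Matplotlib code the course assumes (a minimal sketch; the data below is made up purely for illustration):

    # Minimal NumPy + Matplotlib example: generate noisy linear data and plot it.
    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 1, 50)                        # 50 evenly spaced points in [0, 1]
    y = 2.0 * x + 0.5 + 0.1 * np.random.randn(50)    # y = 2x + 0.5 plus Gaussian noise

    plt.plot(x, y, 'bo', label='noisy data')         # data points as blue dots
    plt.plot(x, 2.0 * x + 0.5, 'r-', label='true line')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.legend()
    plt.show()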

I usually use Python 2.7 by default, but try to write code in a Python 3-compatible way; if you find that parts of the course code do not work under more recent versions of Python, please let me know and I will try to fix the issue.
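
If you want your own code to run under both Python 2.7 and Python 3, one common approach (a sketch, not a course requirement) is to start each file with __future__ imports so that printing and division behave the same way in both versions:

    # Make Python 2.7 behave like Python 3 for printing and division.
    from __future__ import print_function, division

    print("7 / 2 =", 7 / 2)      # true division: 3.5 in both 2.7 and 3.x
    print("7 // 2 =", 7 // 2)    # floor division: 3 in both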


Email & Discussions

We will use a course Piazza page for questions & discussion.  Please post your questions there; you can post privately if you prefer, or if (for example) your question needs to reveal your solution to a homework problem.  I prefer to use Piazza for all class contact, since it allows responses from me, the TA, or (for public posts) fellow students, which should get you answers more quickly.


Grading

  • Homeworks: 25% (5 total)
  • Project: 15% (groups of 2-3)
  • Midterm exam: 25% (in class)
  • Final exam: 35%

Comprehensive Exam

If you wish to be considered for the comprehensive exam requirement, please let me know by answering "yes" on this "assignment".  If not, there is no need to do anything.


Schedule

Week  Date        Lecture Topic                                          Slides & Reading

1     Thu 22 Sep  Class setup; Concepts; Bayes optimality                Slides; Python
2     Tue 27 Sep  Bayes classifiers; Naive Bayes                         Slides
      Thu 29 Sep  Nearest neighbor models                                Slides
3     Tue 4 Oct   Linear regression                                      Slides
      Thu 6 Oct   Linear classifiers; perceptrons, logistic regression   Slides
4     Tue 11 Oct  Support Vector Machines                                Slides
      Thu 13 Oct  VC Dimension                                           Slides
5     Tue 18 Oct  Neural Networks                                        Slides
      Thu 20 Oct  (catch-up)
6     Tue 25 Oct  (catch-up)
      Thu 27 Oct  Review
7     Tue 1 Nov   Midterm Exam                                           All of the above
      Thu 3 Nov   Decision Trees                                         Slides
8     Tue 8 Nov   Ensembles: Bagging, Boosting                           Slides
      Thu 10 Nov  Clustering; k-means, EM                                Slides
9     Tue 15 Nov  Latent space models; SVD                               Slides
      Thu 17 Nov  Collaborative filtering & recommender systems          Slides
10    Tue 22 Nov  Markov models                                          Slides
      Thu 24 Nov  Thanksgiving holiday
      Fri 25 Nov  Thanksgiving holiday
11    Tue 29 Nov  Markov decision processes                              Slides
      Thu 1 Dec   Final Exam Review                                      All of the above
12    Wed 7 Dec   Final project deadline
      Fri 9 Dec   Final exam, 10:30am - 12:30pm                          All of the above


Academic Honesty

Academic dishonesty is unacceptable and will not be tolerated at the University of California, Irvine. It is the responsibility of each student to be familiar with UCI's current academic honesty policies. Please take the time to read the current UCI Academic Senate Policy On Academic Integrity and the ICS School Policy on Academic Honesty.

The policies in these documents will be adhered to scrupulously. Any student who engages in cheating, forgery, dishonest conduct, plagiarism, or collusion in dishonest activities, will receive an academic evaluation of "F" for the entire course, with a letter of explanation to the student's permanent file. The ICS Student Affairs Office will be involved at every step of the process.  We seek to create a level playing field for all students.
