News:
- [4/13/2023] Homework 4 (handout, starter code) has been released! The assignment is due April 27.
- [3/28/2023] Homework 3 (handout, starter code) has been released! The assignment is due April 11. As with the previous homeworks, you should submit both your PDF writeup and your code on Gradescope; there will be separate assignments for each.
Some problems in computer science admit precise algorithmic solutions. Checking if someone is in a national park is, in some sense, straightforward: get the user’s location, get the boundaries of all national parks, and check if the user’s location lies within any of those boundaries.
Other problems are less straightforward.
Suppose you want your computer to determine if an image contains a bird.
To your computer, an image is just a matrix of red, green, and blue pixels.
How do you even begin to write the function is_bird(image)?
For problems like this, we turn to a powerful family of methods known as machine learning. The zen of machine learning is the following:
- I don’t know how to solve my problem.
- But I can obtain a dataset that describes what I want my computer to do.
- So, I will write a program that learns the desired behavior from the data.
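As a rough illustration of these three steps, the sketch below learns a classifier from a handful of labeled examples instead of hand-coding the decision rule. It is only a toy: the two-dimensional feature vectors, the labels, and the function names `fit_centroids` and `predict` are all hypothetical, and a nearest-centroid rule is used purely because it is short, not because it is the method this course uses for images.

```python
# Toy illustration of "learn the behavior from data".
# All data, labels, and function names here are hypothetical.

def fit_centroids(examples):
    """Compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))


# Hypothetical 2-D features standing in for real image features.
train = [
    ([0.9, 0.8], "bird"),
    ([0.8, 0.9], "bird"),
    ([0.1, 0.2], "not bird"),
    ([0.2, 0.1], "not bird"),
]
centroids = fit_centroids(train)
print(predict(centroids, [0.7, 0.75]))
```

The program never states what makes something a bird; the rule it applies comes entirely from the labeled examples it was given.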
This class will provide a broad introduction to machine learning. We will start with supervised learning, where our goal is to learn an input-to-output mapping given a set of correct input-output pairs. Next, we will study unsupervised learning, which seeks to identify hidden structure in data. Finally, we will cover reinforcement learning, in which an agent (e.g., a robot) learns from observations it makes as it explores the world.
Course Staff
Robin Jia
Instructor
Ting-Yun (Charlotte) Chang
Teaching Assistant
Qinyuan Ye
Teaching Assistant
Aman Bansal
Course Producer
Zhihan Wang
Course Producer
Phakawat Wangkriangkri
Course Producer
Logistics
- Office hours: See calendar.
- Assignments: Assignments should be submitted through Gradescope. Feedback will also be provided on Gradescope. All enrolled students should be in Gradescope automatically; let me know if you are not!
- Discussions: We will be using Ed (sign-up link) for general course-related questions. If you have an individual matter to discuss, email me directly (please put “CSCI 467” in the subject line) or come to my office hours. For grading questions, go to the office hours of the person who graded the problem in question.
Prerequisites
- Algorithms: CSCI 270
- Linear Algebra: MATH 225
- Probability: EE 364 or MATH 407 or BUAD 310
This class will also use some basic multivariate calculus (taking partial derivatives and gradients). However, knowledge of single-variable calculus is sufficient as we will introduce the required material during class and section.
Schedule
All assignments are due by 11:59pm on the indicated date.
Date | Topic | Related Readings | Assignments |
---|---|---|---|
Tue Jan 10 | Introduction (slides pptx pdf) | PML 1 | Homework 0 released (code) |
Thu Jan 12 | Linear Regression (pdf, code) | PML 7.8, 8.2 | |
Fri Jan 13 | Section: Probability, Linear Algebra, & Calculus Review (handout) | | |
Tue Jan 17 | Featurization, Convexity, Normal Equations (pdf) | PML 2.6.3, 8.1, 11.1-11.2 | |
Thu Jan 19 | Maximum Likelihood Estimation, Logistic Regression (pdf) | PML 4.2, 10.1-10.2 | Homework 0 due |
Fri Jan 20 | Section: Python & numpy tutorial (colab) | | |
Tue Jan 24 | Softmax Regression, Second-order optimization (pdf) | PML 8.3, 10.3 | |
Thu Jan 26 | Regularization, Bias and Variance (pdf) | PML 4.5, 4.7, 11.3-11.4 | Homework 1 released (handout, starter code) |
Fri Jan 27 | Section: Homework 0 Discussion | | |
Tue Jan 31 | Generative Classifiers, Naive Bayes (pdf) | PML 9.3-9.4 | |
Thu Feb 2 | Naive Bayes continued, Nearest Neighbors (pdf) | PML 16.1, 16.3 | |
Fri Feb 3 | Section: Cross-Validation, Evaluation Metrics (pdf) | | |
Tue Feb 7 | Kernel methods (pdf) | PML 17.1 | Homework 1 due |
Thu Feb 9 | Kernels Continued, Project Discussion (pdf) | PML 4.3, 17.3 | |
Fri Feb 10 | Section: Matrices & Eigenvectors, Course Review | | |
Tue Feb 14 | Support Vector Machines (pdf), ML Libraries (demo) | PML 5.4 | |
Thu Feb 16 | Introduction to Neural Networks, Dropout, Early Stopping (pdf) | PML 13.1-13.3 | Project Proposal due, Homework 2 released (handout, starter code) |
Fri Feb 17 | Section: Homework 1 Discussion | | |
Tue Feb 21 | Backpropagation (pdf, code part 1 part 2 part 3) | PML 13.4-13.5 | |
Thu Feb 23 | Convolutional Neural Networks (pdf) | PML 14.1-14.2 | |
Fri Feb 24 | Section: PyTorch tutorial (colab) | | |
Tue Feb 28 | Recurrent Neural Networks, Attention (pdf) | PML 15.1-15.2 | |
Thu Mar 2 | Transformers, Pretraining (pdf) | PML 15.4-15.7 | Homework 2 due |
Fri Mar 3 | Section: Midterm preparation (slides) | | |
Tue Mar 7 | Decision Trees, Ensembling (pdf) | PML 18.1-18.5 | |
Thu Mar 9 | In-class Midterm Exam | | |
Mar 10-17 | No class or section (Spring break) | | |
Tue Mar 21 | k-Means Clustering, Start of Gaussian Mixture Models (pdf) | PML 21.3 | |
Thu Mar 23 | Gaussian Mixture Models, Expectation Maximization (pdf) | PML 21.4, PML2 8.1-8.2 | Project Midterm Report due |
Fri Mar 24 | Section: Midterm Exam Discussion | | |
Tue Mar 28 | Inference in Hidden Markov Models (pdf) | PML2 29.1-29.4 | Homework 3 released (handout, starter code) |
Thu Mar 30 | Learning HMMs, Dimensionality Reduction, Principal Component Analysis (pdf) | PML 20.1, 20.4 | |
Fri Mar 31 | Section: Optimization strategies for neural networks (slides) | | |
Tue Apr 4 | Embedding models, Word Vectors (pca pdf, wordvec slides) | PML 20.5 | |
Thu Apr 6 | Multi-Armed Bandits (pdf) | PML2 34.1-34.4 | |
Fri Apr 7 | Section: Practical guide to pretrained language models (slides, code) | | |
Tue Apr 11 | Markov Decision Processes, Reinforcement Learning (pdf) | PML2 34.5-34.6, 35.1, 35.4 | Homework 3 due |
Thu Apr 13 | Q-Learning with Function Approximation, Policy Gradient (pdf) | PML2 35.2-35.3 | Homework 4 released (handout, starter code) |
Fri Apr 14 | Section: Practical guide to computer vision models (slides, code) | | |
Tue Apr 18 | Robustness, Adversarial Examples, Spurious Correlations (pdf) | PML2 19.1-19.8 | |
Thu Apr 20 | Fairness in Machine Learning (pdf) | FAML 1-4 | |
Fri Apr 21 | Section: Final Exam preparation (slides) | | |
Tue Apr 25 | How does ChatGPT work? (pdf) | | |
Thu Apr 27 | Conclusion (slides) | | Homework 4 due |
Fri Apr 28 | No section (End of class) | | |
Thu May 4 | Final Exam, 2-4pm | | Project Final Report due May 9 |
Grading
Grades will be based on homework assignments (40%), a class project (20%), and two exams (40%).
Homework Assignments (40% total):
- Homework 0: 4%
- Homeworks 1-4: 9% each
Final Project (20% total). The final project will proceed in three stages:
- Project proposal: 2%
- Midterm report: 3%
- Final project report: 15%
Exams (40% total):
- In-class midterm: 15%
- Final exam (cumulative): 25%
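For anyone who wants to check the arithmetic, here is a small sketch of how the percentages above add up. The weights are copied from the breakdown; the component names, the `course_score` helper, and the assumption that component scores are simply combined as a weighted sum (with no curve or other adjustment) are illustrative only and are not stated in the syllabus.

```python
# Weights copied from the grading breakdown, as fractions of the course grade.
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "homework_0": 0.04,
    "homework_1": 0.09,
    "homework_2": 0.09,
    "homework_3": 0.09,
    "homework_4": 0.09,
    "project_proposal": 0.02,
    "project_midterm_report": 0.03,
    "project_final_report": 0.15,
    "midterm_exam": 0.15,
    "final_exam": 0.25,
}


def course_score(scores):
    """Weighted total in percent, given each component's score as a fraction in [0, 1].

    Assumption: components are combined as a plain weighted sum.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return 100 * sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Sanity check: the weights cover the whole course grade.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
```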
Late days
You have 6 late days you may use on any assignment excluding the Project Final Report. Each late day allows you to submit the assignment 24 hours later than the original deadline. You may use a maximum of 3 late days per assignment. If you are working in a group for the project, submitting the project proposal or midterm report one day late means that each member of the group spends a late day. We do not allow use of late days for the final project report because we must grade the projects in time to submit final course grades.
If you have used up all your late days and submit an assignment late, you will lose 10% of your grade on that assignment for each day late. We will not accept any assignments more than 3 days late.
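The late-day rules can be read as a small procedure, sketched below. This is one possible reading rather than the official policy, and it rests on several assumptions the text does not settle: partial days are rounded up to whole days, any remaining late days are spent before a penalty applies, and "10% of your grade" means the assignment score is scaled by 10% per penalized day. The function name `apply_late_policy` is hypothetical, and the sketch does not cover the Project Final Report, which cannot use late days.

```python
import math

MAX_DAYS_LATE = 3        # assignments more than 3 days late are not accepted
PENALTY_PER_DAY = 0.10   # 10% per day once late days are used up


def apply_late_policy(hours_late, late_days_left, raw_score):
    """Return (adjusted_score, late_days_left) for one late-day-eligible assignment.

    Assumptions (not stated in the policy): partial days round up, and
    remaining late days are used before any penalty is applied.
    """
    if hours_late <= 0:
        return raw_score, late_days_left
    days_late = math.ceil(hours_late / 24)
    if days_late > MAX_DAYS_LATE:
        raise ValueError("assignments more than 3 days late are not accepted")
    free_days = min(days_late, late_days_left)
    penalized_days = days_late - free_days
    adjusted = raw_score * (1 - PENALTY_PER_DAY * penalized_days)
    return adjusted, late_days_left - free_days


# Example: 30 hours late with all 6 late days available uses 2 late days.
print(apply_late_policy(30, 6, 90))
```

Under these assumptions, a student with no late days left who submits two days late keeps 80% of the score they would otherwise have earned.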
Final project
The final project can be done individually or in groups of up to 3. This is your chance to freely explore machine learning methods and how they can be applied to a task of your choice. You will also learn about best practices for developing machine learning methods: inspecting your data, establishing baselines, and analyzing your errors.
More information about the final project will be released at a later date. A list of example projects is now available here.
Resources
While there is no required textbook for this class, you may find the following useful:
- Probabilistic Machine Learning: An Introduction (PML) and Probabilistic Machine Learning: Advanced Topics (PML2) by Kevin Murphy. You may also find PML Chapters 2-3 and 7 useful for reviewing prerequisites.
- The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.
- Patterns, Predictions, and Actions: A Story about Machine Learning by Moritz Hardt and Benjamin Recht.
- Fairness and Machine Learning: Limitations and Opportunities (FAML) by Solon Barocas, Moritz Hardt, and Arvind Narayanan.
To review mathematical background material, you may also find the following useful:
- Probability: Introduction to Probability by Joseph Blitzstein and Jessica Hwang. Most relevant reading: Chapters 1-5, 7, 9-10.
- Linear Algebra: Introduction to Applied Linear Algebra by Stephen Boyd and Lieven Vandenberghe. Most relevant reading: Chapters 1-3, 5-8, 10-11. (Chapters 4 and 12-14 overlap with content for this class.)
- Multivariate Calculus: Oliver Knill’s lecture notes.
Other Notes
Collaboration policy and academic integrity: Our goal is to maintain an optimal learning environment. You may discuss the homework problems at a high level with other students, but you should not look at another student’s solutions. Trying to find solutions online or from any other source for any homework or project is prohibited; it will result in a grade of zero and will be reported. Using AI tools to automatically generate solutions to written or programming problems is also prohibited. To prevent future plagiarism, uploading any material from the course (your solutions, quizzes, etc.) to the internet is prohibited, and any violations will also be reported. Please be considerate, and help us help everyone get the most out of this course.
Please remember the expectations set forth in the USC Student Handbook. General principles of academic honesty include the concept of respect for the intellectual property of others, the expectation that individual work will be submitted unless otherwise allowed by an instructor, and the obligations both to protect one’s own academic work from misuse by others as well as to avoid using another’s work as one’s own. All students are expected to understand and abide by these principles. Suspicion of academic dishonesty may lead to a referral to the Office of Academic Integrity for further review.
Students with disabilities: Any student requesting academic accommodations based on a disability is required to register with Disability Services and Programs (DSP) each semester. A letter of verification for approved accommodations can be obtained from DSP. Please be sure the letter is delivered to the instructor as early in the semester as possible.