In Fall 2023, we are offering a course on optimization targeted at applications in machine learning and data science. The target audience for this course is master's and Ph.D. students in computer science, mathematics, data science, and electrical engineering.

**Time:** Mondays & Wednesdays, 2:30 pm - 3:45 pm
**Location:** Atkins 126

Optimization formulations and algorithms play a central role in modern data science, statistics, and machine learning. This graduate-level course is an introduction to the basics of continuous optimization, with an emphasis on techniques that are most relevant to data science and machine learning and are amenable to large-scale implementation. We also discuss basic convergence properties and computational aspects of the relevant optimization techniques. The relationship and applications of optimization to common machine learning models, such as (generalized) principal component analysis, sparse and logistic regression, regularized empirical risk minimization, and the training of deep neural networks, are another main focus of this course. In addition to theory and algorithms, common optimization modeling software and computer solvers will be introduced to students.

- Learn how to apply optimization algorithms to fit, train and evaluate machine learning models
- Understand the fundamental statistical and computational challenges in the models that motivate optimization
- Identify and understand the differences between linear, convex, non-convex, smooth and non-smooth optimization problems to select appropriate solution algorithms
- Develop a solid understanding of the basics of continuous optimization
- Understand the trade-offs and limitations of different numerical optimization methods
- Learn how to work with optimization modeling software and solvers
- Understand research papers at the intersection of optimization and data science/machine learning
- Gain practical experience implementing state-of-the-art customized optimization algorithms

**Data Models and Applications in Machine Learning**

- Unsupervised Learning: Principal Component Analysis (w/ review of some linear algebra)
- Supervised Learning: Logistic Regression, Sparse Regression, Matrix Completion, Recommender Systems, Computer Vision
- Training of Deep Neural Networks
- Optimal Transport: Image Matching, Shape Registration
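As a minimal taste of the first topic above, here is a sketch of principal component analysis via the SVD of the centered data matrix, in the spirit of the linear-algebra review; the function name and interface are illustrative, not part of the course materials:

```python
import numpy as np

def pca(X, k):
    """Top-k principal components of X (rows = samples) via the SVD."""
    Xc = X - X.mean(axis=0)                           # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False) # thin SVD of centered data
    components = Vt[:k]                               # top-k principal directions
    scores = Xc @ components.T                        # low-dimensional representation
    return components, scores

# usage: project 100 points in R^5 onto their top 2 principal directions
X = np.random.default_rng(0).normal(size=(100, 5))
comps, Z = pca(X, 2)
```

The principal directions returned are orthonormal rows of `Vt`, so projecting onto them preserves as much variance as any rank-2 linear map.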

**Algorithms and Theory of Continuous Optimization**

- Basics of Convex Analysis & Optimization
- Multidimensional Calculus: Gradients, Optimality Conditions for Unconstrained Optimization
- Gradient Methods
- Autodifferentiation and Backpropagation
- Accelerated Methods
- Coordinate Descent Methods
- Stochastic Gradient Descent
- Subgradient Methods
- Proximal Gradient Methods
- Natural Gradient Methods
- Newton/Quasi-Newton Methods
- Frank-Wolfe
- Cubic Regularized Newton’s Method
- Minimax Optimization
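To connect the two lists above, a minimal sketch of plain gradient descent applied to logistic regression (one of the simplest pairings of an algorithm from the second list with a model from the first); all names and parameter choices here are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd(X, y, step=0.1, iters=500):
    """Gradient descent on the averaged logistic loss for labels y in {0, 1}."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y) / n  # gradient of the mean log-loss
        w -= step * grad                       # fixed-step gradient update
    return w

# usage on a toy linearly separable problem
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w = logistic_gd(X, y)
acc = np.mean((sigmoid(X @ w) > 0.5) == y)
```

The logistic loss is smooth and convex, so this fixed-step scheme converges for a sufficiently small step size; the course covers when and how fast, and how accelerated, stochastic, and second-order variants improve on it.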