CS 769 - Optimization in Machine Learning

Course content
  • Continuous and Discrete Optimization are two important pillars of machine learning. Continuous optimization typically arises when learning model parameters, while discrete optimization problems appear in inference and in auxiliary tasks such as feature selection, data subset selection, and model compression.
  • In the first part of this course, we will cover continuous optimization with applications in machine learning.
  • Topics to discuss will include Convexity, Gradient Methods, Proximal Algorithms, Stochastic and Online Variants of these methods, Coordinate Descent Methods, Subgradient Methods, Non-Convex Optimization, Frank-Wolfe, Accelerated Methods, Lagrange and Fenchel Duality, Second-Order Methods, Quasi-Newton Methods, and Gradient-Free and Zero-Order Optimization.
  • We will also cover some advanced topics including non-convex optimization, alternating minimization algorithms and parallel/distributed variants of algorithms.
  • We will ground all these algorithms with applications and loss functions in machine learning, starting from logistic regression, linear regression, SVMs, and matrix factorization, right up to deep neural networks (see the gradient-descent sketch following this list).
  • In summary, in Continuous Optimization we will cover topics such as Basics of Continuous Optimization, Convexity, Gradient Descent, Projected/Proximal GD, Subgradient Descent, Accelerated Gradient Descent, Newton and Quasi-Newton Methods, Duality (Lagrange and Fenchel), Coordinate Descent, Frank-Wolfe, and Continuous Optimization in Practice.
  • The second part of this course will cover the fundamentals of discrete optimization.
  • We will start with basic forms of combinatorial optimization (knapsack, s-t cuts/paths, matchings and matroids) and then discuss submodular functions and their applications.
  • We will cover algorithms for submodular minimization and maximization under various combinatorial constraints (see the greedy-maximization sketch following this list).
  • We will also discuss discrete probabilistic models such as Determinantal Point Processes (DPPs).
  • We will ground all these algorithmic and theoretical results with real-world applications in feature selection, summarization and diversified search, structured prediction, data subset selection, and model compression.
  • In summary, in Discrete Optimization we will cover topics such as Linear Cost Problems with Applications in ML; Matroids; Spanning Trees; s-t Paths; s-t Cuts; Matchings; Covers (Set Covers, Vertex Covers, Edge Covers); Non-Linear Discrete Optimization; Submodular Functions and Applications in Machine Learning; Submodularity and Convexity; Submodular Minimization; Submodular Maximization; Beyond Minimization and Maximization: Other Forms of Submodular Optimization; and Discrete Optimization in Practice.
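
To make the continuous part concrete, here is a minimal sketch of batch gradient descent on the L2-regularized logistic loss, one of the first applications covered in the course. The synthetic data, step size, and iteration count below are illustrative assumptions, not course material.

    import numpy as np

    def logistic_grad(w, X, y, lam=0.1):
        # Gradient of (1/n) sum_i log(1 + exp(-y_i x_i^T w)) + (lam/2)||w||^2,
        # with labels y in {-1, +1}.
        margins = y * (X @ w)
        coeffs = -y / (1.0 + np.exp(margins))   # -y_i * sigmoid(-margin_i)
        return X.T @ coeffs / len(y) + lam * w

    def gradient_descent(X, y, step=0.5, iters=200):
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            w -= step * logistic_grad(w, X, y)  # fixed-step gradient update
        return w

    # Toy usage on synthetic data (illustrative only).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = np.sign(X @ rng.normal(size=5))
    w_hat = gradient_descent(X, y)

The same loop structure carries over to the stochastic, projected/proximal, and accelerated variants discussed in the course, with only the update step changing.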
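
For the discrete part, here is a minimal sketch of the classic greedy algorithm for monotone submodular maximization under a cardinality constraint, applied to a facility-location objective of the kind used in data subset selection and summarization. The similarity matrix and set sizes are illustrative assumptions.

    import numpy as np

    def facility_location(S, sim):
        # f(S) = sum_i max_{j in S} sim[i, j]; monotone submodular when sim >= 0.
        if not S:
            return 0.0
        return sim[:, list(S)].max(axis=1).sum()

    def greedy_max(f, ground_set, k):
        # Classic greedy: add the element with the largest marginal gain, k times;
        # achieves a (1 - 1/e) approximation for monotone submodular f.
        S = set()
        for _ in range(k):
            gains = {e: f(S | {e}) - f(S) for e in ground_set - S}
            S.add(max(gains, key=gains.get))
        return S

    # Toy usage: pick 3 representative points out of 20 (illustrative only).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 4))
    sim = X @ X.T
    sim -= sim.min()  # shift entries to be non-negative so f is monotone
    chosen = greedy_max(lambda S: facility_location(S, sim), set(range(20)), 3)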
Prerequisite : N/A
Total credits : 6 credits - Lecture
Type : Core Course
Duration : Full Semester
Name(s) of other Academic units to whom the course may be relevant : N/A