Non-convex Optimization for Machine Learning

Title: Non-convex Optimization for Machine Learning PDF eBook
Author: Prateek Jain
Publisher: Foundations and Trends in Machine Learning
Pages: 218
Release: 2017-12-04
Genre: Machine learning
ISBN: 9781680833683

Non-convex Optimization for Machine Learning takes an in-depth look at the basics of non-convex optimization with applications to machine learning. It introduces the rich literature in this area and equips the reader with the tools and techniques needed to apply and analyze simple but powerful procedures for non-convex problems. The monograph is as self-contained as possible without losing focus on its main topic of non-convex optimization techniques. It opens with entire chapters devoted to a tutorial-like treatment of basic concepts in convex analysis and optimization, as well as their non-convex counterparts, and concludes with four interesting applications in machine learning and signal processing, exploring how the non-convex optimization techniques introduced earlier can be used to solve them. For each topic discussed, the monograph contains exercises and figures designed to engage the reader, as well as extensive bibliographic notes pointing towards classical works and recent advances. Non-convex Optimization for Machine Learning can be used for a semester-length course on the basics of non-convex optimization with applications to machine learning; alternatively, individual portions, such as the chapter on sparse recovery or the EM algorithm, can be cherry-picked for inclusion in broader courses in machine learning, optimization, or signal processing.
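
As an illustration of the kind of non-convex procedure analyzed in monographs like this one, the sketch below shows iterative hard thresholding (IHT) for sparse linear regression. This is only a minimal numpy example; the step size, sparsity level, and toy data are illustrative assumptions, not values taken from the book.

import numpy as np

def hard_threshold(x, s):
    # Keep the s largest-magnitude entries of x and zero out the rest.
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    out[idx] = x[idx]
    return out

def iht(A, b, s, step=None, iters=200):
    # Iterative hard thresholding for min_x 0.5 * ||Ax - b||^2 s.t. ||x||_0 <= s.
    n = A.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # conservative step: 1 / Lipschitz constant
    x = np.zeros(n)
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = hard_threshold(x - step * grad, s)
    return x

# Toy usage: recover a 5-sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 400)) / np.sqrt(100)
x_true = np.zeros(400)
x_true[rng.choice(400, 5, replace=False)] = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = iht(A, b, s=5)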

First-order and Stochastic Optimization Methods for Machine Learning

Title: First-order and Stochastic Optimization Methods for Machine Learning PDF eBook
Author: Guanghui Lan
Publisher: Springer Nature
Pages: 591
Release: 2020-05-15
Genre: Mathematics
ISBN: 3030395685

This book covers not only foundational material but also the most recent progress made during the past few years on machine learning algorithms. Despite the intensive research and development in this area, there has been no systematic treatment introducing the fundamental concepts and recent progress on machine learning algorithms, especially those based on stochastic optimization methods, randomized algorithms, nonconvex optimization, distributed and online learning, and projection-free methods. This book will benefit a broad audience in the machine learning, artificial intelligence, and mathematical programming communities by presenting these recent developments in a tutorial style, starting from the basic building blocks and progressing to the most carefully designed and complicated algorithms for machine learning.
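
As a taste of the projection-free methods mentioned above, here is a minimal sketch of the Frank-Wolfe (conditional gradient) method for smooth minimization over an l1 ball. The quadratic objective, radius, and step-size schedule are illustrative assumptions, not content taken from the book.

import numpy as np

def frank_wolfe_l1(grad, x0, radius=1.0, iters=100):
    # Frank-Wolfe over the l1 ball {x : ||x||_1 <= radius}: no projection needed,
    # only a linear minimization oracle that returns a signed vertex of the ball.
    x = x0.copy()
    for t in range(iters):
        g = grad(x)
        i = np.argmax(np.abs(g))             # coordinate with largest |gradient|
        vertex = np.zeros_like(x)
        vertex[i] = -radius * np.sign(g[i])  # vertex minimizing <g, v> over the ball
        gamma = 2.0 / (t + 2)                # standard step-size schedule
        x = (1 - gamma) * x + gamma * vertex
    return x

# Toy usage: minimize 0.5 * ||Ax - b||^2 over the l1 ball of radius 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
x = frank_wolfe_l1(lambda x: A.T @ (A @ x - b), np.zeros(20))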

Sample-Efficient Nonconvex Optimization Algorithms in Machine Learning and Reinforcement Learning

Title: Sample-Efficient Nonconvex Optimization Algorithms in Machine Learning and Reinforcement Learning PDF eBook
Author: Pan Xu
Publisher:
Pages: 246
Release: 2021
Genre:
ISBN:

Machine learning and reinforcement learning have achieved tremendous success in solving problems in various real-world applications. Many modern learning problems boil down to a nonconvex optimization problem, where the objective function is the average or the expectation of some loss function over a finite or infinite dataset. Solving such nonconvex optimization problems is, in general, NP-hard, so one often tackles them through incremental steps based on the nature and the goal of the problem: finding a first-order stationary point, finding a second-order stationary point (or a local optimum), and finding a global optimum. With the size and complexity of machine learning datasets rapidly increasing, it has become a fundamental challenge to design efficient and scalable machine learning algorithms that improve accuracy while also saving computation through better sample efficiency. Although many algorithms based on stochastic gradient descent have been developed and widely studied, both theoretically and empirically, for nonconvex optimization, it has remained an open problem whether the optimal sample complexity can be achieved for finding a first-order stationary point and for finding local optima. In this thesis, we start with the stochastic nested variance reduced gradient (SNVRG) algorithm, which builds on stochastic gradient descent methods and variance reduction techniques. We prove that SNVRG achieves a near-optimal convergence rate among algorithms of its type for finding a first-order stationary point of a nonconvex function. We further build algorithms that efficiently find a local optimum of a nonconvex objective function by examining the curvature information at the stationary point found by SNVRG. With the ultimate goal of finding the global optimum in nonconvex optimization, we then provide a unified framework for analyzing the global convergence of stochastic gradient Langevin dynamics-based algorithms on a nonconvex objective function. In the second part of this thesis, we generalize the aforementioned sample-efficient stochastic nonconvex optimization methods to reinforcement learning problems, including policy gradient, actor-critic, and Q-learning. For these problems, we propose novel algorithms and prove that they enjoy state-of-the-art theoretical guarantees on sample complexity. The works presented in this thesis form a selection of recent advances and developments in sample-efficient nonconvex optimization algorithms for both machine learning and reinforcement learning.
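
The nested variance reduction described above builds on the simpler SVRG estimator. The sketch below shows plain SVRG (not SNVRG itself) for a finite-sum objective; the step size, epoch length, and toy quadratic losses are chosen only for illustration and are not taken from the thesis.

import numpy as np

def svrg(grads, x0, step=0.05, epochs=20, epoch_len=None):
    # grads: list of per-example gradient functions grad_i(x).
    # Each epoch anchors a snapshot and its full gradient, then takes
    # variance-reduced steps v = grad_i(x) - grad_i(snapshot) + full_grad.
    n = len(grads)
    epoch_len = epoch_len or n
    rng = np.random.default_rng(0)
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        full_grad = sum(g(snapshot) for g in grads) / n
        for _ in range(epoch_len):
            i = rng.integers(n)
            v = grads[i](x) - grads[i](snapshot) + full_grad
            x = x - step * v
    return x

# Toy usage: finite sum of quadratics f_i(x) = 0.5 * ||x - c_i||^2,
# chosen only to exercise the update rule.
rng = np.random.default_rng(1)
centers = rng.standard_normal((100, 10))
grads = [lambda x, c=c: x - c for c in centers]
x_star = svrg(grads, np.zeros(10))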

Accelerated Optimization for Machine Learning

Title: Accelerated Optimization for Machine Learning PDF eBook
Author: Zhouchen Lin
Publisher: Springer Nature
Pages: 286
Release: 2020-05-29
Genre: Computers
ISBN: 9811529108

This book on optimization includes forewords by Michael I. Jordan, Zongben Xu and Zhi-Quan Luo. Machine learning relies heavily on optimization to solve problems with its learning models, and first-order optimization algorithms are the mainstream approach, so accelerating them is crucial for the efficiency of machine learning. Written by leading experts in the field, this book provides a comprehensive introduction to, and a state-of-the-art review of, accelerated first-order optimization algorithms for machine learning. It discusses a variety of methods, both deterministic and stochastic, synchronous and asynchronous, for unconstrained and constrained problems, convex and non-convex. Offering a rich blend of ideas, theories and proofs, the book is up-to-date and self-contained. It is an excellent reference for users seeking faster optimization algorithms, as well as for graduate students and researchers who want to grasp the frontiers of optimization in machine learning in a short time.
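
To make the idea of acceleration concrete, here is a minimal sketch of Nesterov's accelerated gradient method for a smooth convex function. The momentum schedule and the quadratic test problem are standard textbook choices, not material taken from this book.

import numpy as np

def nesterov_agd(grad, x0, step, iters=200):
    # Nesterov's accelerated gradient: take a gradient step at an
    # extrapolated point, then update the momentum coefficient.
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x_next = y - step * grad(y)
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + ((t - 1) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# Toy usage: minimize the quadratic 0.5 * x^T Q x - b^T x.
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 30))
Q = M.T @ M + np.eye(30)
b = rng.standard_normal(30)
L = np.linalg.norm(Q, 2)  # Lipschitz constant of the gradient
x = nesterov_agd(lambda x: Q @ x - b, np.zeros(30), step=1.0 / L)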

Optimization for Machine Learning

Title: Optimization for Machine Learning PDF eBook
Author: Suvrit Sra
Publisher: MIT Press
Pages: 509
Release: 2012
Genre: Computers
ISBN: 026201646X

An up-to-date account of the interplay between optimization and machine learning, accessible to students and researchers in both communities. The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields. Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This book starts the process of reassessment. It describes the resurgence in novel contexts of established frameworks such as first-order methods, stochastic approximations, convex relaxations, interior-point methods, and proximal methods. It also devotes attention to newer themes such as regularized optimization, robust optimization, gradient and subgradient methods, splitting techniques, and second-order methods. Many of these techniques draw inspiration from other fields, including operations research, theoretical computer science, and subfields of optimization. The book will enrich the ongoing cross-fertilization between the machine learning community and these other fields, and within the broader optimization community.
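
Since the blurb highlights proximal methods and splitting techniques, the following is a minimal sketch of proximal gradient descent (ISTA) for l1-regularized least squares. The step size, regularization weight, and random data are illustrative assumptions, not content from the book.

import numpy as np

def soft_threshold(x, tau):
    # Proximal operator of tau * ||x||_1.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, b, lam, iters=300):
    # Proximal gradient descent for min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1:
    # alternate a gradient step on the smooth part with the l1 proximal step.
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Toy usage on random data.
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200))
b = rng.standard_normal(80)
x_sparse = ista(A, b, lam=0.1)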

Nonconvex Optimization and Model Representation with Applications in Control Theory and Machine Learning

Title: Nonconvex Optimization and Model Representation with Applications in Control Theory and Machine Learning PDF eBook
Author: Yue Sun
Publisher:
Pages: 0
Release: 2022
Genre: Algorithms
ISBN:

In control and machine learning, the primary goal is to learn models that make predictions or decisions and act in the world. This thesis covers two important aspects of control theory and machine learning: model structures that allow low training and generalization error with few samples (i.e., low sample complexity), and convergence guarantees for first-order optimization algorithms on nonconvex problems. If the model and the training algorithm exploit knowledge of the structure of the data (such as sparsity or low-rankness), the model can be learned with low sample complexity. We present two results on this theme: a Hankel nuclear norm regularization method for learning a low-order system, and an overparameterized representation for linear meta-learning. In the first result we study dynamical system identification, assuming the true system order is low. A low system order means that the state can be represented by a low-dimensional vector, so the system corresponds to a low-rank Hankel matrix, and low-rankness is known to be encouraged by nuclear norm regularized estimators in matrix completion theory. We apply a nuclear norm regularized estimator to the Hankel matrix and show that it requires fewer samples than the ordinary least squares estimator. In the second result we study linear meta-learning. The meta-learning algorithm contains two steps: learning a large model in the representation learning stage, and fine-tuning the model in the few-shot learning stage. The few-shot dataset contains few samples, so to avoid overfitting we need a fine-tuning algorithm that uses information from representation learning. We generalize the subspace-based model of prior work to a Gaussian model and describe the overparameterized meta-learning procedure, showing that feature-task alignment reduces the sample complexity of representation learning and that the optimal task representation is overparameterized. First-order optimization methods, such as gradient-based methods, are widely used in machine learning thanks to their simplicity of implementation and fast convergence. However, the objective function in machine learning can be nonconvex, and first-order methods are in general only guaranteed to converge to a stationary point rather than a local or global minimum. We give a more refined analysis of their convergence guarantees and present two results: the convergence of a perturbed gradient descent approach to a local minimum on a Riemannian manifold, and a unified global convergence result for policy gradient descent on linear system control problems. In the first result we study how Riemannian gradient descent converges to an approximate local minimum. While it is well known that perturbed gradient descent escapes saddle points in Euclidean space, less is known about the concrete convergence rate of Riemannian gradient descent on a manifold; we show that perturbed Riemannian gradient descent converges to an approximate local minimum and reveal the relation between the convergence rate and the manifold curvature. In the second result we study policy gradient descent applied to control. Many control problems have been revisited in the context of the recent boom in reinforcement learning (RL), yet there is a gap between the RL and control methodologies: policy gradient in RL applies first-order methods to a nonconvex landscape, where global convergence is hard to establish, whereas control theory uses reparameterizations that make the problem convex and provably find the globally optimal controller in polynomial time. Aiming to interpret the success of the nonconvex approach, we connect the nonconvex policy gradient descent applied to a collection of control problems with their convex parameterizations and propose a unified proof of the global convergence of policy gradient descent.
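
The Euclidean version of the perturbed gradient descent idea discussed above is easy to sketch: whenever the gradient is small (a candidate saddle point), add a small random perturbation to escape it. This is only an illustrative Euclidean sketch with placeholder thresholds and radii, not the Riemannian algorithm analyzed in the thesis.

import numpy as np

def perturbed_gd(grad, x0, step=0.01, iters=2000, grad_tol=1e-3, radius=1e-2):
    # Plain gradient descent, with a random perturbation whenever the
    # gradient is small, so the iterate can escape strict saddle points.
    rng = np.random.default_rng(0)
    x = x0.copy()
    for _ in range(iters):
        g = grad(x)
        if np.linalg.norm(g) < grad_tol:
            x = x + radius * rng.standard_normal(x.shape)  # random kick near a saddle
        else:
            x = x - step * g
    return x

# Toy usage: f(x1, x2) = (x1^2 - 1)^2 + x2^2 has a strict saddle at the origin
# and local minima at (+/-1, 0); starting at the origin, the iterate escapes.
grad = lambda v: np.array([4 * v[0] * (v[0] ** 2 - 1), 2 * v[1]])
x = perturbed_gd(grad, np.zeros(2))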

Convex Optimization

Title: Convex Optimization PDF eBook
Author: Stephen P. Boyd
Publisher: Cambridge University Press
Pages: 744
Release: 2004-03-08
Genre: Business & Economics
ISBN: 9780521833783

Convex optimization problems arise frequently in many different fields. This book provides a comprehensive introduction to the subject, and shows in detail how such problems can be solved numerically with great efficiency. The book begins with the basic elements of convex sets and functions, and then describes various classes of convex optimization problems. Duality and approximation techniques are then covered, as are statistical estimation techniques. Various geometrical problems are then presented, and there is detailed discussion of unconstrained and constrained minimization problems, and interior-point methods. The focus of the book is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. It contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance and economics.
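
As a small illustration of recognizing a convex problem and solving it numerically, here is a non-negative least-squares problem written with the CVXPY modeling library; CVXPY is not part of the book, and the random data are placeholders used only for the example.

import cvxpy as cp
import numpy as np

# Non-negative least squares: minimize ||Ax - b||_2^2 subject to x >= 0.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)

x = cp.Variable(10)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), [x >= 0])
problem.solve()  # hands the problem to a conic / interior-point solver
print(problem.value, x.value)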