An efficient primal dual prox method for non-smooth optimization
Saved in:
Authors / Contributors:
[Tianbao Yang, Mehrdad Mahdavi, Rong Jin, Shenghuo Zhu]
Place, Publisher, Year:
2015
Published in:
Machine Learning, 98/3(2015-03-01), 369-406
Format:
Article (online)
Online access:
| LEADER | caa a22 4500 | ||
|---|---|---|---|
| 001 | 605478023 | ||
| 003 | CHVBK | ||
| 005 | 20210128100403.0 | ||
| 007 | cr unu---uuuuu | ||
| 008 | 210128e20150301xx s 000 0 eng | ||
| 024 | 7 | 0 | |a 10.1007/s10994-014-5436-1 |2 doi |
| 035 | |a (NATIONALLICENCE)springer-10.1007/s10994-014-5436-1 | ||
| 245 | 0 | 3 | |a An efficient primal dual prox method for non-smooth optimization |h [Electronic data] |c [Tianbao Yang, Mehrdad Mahdavi, Rong Jin, Shenghuo Zhu] |
| 520 | 3 | |a We study the non-smooth optimization problems in machine learning, where both the loss function and the regularizer are non-smooth functions. Previous studies on efficient empirical loss minimization assume either a smooth loss function or a strongly convex regularizer, making them unsuitable for non-smooth optimization. We develop a simple yet efficient method for a family of non-smooth optimization problems where the dual form of the loss function is bilinear in primal and dual variables. We cast a non-smooth optimization problem into a minimax optimization problem, and develop a primal dual prox method that solves the minimax optimization problem at a rate of $$O(1/T)$$ assuming that the proximal step can be efficiently solved, significantly faster than a standard subgradient descent method that has an $$O(1/\sqrt{T})$$ convergence rate. Our empirical studies verify the efficiency of the proposed method for various non-smooth optimization problems that arise ubiquitously in machine learning by comparing it to the state-of-the-art first order methods. | |
| 540 | |a The Author(s), 2014 | ||
| 690 | 7 | |a Non-smooth optimization |2 nationallicence | |
| 690 | 7 | |a Primal dual method |2 nationallicence | |
| 690 | 7 | |a Convergence rate |2 nationallicence | |
| 690 | 7 | |a Sparsity |2 nationallicence | |
| 690 | 7 | |a Efficiency |2 nationallicence | |
| 700 | 1 | |a Yang |D Tianbao |u NEC Laboratories America, Inc., Cupertino, CA, USA |4 aut | |
| 700 | 1 | |a Mahdavi |D Mehrdad |u Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA |4 aut | |
| 700 | 1 | |a Jin |D Rong |u Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA |4 aut | |
| 700 | 1 | |a Zhu |D Shenghuo |u NEC Laboratories America, Inc., Cupertino, CA, USA |4 aut | |
| 773 | 0 | |t Machine Learning |d Springer US; http://www.springer-ny.com |g 98/3(2015-03-01), 369-406 |x 0885-6125 |q 98:3<369 |1 2015 |2 98 |o 10994 | |
| 856 | 4 | 0 | |u https://doi.org/10.1007/s10994-014-5436-1 |q text/html |z Online access via DOI |
| 898 | |a BK010053 |b XK010053 |c XK010000 | ||
| 900 | 7 | |a Metadata rights reserved |b Springer special CC-BY-NC licence |2 nationallicence | |
| 908 | |D 1 |a research-article |2 jats | ||
| 949 | |B NATIONALLICENCE |F NATIONALLICENCE |b NL-springer | ||
| 950 | |B NATIONALLICENCE |P 856 |E 40 |u https://doi.org/10.1007/s10994-014-5436-1 |q text/html |z Online access via DOI | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Yang |D Tianbao |u NEC Laboratories America, Inc., Cupertino, CA, USA |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Mahdavi |D Mehrdad |u Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Jin |D Rong |u Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Zhu |D Shenghuo |u NEC Laboratories America, Inc., Cupertino, CA, USA |4 aut | ||
| 950 | |B NATIONALLICENCE |P 773 |E 0- |t Machine Learning |d Springer US; http://www.springer-ny.com |g 98/3(2015-03-01), 369-406 |x 0885-6125 |q 98:3<369 |1 2015 |2 98 |o 10994 | ||
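The abstract describes casting a non-smooth problem with a bilinearly coupled dual into a minimax problem and solving it with a primal-dual prox iteration at an $$O(1/T)$$ rate. As an illustrative sketch only (not the paper's exact algorithm), a generic primal-dual prox iteration of this family can be written for the doubly non-smooth problem $$\min_x \|Ax-b\|_1 + \lambda\|x\|_1$$, whose saddle-point form is $$\min_x \max_{\|y\|_\infty \le 1} y^\top(Ax-b) + \lambda\|x\|_1$$; the function name `pdhg_l1`, the test problem, and the step-size choice are assumptions for the example:

```python
import numpy as np

def pdhg_l1(A, b, lam, n_iter=500):
    """Primal-dual prox sketch for  min_x ||Ax - b||_1 + lam * ||x||_1,
    via the saddle point  min_x max_{||y||_inf <= 1} y^T(Ax - b) + lam*||x||_1.
    Illustrative only; not the authors' exact method."""
    m, n = A.shape
    L = np.linalg.norm(A, 2)       # spectral norm bounds the bilinear coupling
    tau = sigma = 1.0 / L          # step sizes satisfying tau * sigma * L**2 <= 1
    x = np.zeros(n)
    x_bar = x.copy()
    y = np.zeros(m)
    for _ in range(n_iter):
        # dual prox step: gradient ascent in y, then projection onto the
        # l_inf ball (the prox of the conjugate of the l1 loss)
        y = np.clip(y + sigma * (A @ x_bar - b), -1.0, 1.0)
        # primal prox step: gradient descent in x, then soft-thresholding
        # (the prox of lam * ||.||_1)
        x_new = x - tau * (A.T @ y)
        x_new = np.sign(x_new) * np.maximum(np.abs(x_new) - tau * lam, 0.0)
        # extrapolation of the primal iterate, as used by primal-dual
        # methods of this family to obtain the O(1/T) rate
        x_bar = 2 * x_new - x
        x = x_new
    return x
```

Both the loss and the regularizer here are non-smooth, so a plain subgradient method would only achieve the slower $$O(1/\sqrt{T})$$ rate mentioned in the abstract; each iteration above needs only two cheap prox steps (a clip and a soft-threshold) plus matrix-vector products.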