BMRM (Bundle Methods for Regularized Risk Minimization)

version 2.1

19 February 2009


BMRM is an open-source, modular, and scalable convex solver for many machine learning problems that can be cast as regularized risk minimization problems [1]. It is "modular" because the problem-specific loss function module is decoupled from the regularization-specific optimization module (e.g. a quadratic programming or linear programming solver), which shortens the time needed to implement and prototype solutions to new problems. The decoupling also makes it easier to parallelize the loss function computation. At the moment, BMRM can solve the following problems:

  1. Binary classification
  2. Univariate regression
  3. Novelty detection (1-class SVM) [11]
  4. Quantile regression [12]
  5. Poisson regression [13]
  6. Ranking
  7. Graph Matching [16]
  8. Sequence Segmentation and Classification [17]

each with either an L1 or an L2 regularizer. Users can also add new loss functions for problems with structured input and output variables.
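
To illustrate the decoupling of loss and optimizer, here is a minimal sketch in Python. It is not BMRM's API: every name in it (LossModule, hinge_loss, absolute_loss, minimize_l2) is hypothetical, the toy data is made up, and a plain subgradient descent loop stands in for the bundle-method optimizer. It assumes an L2-regularized objective of the form (lam/2)*||w||^2 + R(w), where R is the empirical risk. The point is only that the solver sees the problem through a single call that returns a risk value and a subgradient, so a new problem needs only a new loss module.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# A loss module maps weights w to (empirical risk, one subgradient of the risk).
LossModule = Callable[[Sequence[float]], Tuple[float, Vector]]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def hinge_loss(X: Sequence[Sequence[float]], y: Sequence[float]) -> LossModule:
    """Binary classification: mean of max(0, 1 - y * <w, x>)."""
    def evaluate(w: Sequence[float]) -> Tuple[float, Vector]:
        risk, grad = 0.0, [0.0] * len(w)
        for x_i, y_i in zip(X, y):
            margin = y_i * dot(w, x_i)
            if margin < 1.0:  # only margin violations contribute
                risk += 1.0 - margin
                grad = [g - y_i * x_ij for g, x_ij in zip(grad, x_i)]
        m = len(X)
        return risk / m, [g / m for g in grad]
    return evaluate


def absolute_loss(X: Sequence[Sequence[float]], y: Sequence[float]) -> LossModule:
    """Univariate regression: mean of |<w, x> - y|."""
    def evaluate(w: Sequence[float]) -> Tuple[float, Vector]:
        risk, grad = 0.0, [0.0] * len(w)
        for x_i, y_i in zip(X, y):
            r = dot(w, x_i) - y_i
            risk += abs(r)
            s = (r > 0) - (r < 0)  # sign of the residual; 0 is valid at the kink
            grad = [g + s * x_ij for g, x_ij in zip(grad, x_i)]
        m = len(X)
        return risk / m, [g / m for g in grad]
    return evaluate


def objective(w: Sequence[float], loss: LossModule, lam: float) -> float:
    """L2-regularized risk: (lam / 2) * ||w||^2 + R(w)."""
    return 0.5 * lam * dot(w, w) + loss(w)[0]


def minimize_l2(loss: LossModule, dim: int, lam: float, iters: int = 200) -> Vector:
    """Stand-in optimizer (subgradient descent, NOT a bundle method).

    It touches the problem only through loss(w), which is the decoupling
    described above. Returns the last iterate.
    """
    w = [0.0] * dim
    for t in range(1, iters + 1):
        _, g = loss(w)
        step = 1.0 / (lam * t)  # decaying step size
        w = [w_j - step * (lam * w_j + g_j) for w_j, g_j in zip(w, g)]
    return w


# Toy data (made up for illustration). The same solver handles both problems.
X_cls = [[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
y_cls = [1.0, 1.0, -1.0, -1.0]
w_cls = minimize_l2(hinge_loss(X_cls, y_cls), dim=2, lam=0.1)

X_reg = [[1.0], [2.0]]
y_reg = [1.0, 2.0]
w_reg = minimize_l2(absolute_loss(X_reg, y_reg), dim=1, lam=0.1)
```

Swapping hinge_loss for absolute_loss changes the problem without touching minimize_l2. Because each loss module is a sum over examples, the loop over the data is also the natural place to split work across processes.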


BMRM version 2.1

Older versions

BMRM version 1.0


BMRM is licensed under the Mozilla Public License, version 1.1. The authors are not responsible for any consequences arising from the use of the software.


Choon Hui Teo | Quoc Le | Alex Smola | SVN Vishwanathan


[1] C. H. Teo, Q. Le, A. J. Smola and S. V. N. Vishwanathan, A Scalable Modular Convex Solver for Regularized Risk Minimization, KDD, 2007.
[2] K. P. Bennett and O. L. Mangasarian, Robust Linear Programming Discrimination of Two Linearly Inseparable Sets, Optimization Methods and Software, 1:23-34, 1992.
[3] O. Chapelle, Training a Support Vector Machine in the Primal, Neural Computation, 2007.
[4] M. Collins, R. E. Schapire and Y. Singer, Logistic regression, AdaBoost and Bregman distances, COLT, 2000.
[5] R. Cowell, A. P. Dawid, S. Lauritzen and D. Spiegelhalter, Probabilistic Networks and Expert Systems, Springer, New York, 1999.
[6] T. Joachims, A Support Vector Method for Multivariate Performance Measures, ICML, 2005.
[7] T. Joachims, Training linear SVMs in linear time, KDD, 2006.
[8] V. Vapnik, S. Golowich and A. J. Smola, Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing, NIPS, 1997.
[9] K.-R. Mueller, A. J. Smola, G. Raetsch, B. Schoelkopf, J. Kohlmorgen and V. Vapnik, Predicting Time Series with Support Vector Machines, ICANN, 1997.
[10] C. K. I. Williams, Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond, in M. I. Jordan, editor, Learning and Inference in Graphical Models, 1998.
[11] B. Schoelkopf, R. C. Williamson, A. J. Smola, J. Shawe-Taylor and J. Platt, Support Vector Method for Novelty Detection, NIPS, 2000.
[12] R. Koenker, Quantile Regression, Cambridge University Press, 2005.
[13] N. A. C. Cressie, Statistics for Spatial Data, John Wiley and Sons, New York, 1993.
[14] Q. Le and A. J. Smola, Direct Optimization of Ranking Measures, JMLR, submitted, 2007.
[15] R. Herbrich, T. Graepel and K. Obermayer, Large Margin Rank Boundaries for Ordinal Regression, Advances in Large Margin Classifiers, MIT Press, MA, 2000.
[16] T. S. Caetano, J. J. McAuley, L. Cheng, Q. V. Le, and A. J. Smola, Learning Graph Matching, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009.
[17] Q. Shi, L. Wang, L. Cheng, and A. J. Smola, Discriminative Human Action Segmentation and Recognition using Semi-Markov Models, CVPR, 2008.

Last modified: 19 February 2009