Platt Scaling for HSVM: Calibrating Binary to Probabilistic Predictions

Written by hyperbole | Published 2026/01/16
Tech Story Tags: logistic-regression | platt-scaling-algorithm | hsvm-probability-calibration | multiclass-classification-hsvm | machine-scaling | product-optimization | platt-implementation | multiclass-generalization

TLDR: This article presents the original formulation of the HSVM and convex relaxation techniques for hyperbolic SVMs, including a semidefinite formulation and the moment sum-of-squares relaxation hierarchy, together with experiments and an appendix on Platt scaling [31]. The paper is written by Sheng Yang, Peihan Liu, and Cengiz Pehlevan.

Abstract and 1. Introduction

2. Related Works

3. Convex Relaxation Techniques for Hyperbolic SVMs

    3.1 Preliminaries

    3.2 Original Formulation of the HSVM

    3.3 Semidefinite Formulation

    3.4 Moment-Sum-of-Squares Relaxation

4. Experiments

    4.1 Synthetic Dataset

    4.2 Real Dataset

5. Discussions, Acknowledgements, and References

A. Proofs

B. Solution Extraction in Relaxed Formulation

C. On Moment Sum-of-Squares Relaxation Hierarchy

D. Platt Scaling [31]

E. Detailed Experimental Results

F. Robust Hyperbolic Support Vector Machine

D Platt Scaling [31]

Platt scaling [31] is a common way to calibrate binary predictions into probabilistic ones, which in turn lets binary classifiers generalize to multiclass classification; it is widely used alongside SVMs. The key idea is that, once a separator has been trained, an additional logistic regression is fitted on the prediction scores, which can be interpreted as measures of closeness to the decision boundary.
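To make the fitting step concrete, here is a minimal sketch in Python, assuming we already have decision scores f(x) and ±1 labels. The helper names fit_platt and platt_probability are illustrative, not from the paper, and the smoothed targets implement Platt's prior-correction trick (the "empirical smoothing" mentioned below).

```python
# A minimal sketch of Platt scaling: fit scalars A, B so that
# 1 / (1 + exp(A*f + B)) approximates P(y = 1 | f) on decision scores f.
import numpy as np
from scipy.optimize import minimize

def fit_platt(scores, labels):
    """scores: decision values f(x); labels: +1 / -1. Returns (A, B)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    n_pos = np.sum(labels == 1)
    n_neg = np.sum(labels == -1)
    # Smoothed targets instead of hard 0/1 labels (Platt's empirical smoothing).
    t = np.where(labels == 1,
                 (n_pos + 1.0) / (n_pos + 2.0),
                 1.0 / (n_neg + 2.0))

    def nll(params):
        a, b = params
        z = a * scores + b
        # log(1 + exp(z)) computed stably; with p = 1 / (1 + exp(z)),
        # the cross-entropy is -[t*log p + (1-t)*log(1-p)].
        log1p_exp = np.logaddexp(0.0, z)
        return np.sum(t * log1p_exp + (1.0 - t) * (log1p_exp - z))

    # Initialize B at Platt's suggested prior log-odds, A at 0.
    x0 = np.array([0.0, np.log((n_neg + 1.0) / (n_pos + 1.0))])
    res = minimize(nll, x0=x0, method="BFGS")
    return res.x  # (A, B)

def platt_probability(score, A, B):
    """Calibrated P(y = 1 | x) = 1 / (1 + exp(A * f(x) + B))."""
    return 1.0 / (1.0 + np.exp(A * score + B))
```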

In the context of HSVM, suppose w∗ is the linear separator identified by the solver. We then find two scalars 𝐴, 𝐵 ∈ R with

P(y = 1 | x) = 1 / (1 + exp(𝐴 (w∗ ∗ x) + 𝐵)),

where ∗ refers to the Minkowski product defined in Equation (1). The values of 𝐴 and 𝐵 are fitted on the training set using logistic regression with some additional empirical smoothing. For one-vs-rest training, we then have 𝐾 pairs of (𝐴, 𝐵) to fit, and at the end we classify a sample to the class with the highest calibrated probability. See the reference implementation platt.py in LIBSVM at https://home.work.caltech.edu/~htlin/program/libsvm/doc/platt.py.
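Below is a sketch of the one-vs-rest step, reusing fit_platt and platt_probability from the previous sketch. The helpers minkowski_product, fit_ovr_platt, and predict are hypothetical names, and the sign convention ⟨u, v⟩ = −u₀v₀ + Σᵢ uᵢvᵢ is an assumption standing in for Equation (1) of the paper.

```python
# A sketch of one-vs-rest Platt calibration for HSVM, assuming K linear
# separators w*_1, ..., w*_K have already been trained by the solver.
import numpy as np

def minkowski_product(w, x):
    # Assumed sign convention; Equation (1) of the paper may differ.
    return -w[0] * x[0] + np.dot(w[1:], x[1:])

def fit_ovr_platt(separators, X, y):
    """separators: list of K vectors w*_k; X: samples; y: integer class labels.
    Returns K (A, B) pairs, one per class."""
    params = []
    for k, w in enumerate(separators):
        scores = np.array([minkowski_product(w, x) for x in X])
        labels = np.where(y == k, 1, -1)  # class k vs. rest
        params.append(fit_platt(scores, labels))
    return params

def predict(separators, params, x):
    """Classify x to the class with the highest calibrated probability."""
    probs = [platt_probability(minkowski_product(w, x), A, B)
             for w, (A, B) in zip(separators, params)]
    return int(np.argmax(probs))
```

Note that each class gets its own calibrated sigmoid, so the 𝐾 probabilities need not sum to one; taking the argmax over them is the standard Platt-style one-vs-rest decision rule.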

Authors:

(1) Sheng Yang, John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA ([email protected]);

(2) Peihan Liu, John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA ([email protected]);

(3) Cengiz Pehlevan, John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, Center for Brain Science, Harvard University, Cambridge, MA, and Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA ([email protected]).


This paper is available on arxiv under CC BY-SA 4.0 DEED (Attribution-ShareAlike 4.0 International) license.

