Department of Statistics, Korea University, Seoul, Republic of Korea
Abstract: The ROC-SVM, originally proposed by Rakotomamonjy, directly maximizes the area under the ROC curve (AUC) and has become an attractive alternative to conventional binary classification in the presence of class imbalance. However, its practical use is limited by high computational cost, as training requires evaluating all $O(n^2)$ pairs of positive and negative instances. To overcome this limitation, we develop a scalable variant of the ROC-SVM that leverages incomplete U-statistics, thereby substantially reducing computational complexity. We further extend the framework to nonlinear classification through a low-rank kernel approximation, enabling efficient training in reproducing kernel Hilbert spaces. Theoretical analysis establishes an error bound that justifies the proposed approximation, and empirical results on both synthetic and real datasets demonstrate that the proposed method achieves AUC performance comparable to the original ROC-SVM with drastically reduced training time.
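As a rough illustration of the incomplete U-statistic idea, the sketch below replaces the full sum of pairwise hinge losses over all positive-negative pairs with an average over a fixed number of randomly sampled pairs; the linear scorer, the number of sampled pairs, and all names are illustrative assumptions rather than the paper's implementation.

import numpy as np

def incomplete_auc_hinge(w, X_pos, X_neg, n_pairs=1000, seed=None):
    # Incomplete U-statistic approximation of the pairwise (AUC) hinge loss:
    # instead of all n_pos * n_neg pairs, average over n_pairs sampled pairs.
    # A linear scorer f(x) = x @ w is assumed purely for illustration.
    rng = np.random.default_rng(seed)
    i = rng.integers(0, X_pos.shape[0], size=n_pairs)   # sampled positive indices
    j = rng.integers(0, X_neg.shape[0], size=n_pairs)   # sampled negative indices
    margins = X_pos[i] @ w - X_neg[j] @ w               # f(x_i^+) - f(x_j^-)
    return np.mean(np.maximum(0.0, 1.0 - margins))      # hinge on the ranking margin

Evaluating this loss costs O(n_pairs) per pass rather than O(n^2), which is the source of the speed-up described above.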




Abstract: Sufficient dimension reduction (SDR), which seeks a lower-dimensional subspace of the predictors that contains the regression or classification information, has been popular in the machine learning community. In this work, we present a new R software package, psvmSDR, that implements a new class of SDR estimators, which we call principal machines (PMs), generalized from the principal support vector machine (PSVM). The package covers both linear and nonlinear SDR and provides a function applicable to real-time update scenarios. It implements a descent algorithm for the PMs to efficiently compute the SDR estimators in various situations. This easy-to-use package will be an attractive alternative to the dr R package that implements classical SDR methods.
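For context, the linear PSVM that the principal machines generalize solves, for each dichotomizing cutpoint $c$ of the response, roughly

$\min_{\psi, t}\ \psi^\top \hat{\Sigma} \psi + \frac{\lambda}{n} \sum_{i=1}^{n} \bigl[\, 1 - \tilde{y}_i(c)\{\psi^\top (x_i - \bar{x}) - t\} \,\bigr]_+ , \qquad \tilde{y}_i(c) = \mathrm{sign}(y_i - c),$

and estimates a basis of the SDR subspace from the leading eigenvectors of $\sum_c \hat{\psi}_c \hat{\psi}_c^\top$; a principal machine replaces the hinge loss with other convex losses. This display is a hedged reconstruction of the standard PSVM formulation given for orientation only, not a quotation from the package documentation.
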
Abstract: We propose a deep neural network (DNN) based least distance (LD) estimator (DNN-LD) for the multivariate regression problem, addressing limitations of conventional methods. Owing to the flexibility of the DNN structure, both linear and nonlinear conditional mean functions can be easily modeled, and a multivariate regression model can be realized simply by adding extra nodes at the output layer. Compared with the least squares loss, the least distance loss is more efficient in capturing the dependency structure among the responses and is robust to outliers. In addition, we consider $L_1$-type penalization for variable selection, which is crucial in analyzing high-dimensional data. Specifically, we propose what we call the (A)GDNN-LD estimator, which performs variable selection and model estimation simultaneously by applying the (adaptive) group Lasso penalty to the weight parameters in the DNN structure. For the computation, we propose a quadratic smoothing approximation method to facilitate optimization of the non-smooth objective function based on the least distance loss. Simulation studies and a real data analysis demonstrate the promising performance of the proposed method.
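The two non-standard ingredients above, the non-smooth least distance loss and the group Lasso penalty on the network weights, can be sketched as follows; the Huber-type quadratic smoothing, the grouping of first-layer weights by input variable, the parameter delta, and all names are assumptions made for illustration, not the paper's exact specification.

import torch

def smoothed_ld_loss(pred, target, delta=1e-3):
    # Least distance loss: the (unsquared) Euclidean norm of each residual vector,
    # with a quadratic branch near zero so the objective is differentiable.
    r = (target - pred).pow(2).sum(dim=1).sqrt()        # per-observation residual norm
    quad = r.pow(2) / (2 * delta) + delta / 2           # quadratic smoothing for small residuals
    return torch.where(r <= delta, quad, r).mean()

def group_lasso_penalty(first_layer_weight):
    # One group per input variable, i.e. per column of the (out_dim, in_dim) weight matrix;
    # the adaptive weights of the AGDNN-LD variant are omitted here.
    return first_layer_weight.pow(2).sum(dim=0).sqrt().sum()

The penalized objective is then smoothed_ld_loss(...) plus a tuning constant times group_lasso_penalty(...), so that weight columns shrunk exactly to zero drop the corresponding input variables.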




Abstract: Variable selection is essential in high-dimensional data analysis. Although various variable selection methods have been developed, most rely on the linear model assumption. This article proposes a nonparametric variable selection method for the large-margin classifier defined in a reproducing kernel Hilbert space (RKHS). We propose a gradient-based representation of the large-margin classifier and then regularize the gradient functions with the group Lasso penalty to obtain sparse gradients that naturally lead to variable selection. The groupwise-majorization-descent (GMD) algorithm (Yang and Zou, 2015) is employed to efficiently solve the proposed problem with a large number of parameters, and the strong sequential rule (Tibshirani et al., 2012) is adopted to facilitate the tuning procedure. The selection consistency of the proposed method is established by deriving a risk bound for the estimated classifier and its gradient. Finally, we demonstrate the promising performance of the proposed method through simulations and a real data illustration.
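As a rough rendering of the type of objective described above (notation assumed, not quoted from the article), the classifier $f$ in the RKHS is fitted as

$\hat{f} = \operatorname*{arg\,min}_{f \in \mathcal{H}_K}\ \frac{1}{n} \sum_{i=1}^{n} \phi\bigl(y_i f(x_i)\bigr) + \lambda \sum_{j=1}^{p} \Bigl\{ \frac{1}{n} \sum_{i=1}^{n} \bigl(\partial_j f(x_i)\bigr)^2 \Bigr\}^{1/2},$

where $\phi$ is a large-margin loss such as the hinge and $\partial_j f$ is the partial derivative of $f$ with respect to the $j$-th variable; a variable is removed exactly when its gradient group is shrunk to zero, which is what links the sparse gradients to variable selection.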