Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kaixun Hua

A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

Nov 12, 2023

Jiayang Ren, Valentín Osuna-Enciso, Morimasa Okamoto, Qiangqiang Mao, Chaojie Ji, Liang Cao, Kaixun Hua, Yankai Cao

Figure 1 for A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

Figure 2 for A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

Figure 3 for A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

Figure 4 for A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

Abstract:Decision trees are essential yet NP-complete to train, prompting the widespread use of heuristic methods such as CART, which suffers from sub-optimal performance due to its greedy nature. Recently, breakthroughs in finding optimal decision trees have emerged; however, these methods still face significant computational costs and struggle with continuous features in large-scale datasets and deep trees. To address these limitations, we introduce a moving-horizon differential evolution algorithm for classification trees with continuous features (MH-DEOCT). Our approach consists of a discrete tree decoding method that eliminates duplicated searches between adjacent samples, a GPU-accelerated implementation that significantly reduces running time, and a moving-horizon strategy that iteratively trains shallow subtrees at each node to balance the vision and optimizer capability. Comprehensive studies on 68 UCI datasets demonstrate that our approach outperforms the heuristic method CART on training and testing accuracy by an average of 3.44% and 1.71%, respectively. Moreover, these numerical studies empirically demonstrate that MH-DEOCT achieves near-optimal performance (only 0.38% and 0.06% worse than the global optimal method on training and testing, respectively), while it offers remarkable scalability for deep trees (e.g., depth=8) and large-scale datasets (e.g., ten million samples).

* 36 pages (13 pages for the main body, 23 pages for the appendix), 7 figures

Via

Access Paper or Ask Questions

Multilingual Lexical Simplification via Paraphrase Generation

Jul 28, 2023

Kang Liu, Jipeng Qiang, Yun Li, Yunhao Yuan, Yi Zhu, Kaixun Hua

Figure 1 for Multilingual Lexical Simplification via Paraphrase Generation

Figure 2 for Multilingual Lexical Simplification via Paraphrase Generation

Figure 3 for Multilingual Lexical Simplification via Paraphrase Generation

Figure 4 for Multilingual Lexical Simplification via Paraphrase Generation

Abstract:Lexical simplification (LS) methods based on pretrained language models have made remarkable progress, generating potential substitutes for a complex word through analysis of its contextual surroundings. However, these methods require separate pretrained models for different languages and disregard the preservation of sentence meaning. In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence's meaning. We regard paraphrasing as a zero-shot translation task within multilingual neural machine translation that supports hundreds of languages. After feeding the input sentence into the encoder of paraphrase modeling, we generate the substitutes based on a novel decoding strategy that concentrates solely on the lexical variations of the complex word. Experimental results demonstrate that our approach surpasses BERT-based methods and zero-shot GPT3-based method significantly on English, Spanish, and Portuguese.

* ECAI 2023

Via

Access Paper or Ask Questions

A Global Optimization Algorithm for K-Center Clustering of One Billion Samples

Dec 30, 2022

Jiayang Ren, Ningning You, Kaixun Hua, Chaojie Ji, Yankai Cao

Abstract:This paper presents a practical global optimization algorithm for the K-center clustering problem, which aims to select K samples as the cluster centers to minimize the maximum within-cluster distance. This algorithm is based on a reduced-space branch and bound scheme and guarantees convergence to the global optimum in a finite number of steps by only branching on the regions of centers. To improve efficiency, we have designed a two-stage decomposable lower bound, the solution of which can be derived in a closed form. In addition, we also propose several acceleration techniques to narrow down the region of centers, including bounds tightening, sample reduction, and parallelization. Extensive studies on synthetic and real-world datasets have demonstrated that our algorithm can solve the K-center problems to global optimal within 4 hours for ten million samples in the serial mode and one billion samples in the parallel mode. Moreover, compared with the state-of-the-art heuristic methods, the global optimum obtained by our algorithm can averagely reduce the objective function by 25.8% on all the synthetic and real-world datasets.

* 34 pages, 6 figures, and 5 tables

Via

Access Paper or Ask Questions

Data ultrametricity and clusterability

Aug 28, 2019

Dan Simovici, Kaixun Hua

Figure 1 for Data ultrametricity and clusterability

Figure 2 for Data ultrametricity and clusterability

Figure 3 for Data ultrametricity and clusterability

Figure 4 for Data ultrametricity and clusterability

Abstract:The increasing needs of clustering massive datasets and the high cost of running clustering algorithms poses difficult problems for users. In this context it is important to determine if a data set is clusterable, that is, it may be partitioned efficiently into well-differentiated groups containing similar objects. We approach data clusterability from an ultrametric-based perspective. A novel approach to determine the ultrametricity of a dataset is proposed via a special type of matrix product, which allows us to evaluate the clusterability of the dataset. Furthermore, we show that by applying our technique to a dissimilarity space will generate the sub-dominant ultrametric of the dissimilarity.

Via

Access Paper or Ask Questions