Li Huang

HUGS: Guiding Unified Dexterous Grasp Synthesis Across Modes and Scales via Learned Human Priors

Jul 06, 2026

Mingrui Yu, Yongpeng Jiang, Yongyi Jia, Kangchen Lv, Li Huang, Yi Ren, Xiang Li

Abstract:Dexterous grasping across diverse object scales requires contact modes ranging from two-finger pinches to bimanual grasps. Existing dexterous grasp synthesis methods reduce the high-dimensional optimization space with manually designed expected contacts and initialization heuristics, which struggle to balance synthesis success rate and diversity. We present HUGS (Human-prior-guided Unified Dexterous Grasp Synthesis), a human-prior-guided framework for unified dexterous grasp synthesis across modes and scales. Instead of directly retargeting human demonstrations, HUGS learns an object-conditioned human prior that captures human grasp preferences and guides downstream force-closure-aware optimization. The prior is trained on a compact self-collected human grasp dataset with 1.8K grasps over 304 objects, providing broad coverage of object scales and contact modes. During synthesis, HUGS adaptively proposes contact modes and wrist initializations, substantially improving the balance between contact-mode coverage and synthesis success rate over heuristic-based methods. With HUGS, we synthesize 3.2M robotic grasps over 157K scenes, spanning object half-diagonal lengths from 2 cm to 30 cm and modes from two-finger to bimanual grasps. Models trained on the synthesized dataset autonomously select appropriate contact modes in the real world, enabling grasping from screws to large boxes.

* The first two authors contributed equally. Project website: https://hugs-dex.github.io/

DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation

Apr 10, 2026

Li Huang, Zhongxin Liu, Yifan Wu, Tao Yin, Dong Li, Jichao Bi, Nankun Mu, Hongyu Zhang, Meng Yan

Abstract:Large Language Models (LLMs) for code generation can replicate insecure patterns from their training data. To mitigate this, a common strategy for security hardening is to fine-tune models using supervision derived from the final transformer layer. However, this design may suffer from a final-layer bottleneck: vulnerability-discriminative cues can be distributed across layers and become less detectable near the output representations optimized for next-token prediction. To diagnose this issue, we perform layer-wise linear probing. We observe that vulnerability-related signals are most detectable in a band of intermediate-to-upper layers yet attenuate toward the final layers. Motivated by this observation, we introduce DeepGuard, a framework that leverages distributed security-relevant cues by aggregating representations from multiple upper layers via an attention-based module. The aggregated signal powers a dedicated security analyzer within a multi-objective training objective that balances security enhancement and functional correctness, and further supports a lightweight inference-time steering strategy. Extensive experiments across five code LLMs demonstrate that DeepGuard improves the secure-and-correct generation rate by an average of 11.9% over strong baselines such as SVEN. It also preserves functional correctness while exhibiting generalization to held-out vulnerability types. Our code is public at https://github.com/unknownhl/DeepGuard.
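As a rough illustration of the multi-layer aggregation idea in the DeepGuard abstract above, the following minimal sketch pools hidden states from several layers with softmax attention weights. It is not the authors' implementation: the function names, the single `query` vector, and the scaled dot-product scoring are all assumptions made for illustration, and the toy vectors stand in for real transformer hidden states.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layer_states, query):
    """Attention-pool one hidden vector per layer into a single vector.

    layer_states: list of equal-length vectors, one per layer (hypothetical input).
    query: a vector of the same length; in a trained model this would be learned (assumption).
    """
    dim = len(query)
    # Scaled dot-product score between the query and each layer's state.
    scores = [
        sum(q * h for q, h in zip(query, state)) / math.sqrt(dim)
        for state in layer_states
    ]
    weights = softmax(scores)
    # Weighted sum of the layer states, computed coordinate by coordinate.
    pooled = [
        sum(w * state[i] for w, state in zip(weights, layer_states))
        for i in range(dim)
    ]
    return pooled, weights


# Toy example: three "layers" with two-dimensional states.
layer_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
pooled, weights = aggregate_layers(layer_states, query)
```

In this toy setup the third layer aligns best with the query and therefore receives the largest weight; a downstream security analyzer would consume `pooled` instead of the final-layer state alone.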

* ACL 2026 main conference

OmniTabBench: Mapping the Empirical Frontiers of GBDTs, Neural Networks, and Foundation Models for Tabular Data at Scale

Apr 08, 2026

Dihong Jiang, Ruoqi Cao, Zhiyuan Dang, Li Huang, Qingsong Zhang, Zhiyu Wang, Shihao Piao, Shenggao Zhu, Jianlong Chang, Zhouchen Lin (+1 more)

Abstract:While traditional tree-based ensemble methods have long dominated tabular tasks, deep neural networks and emerging foundation models have challenged this primacy, yet no consensus exists on a universally superior paradigm. Existing benchmarks typically contain fewer than 100 datasets, raising concerns about evaluation sufficiency and potential selection biases. To address these limitations, we introduce OmniTabBench, the largest tabular benchmark to date, comprising 3030 datasets spanning diverse tasks that are comprehensively collected from diverse sources and categorized by industry using large language models. We conduct an unprecedented large-scale empirical evaluation of state-of-the-art models from all model families on OmniTabBench, confirming the absence of a dominant winner. Furthermore, through a decoupled metafeature analysis, which examines individual properties such as dataset size, feature types, feature and target skewness/kurtosis, we elucidate conditions favoring specific model categories, providing clearer, more actionable guidance than prior compound-metric studies.
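The decoupled metafeatures named in the OmniTabBench abstract above include feature and target skewness and kurtosis. A minimal sketch of how such per-column statistics might be computed is given here; it uses population moments and excess kurtosis, which is one common convention and not necessarily the one used in the benchmark. The function name and the zero-variance fallback are assumptions.

```python
def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) of a numeric column using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        # Constant column: both statistics are undefined; return zeros by convention (assumption).
        return 0.0, 0.0
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt


symmetric = skewness_and_kurtosis([1, 2, 3, 4, 5])
right_skewed = skewness_and_kurtosis([1, 1, 1, 10])
```

A symmetric column yields a skewness of zero, while a column with a long right tail yields a positive value; statistics like these can then be related to which model family performs best on a dataset.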

CyIN: Cyclic Informative Latent Space for Bridging Complete and Incomplete Multimodal Learning

Feb 04, 2026

Ronghao Lin, Qiaolin He, Sijie Mai, Ying Zeng, Aolin Xiong, Li Huang, Yap-Peng Tan, Haifeng Hu

Abstract:Multimodal machine learning, which mimics the human brain's ability to integrate various modalities, has seen rapid growth. Most previous multimodal models are trained on perfectly paired multimodal input to reach optimal performance. In real-world deployments, however, the presence of each modality is highly variable and unpredictable, so pre-trained models suffer significant performance drops and fail to remain robust when modalities are dynamically missing. In this paper, we present a novel Cyclic INformative Learning framework (CyIN) to bridge the gap between complete and incomplete multimodal learning. Specifically, we first build an informative latent space by applying token- and label-level Information Bottleneck (IB) cyclically among the modalities. By capturing task-related features with variational approximation, the informative bottleneck latents are purified for more efficient cross-modal interaction and multimodal fusion. Moreover, to supplement the information missing from incomplete multimodal input, we propose cross-modal cyclic translation, which reconstructs the missing modalities from the remaining ones through forward and reverse propagation processes. With the help of the extracted and reconstructed informative latents, CyIN succeeds in jointly optimizing complete and incomplete multimodal learning in one unified model. Extensive experiments on 4 multimodal datasets demonstrate the superior performance of our method in both complete and diverse incomplete scenarios.

* Accepted by NeurIPS 2025

Intention Chain-of-Thought Prompting with Dynamic Routing for Code Generation

Dec 16, 2025

Shen Li, Li Huang, Shaoxiong Zhan, Weifeng Sun, Tao Yin, Zhongxin Liu, Meng Yan

Abstract:Large language models (LLMs) exhibit strong generative capabilities and have shown great potential in code generation. Existing chain-of-thought (CoT) prompting methods enhance model reasoning by eliciting intermediate steps, but suffer from two major limitations: First, their uniform application tends to induce overthinking on simple tasks. Second, they lack intention abstraction in code generation, such as explicitly modeling core algorithmic design and efficiency, leading models to focus on surface-level structures while neglecting the global problem objective. Inspired by the cognitive economy principle of engaging structured reasoning only when necessary to conserve cognitive resources, we propose RoutingGen, a novel difficulty-aware routing framework that dynamically adapts prompting strategies for code generation. For simple tasks, it adopts few-shot prompting; for more complex ones, it invokes a structured reasoning strategy, termed Intention Chain-of-Thought (ICoT), which we introduce to guide the model in capturing task intention, such as the core algorithmic logic and its time complexity. Experiments across three models and six standard code generation benchmarks show that RoutingGen achieves state-of-the-art performance in most settings, while reducing total token usage by 46.37% on average across settings. Furthermore, ICoT outperforms six existing prompting baselines on challenging benchmarks.
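The routing behaviour described in the RoutingGen abstract above (few-shot prompting for simple tasks, ICoT for harder ones) can be pictured with a minimal sketch. The abstract does not say how difficulty is estimated or how the prompts are worded, so the difficulty score, the `threshold` value, and both prompt templates below are hypothetical placeholders.

```python
# Hypothetical ICoT instruction; the paper's actual prompt wording is not given in the abstract.
ICOT_INSTRUCTION = (
    "First state the intention: the core algorithmic logic and its "
    "expected time complexity. Then write the code."
)


def route_prompt(task, difficulty_score, threshold=0.5, examples=()):
    """Pick a prompting strategy from a difficulty score in [0, 1].

    difficulty_score is assumed to come from some external estimator (hypothetical).
    Returns a (strategy_name, prompt_text) pair.
    """
    if difficulty_score < threshold:
        # Simple task: prepend few-shot examples and skip structured reasoning.
        shots = "\n\n".join(examples)
        return "few_shot", f"{shots}\n\n{task}".strip()
    # Harder task: ask for the intention before the code.
    return "icot", f"{task}\n\n{ICOT_INSTRUCTION}"


easy = route_prompt("Add two numbers.", difficulty_score=0.1)
hard = route_prompt("Find the shortest path in a weighted graph.", difficulty_score=0.9)
```

Routing this way spends reasoning tokens only on tasks scored as hard, which is the mechanism the abstract credits for the reduction in total token usage.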

* Accepted at AAAI-2026

AMSnet 2.0: A Large AMS Database with AI Segmentation for Net Detection

May 14, 2025

Yichen Shi, Zhuofu Tao, Yuhao Gao, Li Huang, Hongyang Wang, Zhiping Yu, Ting-Jung Lin, Lei He

Abstract:Current multimodal large language models (MLLMs) struggle to understand circuit schematics due to their limited recognition capabilities. This could be attributed to the lack of high-quality schematic-netlist training data. Existing work such as AMSnet applies schematic parsing to generate netlists. However, these methods rely on hard-coded heuristics and are difficult to apply to complex or noisy schematics. In this paper, we therefore propose a novel net detection mechanism based on segmentation with high robustness. The proposed method also recovers positional information, allowing digital reconstruction of schematics. We then expand the AMSnet dataset with schematic images from various sources and create AMSnet 2.0. AMSnet 2.0 contains 2,686 circuits with schematic images, Spectre-formatted netlists, OpenAccess digital schematics, and positional information for circuit components and nets, whereas AMSnet only includes 792 circuits with SPICE netlists but no digital schematics.

* accepted by LAD25

SS-CTML: Self-Supervised Cross-Task Mutual Learning for CT Image Reconstruction

Dec 31, 2024

Gaofeng Chen, Yaoduo Zhang, Li Huang, Pengfei Wang, Wenyu Zhang, Dong Zeng, Jianhua Ma, Ji He

Abstract:Supervised deep-learning (SDL) techniques with paired training datasets have been widely studied for X-ray computed tomography (CT) image reconstruction. However, because paired training datasets are difficult to obtain in clinical routine, SDL methods are still far from common use in clinical practice. In recent years, self-supervised deep-learning (SSDL) techniques have shown great potential for CT image reconstruction. In this work, we propose a self-supervised cross-task mutual learning (SS-CTML) framework for CT image reconstruction. Specifically, sparse-view and limited-view sinogram data are first extracted from full-view sinogram data, which results in three individual reconstruction tasks, i.e., full-view CT (FVCT) reconstruction, sparse-view CT (SVCT) reconstruction, and limited-view CT (LVCT) reconstruction. Then, three neural networks are constructed for the three reconstruction tasks. Since the ultimate goal of all three tasks is to reconstruct high-quality CT images, we construct a set of cross-task mutual learning objectives for the three tasks, so that the three neural networks can be optimized in a self-supervised manner by learning from each other. Clinical datasets are adopted to evaluate the effectiveness of the proposed framework. Experimental results demonstrate that the SS-CTML framework can obtain promising CT image reconstruction performance in terms of both quantitative and qualitative measurements.
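The first step in the SS-CTML abstract above, deriving sparse-view and limited-view sinograms from a full-view one, can be sketched as simple row selection. The abstract does not give the sampling scheme, so the sketch assumes the usual meanings: sparse-view keeps every `step`-th projection angle, and limited-view keeps one contiguous angular range. The function names and the toy sinogram are hypothetical.

```python
def sparse_view(sinogram, step):
    """Keep every `step`-th projection (row) of a full-view sinogram."""
    return sinogram[::step]


def limited_view(sinogram, start, count):
    """Keep `count` consecutive projections beginning at row `start`."""
    return sinogram[start:start + count]


# Toy full-view sinogram: 8 projection angles, 4 detector bins each.
# Each entry encodes (angle index * 10 + detector index) so rows are easy to trace.
full = [[angle * 10 + det for det in range(4)] for angle in range(8)]

svct_input = sparse_view(full, step=4)            # angles 0 and 4
lvct_input = limited_view(full, start=0, count=3)  # angles 0, 1, 2
```

Because all three sinograms come from the same scan, the three reconstruction networks see different views of the same underlying image, which is what makes cross-task mutual supervision possible without paired ground truth.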

Enhancing Data Quality through Self-learning on Imbalanced Financial Risk Data

Sep 15, 2024

Xu Sun, Zixuan Qin, Shun Zhang, Yuexian Wang, Li Huang

Abstract:In the financial risk domain, particularly in credit default prediction and fraud detection, accurate identification of high-risk class instances is paramount, as their occurrence can have significant economic implications. Although machine learning models have gained widespread adoption for risk prediction, their performance is often hindered by the scarcity and diversity of high-quality data. This limitation stems from factors in datasets such as small risk sample sizes, high labeling costs, and severe class imbalance, which impede the models' ability to learn effectively and accurately forecast critical events. This study investigates data pre-processing techniques to enhance existing financial risk datasets by introducing TriEnhance, a straightforward technique that entails: (1) generating synthetic samples specifically tailored to the minority class, (2) filtering using binary feedback to refine samples, and (3) self-learning with pseudo-labels. Our experiments across six benchmark datasets reveal the efficacy of TriEnhance, with a notable focus on improving minority class calibration, a key factor for developing more robust financial risk prediction systems.

Self Adaptive Threshold Pseudo-labeling and Unreliable Sample Contrastive Loss for Semi-supervised Image Classification

Jul 04, 2024

Xuerong Zhang, Li Huang, Jing Lv, Ming Yang

Abstract:Semi-supervised learning is attracting growing attention due to its success in exploiting unlabeled data. However, pseudo-labeling-based semi-supervised approaches suffer from two problems in image classification: (1) Existing methods might fail to adopt suitable thresholds since they either use a pre-defined/fixed threshold or an ad-hoc threshold adjusting scheme, resulting in inferior performance and slow convergence. (2) Discarding unlabeled data with confidence below the thresholds results in the loss of discriminating information. To solve these issues, we develop an effective method to make sufficient use of unlabeled data. Specifically, we design a self adaptive threshold pseudo-labeling strategy, in which the thresholds for each class are dynamically adjusted to increase the number of reliable samples. Meanwhile, in order to effectively utilise unlabeled data with confidence below the thresholds, we propose an unreliable sample contrastive loss to mine the discriminative information in low-confidence samples by learning the similarities and differences between sample features. We evaluate our method on several classification benchmarks under partially labeled settings and demonstrate its superiority over other approaches.

* ICANN24 accepted

Multiscale lubrication simulation based on fourier feature networks with trainable frequency

May 21, 2024

Yihu Tang, Li Huang, Limin Wu, Xianghui Meng

Abstract:Rough surface lubrication simulation is crucial for designing and optimizing tribological performance. Despite the growing application of Physics-Informed Neural Networks (PINNs) in hydrodynamic lubrication analysis, their use has been primarily limited to smooth surfaces. This is because traditional PINN methods suffer from spectral bias: they favor learning low-frequency features and thus fail to analyze rough surfaces with high-frequency signals. To date, no PINN methods have been reported for rough surface lubrication. To overcome these limitations, this work introduces a novel multi-scale lubrication neural network architecture that utilizes a trainable Fourier feature network. By incorporating learnable feature embedding frequencies, this architecture automatically adapts to various frequency components, thereby enhancing the analysis of rough surface characteristics. This method has been tested across multiple surface morphologies, and the results have been compared with those obtained using the finite element method (FEM). The comparative analysis demonstrates that this approach achieves high consistency with FEM results. Furthermore, this novel architecture surpasses traditional Fourier feature networks with fixed feature embedding frequencies in both accuracy and computational efficiency. Consequently, the multi-scale lubrication neural network model offers a more efficient tool for rough surface lubrication analysis.
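To make the idea of learnable embedding frequencies in the abstract above concrete, here is a minimal one-dimensional sketch of a Fourier feature embedding together with its derivative with respect to the frequencies, which is the quantity a gradient-based optimizer would need in order to train them. The abstract does not give the network details, so the sin/cos layout, the function names, and the example frequencies are assumptions.

```python
import math


def fourier_features(x, freqs):
    """Embed a scalar x as [sin(w*x) for each w] followed by [cos(w*x) for each w]."""
    return [math.sin(w * x) for w in freqs] + [math.cos(w * x) for w in freqs]


def feature_grad_wrt_freqs(x, freqs):
    """Derivative of each feature with respect to its own frequency w.

    d/dw sin(w*x) = x*cos(w*x) and d/dw cos(w*x) = -x*sin(w*x).
    With fixed frequencies this gradient is simply never used.
    """
    return (
        [x * math.cos(w * x) for w in freqs]
        + [-x * math.sin(w * x) for w in freqs]
    )


x = 0.3
freqs = [1.0, 10.0]  # one low and one high frequency (illustrative values)
features = fourier_features(x, freqs)
grads = feature_grad_wrt_freqs(x, freqs)
```

Because the features are differentiable in the frequencies, those frequencies can be updated alongside the network weights, letting the embedding shift toward the high-frequency content that rough surfaces introduce.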