Yuheng Huang

Risk Assessment Framework for Code LLMs via Leveraging Internal States

Apr 20, 2025
Via arXiv

No More Tuning: Prioritized Multi-Task Learning with Lagrangian Differential Multiplier Methods

Dec 16, 2024
Via arXiv

Towards Understanding Retrieval Accuracy and Prompt Quality in RAG Systems

Nov 29, 2024
Via arXiv

LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation

Oct 07, 2024
Via arXiv

LeCov: Multi-level Testing Criteria for Large Language Models

Aug 20, 2024
Via arXiv

Active Testing of Large Language Model via Multi-Stage Sampling

Aug 07, 2024
Via arXiv

Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture

Jul 10, 2024
Via arXiv

Vortex under Ripplet: An Empirical Study of RAG-enabled Applications

Jul 06, 2024
Via arXiv

Enhancing Fault Detection for Large Language Models via Mutation-Based Confidence Smoothing

Apr 14, 2024
Via arXiv

Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward

Apr 12, 2024
Via arXiv