
Sanghyun Hong

Discovering Universal Activation Directions for PII Leakage in Language Models

Feb 19, 2026

Fail-Closed Alignment for Large Language Models

Feb 19, 2026

Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents

Aug 23, 2025

MADCAT: Combating Malware Detection Under Concept Drift with Test-Time Adaptation

May 24, 2025

Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models

May 01, 2025

Hessian-aware Training for Enhancing DNNs Resilience to Parameter Corruptions

Apr 02, 2025

Understanding and Mitigating Membership Inference Risks of Neural Ordinary Differential Equations

Jan 12, 2025

PrisonBreak: Jailbreaking Large Language Models with Fewer Than Twenty-Five Targeted Bit-flips

Dec 10, 2024

SoK: Watermarking for AI-Generated Content

Nov 27, 2024

You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models

Oct 26, 2024