Jun Sakuma

University of Tsukuba

When Benchmarks Leak: Inference-Time Decontamination for LLMs

Jan 27, 2026

One Leak Away: How Pretrained Model Exposure Amplifies Jailbreak Risks in Finetuned LLMs

Dec 14, 2025

Toward Safer Diffusion Language Models: Discovery and Mitigation of Priming Vulnerability

Oct 01, 2025

Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion

Mar 02, 2025

BADTV: Unveiling Backdoor Threats in Third-Party Task Vectors

Jan 04, 2025

Parameter Matching Attack: Enhancing Practical Applicability of Availability Attacks

Jul 02, 2024

Zero-shot domain adaptation based on dual-level mix and contrast

Jun 27, 2024

Behavior-Targeted Attack on Reinforcement Learning with Limited Access to Victim's Policy

Jun 06, 2024

Adversarial Attacks on Hidden Tasks in Multi-Task Learning

May 28, 2024

Harnessing the Power of Vicinity-Informed Analysis for Classification under Covariate Shift

May 27, 2024