Md Rafi ur Rashid

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities

Mar 16, 2026
Via arXiv

Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing

Mar 20, 2025
Via arXiv

SequentialBreak: Large Language Models Can be Fooled by Embedding Jailbreak Prompts into Sequential Prompt Chains

Nov 10, 2024
Via arXiv