Julien Piet

Toxicity Detection for Free

May 29, 2024
Via arXiv

Jatmo: Prompt Injection Defense by Task-Specific Finetuning

Jan 08, 2024
Via arXiv

Mark My Words: Analyzing and Evaluating Language Model Watermarks

Dec 07, 2023
Via arXiv

Asymmetric Certified Robustness via Feature-Convex Neural Networks

Feb 03, 2023
Via arXiv