Eric Wong

Towards Compositionality in Concept Learning

Jun 26, 2024
Via arXiv

Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

Jun 21, 2024
Via arXiv

Avoiding Copyright Infringement via Machine Unlearning

Jun 16, 2024
Via arXiv

Data-Efficient Learning with Neural Programs

Jun 10, 2024
Via arXiv

DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation

Jun 02, 2024
Via arXiv

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

Mar 28, 2024
Via arXiv

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

Feb 28, 2024
Via arXiv

Initialization Matters for Adversarial Transfer Learning

Dec 10, 2023
Via arXiv

Sum-of-Parts Models: Faithful Attributions for Groups of Features

Oct 25, 2023
Via arXiv

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Oct 19, 2023
Via arXiv