Avi Schwarzschild

Forcing Diffuse Distributions out of Language Models

Apr 16, 2024
Via arXiv

Benchmarking ChatGPT on Algorithmic Reasoning

Apr 04, 2024
Via arXiv

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

Jan 22, 2024
Via arXiv

TOFU: A Task of Fictitious Unlearning for LLMs

Jan 11, 2024
Via arXiv

Effective Backdoor Mitigation Depends on the Pre-training Objective

Dec 05, 2023
Via arXiv

NEFTune: Noisy Embeddings Improve Instruction Finetuning

Oct 10, 2023
Via arXiv

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

Sep 04, 2023
Via arXiv

A Cookbook of Self-Supervised Learning

Apr 24, 2023
Via arXiv

Reckoning with the Disagreement Problem: Explanation Consensus as a Training Objective

Mar 23, 2023
Via arXiv

Neural Auctions Compromise Bidder Information

Feb 28, 2023
Via arXiv