Richard Zemel

Replay Can Provably Increase Forgetting

Jun 04, 2025
Via arXiv

Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation

May 27, 2025
Via arXiv

Adaptive Elicitation of Latent Information Using Natural Language

Apr 05, 2025
Via arXiv

Towards Effective Discrimination Testing for Generative AI

Dec 30, 2024
Via arXiv

Improving Predictor Reliability with Selective Recalibration

Oct 07, 2024
Via arXiv

Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification

Oct 07, 2024
Via arXiv

Controlling the World by Sleight of Hand

Aug 13, 2024
Via arXiv

Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities

Jun 20, 2024
Via arXiv

Integrating Present and Past in Unsupervised Continual Learning

Apr 29, 2024
Via arXiv

Toward Informal Language Processing: Knowledge of Slang in Large Language Models

Apr 13, 2024
Via arXiv