Chirag Agarwal

On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models

Jun 15, 2024

Towards Safe and Aligned Large Language Models for Medicine

Mar 06, 2024

Understanding the Effects of Iterative Prompting on Truthfulness

Feb 09, 2024

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models

Feb 08, 2024

Quantifying Uncertainty in Natural Language Explanations of Large Language Models

Nov 06, 2023

Are Large Language Models Post Hoc Explainers?

Oct 10, 2023

On the Trade-offs between Adversarial Robustness and Actionable Explanations

Sep 28, 2023

Certifying LLM Safety against Adversarial Prompting

Sep 06, 2023

Counterfactual Explanation Policies in RL

Jul 25, 2023

Explaining RL Decisions with Trajectories

May 06, 2023