Chitta Baral

Shammie

REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models

Aug 05, 2024
Via arXiv

Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?

Jul 20, 2024
Via arXiv

UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization

Jul 03, 2024
Via arXiv

Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models

Jun 24, 2024
Via arXiv

Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation

Jun 08, 2024
Via arXiv

ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints

Jun 06, 2024
Via arXiv

Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies

Jun 06, 2024
Via arXiv

Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization

May 26, 2024
Via arXiv

Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images

May 24, 2024
Via arXiv

Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

Apr 23, 2024
Via arXiv