Mert Yuksekgonul

TextGrad: Automatic "Differentiation" via Text

Jun 11, 2024

How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis

Feb 08, 2024

ChatGPT Exhibits Gender and Racial Biases in Acute Coronary Syndrome Management

Nov 10, 2023

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

Oct 24, 2023

Diversity of Thought Improves Reasoning Abilities of Large Language Models

Oct 11, 2023

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Sep 26, 2023

Beyond Confidence: Reliable Models Should Also Consider Atypicality

May 29, 2023

Discover and Cure: Concept-aware Mitigation of Spurious Correlation

May 01, 2023

GPT detectors are biased against non-native English writers

Apr 18, 2023

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis

Feb 01, 2023