James Zou

Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

May 06, 2024

Talking Nonsense: Probing Large Language Models' Understanding of Adversarial Gibberish Inputs

Apr 29, 2024

Optimizing Calibration by Gaining Aware of Prediction Correctness

Apr 19, 2024

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

Apr 19, 2024

How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs' internal prior

Apr 16, 2024

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems

Mar 04, 2024

Simple linear attention language models balance the recall-throughput tradeoff

Feb 28, 2024

Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content

Feb 21, 2024

Prospector Heads: Generalized Feature Attribution for Large Models & Data

Feb 18, 2024

How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis

Feb 08, 2024