Tom Goldstein

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Jun 27, 2024

GenQA: Generating Millions of Instructions from a Handful of Prompts

Jun 14, 2024

From Pixels to Prose: A Large Dataset of Dense Image Captions

Jun 14, 2024

PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

Jun 14, 2024

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Jun 14, 2024

OPTune: Efficient Online Preference Tuning

Jun 11, 2024

The CLRS-Text Algorithmic Reasoning Language Benchmark

Jun 06, 2024

Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement

May 29, 2024

Transformers Can Do Arithmetic with the Right Embeddings

May 27, 2024

CinePile: A Long Video Question Answering Dataset and Benchmark

May 14, 2024