Picture for Alex Gu

Alex Gu

Solving Inequality Proofs with Large Language Models

Add code
Jun 09, 2025
Viaarxiv icon

Challenges and Paths Towards AI for Software Engineering

Add code
Mar 28, 2025
Viaarxiv icon

Mixture of Parrots: Experts improve memorization more than reasoning

Add code
Oct 24, 2024
Figure 1 for Mixture of Parrots: Experts improve memorization more than reasoning
Figure 2 for Mixture of Parrots: Experts improve memorization more than reasoning
Figure 3 for Mixture of Parrots: Experts improve memorization more than reasoning
Figure 4 for Mixture of Parrots: Experts improve memorization more than reasoning
Viaarxiv icon

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Add code
Jun 26, 2024
Figure 1 for BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Figure 2 for BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Figure 3 for BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Figure 4 for BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Viaarxiv icon

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

Add code
Mar 12, 2024
Viaarxiv icon

StarCoder 2 and The Stack v2: The Next Generation

Add code
Feb 29, 2024
Figure 1 for StarCoder 2 and The Stack v2: The Next Generation
Figure 2 for StarCoder 2 and The Stack v2: The Next Generation
Figure 3 for StarCoder 2 and The Stack v2: The Next Generation
Figure 4 for StarCoder 2 and The Stack v2: The Next Generation
Viaarxiv icon

The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

Add code
Feb 29, 2024
Viaarxiv icon

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

Add code
Jan 05, 2024
Viaarxiv icon

Language Agnostic Code Embeddings

Add code
Oct 25, 2023
Viaarxiv icon

LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

Add code
Oct 23, 2023
Viaarxiv icon