Armando Solar-Lezama

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

Mar 12, 2024
Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

Feb 29, 2024
Alex Gu, Wen-Ding Li, Naman Jain, Theo X. Olausson, Celine Lee, Koushik Sen, Armando Solar-Lezama

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

Jan 05, 2024
Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang

LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

Oct 23, 2023
Theo X. Olausson, Alex Gu, Benjamin Lipkin, Cedegao E. Zhang, Armando Solar-Lezama, Joshua B. Tenenbaum, Roger Levy

Learning a Hierarchical Planner from Humans in Multiple Generations

Oct 17, 2023
Leonardo Hernandez Cano, Yewen Pu, Robert D. Hawkins, Josh Tenenbaum, Armando Solar-Lezama

Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models

Jun 24, 2023
Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando Solar-Lezama, Iddo Drori

Demystifying GPT Self-Repair for Code Generation

Jun 22, 2023
Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama
