Carolyn Rose

PBEBench: A Multi-Step Programming by Examples Reasoning Benchmark inspired by Historical Linguistics

May 29, 2025

An Empirical Study on Strong-Weak Model Collaboration for Repo-level Code Generation

May 26, 2025

Where is this coming from? Making groundedness count in the evaluation of Document VQA models

Mar 24, 2025

RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing

Mar 10, 2025

Programming by Examples Meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction

Jan 27, 2025

Improving Model Factuality with Fine-grained Critique-based Evaluator

Oct 24, 2024

CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells

Sep 29, 2024

CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks

Mar 31, 2024

Data Augmentation for Code Translation with Comparable Corpora and Multiple References

Nov 01, 2023

Linguistic representations for fewer-shot relation extraction across domains

Jul 07, 2023