Yiqing Xie

An Empirical Study on Strong-Weak Model Collaboration for Repo-level Code Generation

May 26, 2025
Via arXiv

RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing

Mar 10, 2025
Via arXiv

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Dec 18, 2024
Via arXiv

Improving Model Factuality with Fine-grained Critique-based Evaluator

Oct 24, 2024
Via arXiv

CodeRAG-Bench: Can Retrieval Augment Code Generation?

Jun 20, 2024
Via arXiv

CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks

Mar 31, 2024
Via arXiv

Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries

Mar 01, 2024
Via arXiv

Enhancing Medical Text Evaluation with GPT-4

Nov 16, 2023
Via arXiv

Data Augmentation for Code Translation with Comparable Corpora and Multiple References

Nov 01, 2023
Via arXiv

Learning Task Skills and Goals Simultaneously from Physical Interaction

Sep 08, 2023
Via arXiv