Shing-Chi Cheung

ModelWisdom: An Integrated Toolkit for TLA+ Model Visualization, Digest and Repair

Feb 12, 2026

An Empirical Study of Bugs in Data Visualization Libraries

Jun 18, 2025

Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models

May 29, 2025

IP Leakage Attacks Targeting LLM-Based Multi-Agent Systems

May 18, 2025

Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval

Feb 26, 2025

From Informal to Formal -- Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs

Jan 27, 2025

How Should I Build A Benchmark?

Jan 18, 2025

CODECLEANER: Elevating Standards with A Robust Data Contamination Mitigation Toolkit

Nov 16, 2024

CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution

Aug 23, 2024

DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation

Aug 23, 2024