Michael R. Lyu

CODECRASH: Stress Testing LLM Reasoning under Structural and Semantic Perturbations

Apr 19, 2025
Via arXiv

Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries

Feb 09, 2025
Via arXiv

How Should I Build A Benchmark?

Jan 18, 2025
Via arXiv

MRWeb: An Exploration of Generating Multi-Page Resource-Aware Web Code from UI Designs

Dec 19, 2024
Via arXiv

XRZoo: A Large-Scale and Versatile Dataset of Extended Reality (XR) Applications

Dec 10, 2024
Via arXiv

C$^2$LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation

Dec 06, 2024
Via arXiv

On the Shortcut Learning in Multilingual Neural Machine Translation

Nov 15, 2024
Via arXiv

Interaction2Code: How Far Are We From Automatic Interactive Webpage Generation?

Nov 05, 2024
Via arXiv

Enhancing Temporal Modeling of Video LLMs via Time Gating

Oct 08, 2024
Via arXiv

Learning to Ask: When LLMs Meet Unclear Instruction

Aug 31, 2024
Via arXiv