Yuxuan Lu

Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation

Jul 28, 2025

Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning

Jul 23, 2025

Aligned Textual Scoring Rules

Jul 08, 2025

OPeRA: A Dataset of Observation, Persona, Rationale, and Action for Evaluating LLMs on Human Online Shopping Behavior Simulation

Jun 05, 2025

AgentA/B: Automated and Scalable Web A/B Testing with Interactive LLM Agents

Apr 13, 2025

UXAgent: A System for Simulating Usability Testing of Web Design with LLM Agents

Apr 13, 2025

Beyond Believability: Accurate Human Behavior Simulation with Fine-Tuned LLMs

Mar 27, 2025

UXAgent: An LLM Agent-Based Usability Testing Framework for Web Design

Feb 18, 2025

RECOVER: Designing a Large Language Model-based Remote Patient Monitoring System for Postoperative Gastrointestinal Cancer Care

Feb 09, 2025

Benchmarking LLMs' Judgments with No Gold Standard

Nov 11, 2024