Jen-tse Huang

FAIRGAMER: Evaluating Biases in the Application of Large Language Models to Video Games

Aug 25, 2025
Via arXiv

Towards Evaluating Proactive Risk Awareness of Multimodal Language Models

May 23, 2025
Via arXiv

A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?

May 16, 2025
Via arXiv

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Apr 22, 2025
Via arXiv

CODECRASH: Stress Testing LLM Reasoning under Structural and Semantic Perturbations

Apr 19, 2025
Via arXiv

SOTOPIA-S4: a user-friendly system for flexible, customizable, and large-scale social simulation

Apr 19, 2025
Via arXiv

BIASINSPECTOR: Detecting Bias in Structured Data through LLM Agents

Apr 07, 2025
Via arXiv

Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench

Apr 01, 2025
Via arXiv

VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models

Mar 10, 2025
Via arXiv

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Feb 13, 2025
Via arXiv