Kai-Wei Chang

VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation

Mar 09, 2025

Contrastive Visual Data Augmentation

Feb 24, 2025

METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling

Feb 24, 2025

FIG: Forward-Inverse Generation for Low-Resource Domain-specific Event Detection

Feb 24, 2025

Fact or Guesswork? Evaluating Large Language Model's Medical Knowledge with Structured One-Hop Judgment

Feb 20, 2025

LUME: LLM Unlearning with Multitask Evaluations

Feb 20, 2025

Enhancing LLM Character-Level Manipulation via Divide and Conquer

Feb 12, 2025

Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries

Feb 09, 2025

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

Feb 04, 2025

STIV: Scalable Text and Image Conditioned Video Generation

Dec 10, 2024