Bin Zhu

Preference Optimization for Combinatorial Optimization Problems

May 13, 2025
Via arXiv

Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion

May 13, 2025
Via arXiv

Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs

Apr 13, 2025
Via arXiv

Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey

Mar 28, 2025
Via arXiv

SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video

Mar 12, 2025
Via arXiv

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Mar 10, 2025
Via arXiv

OSCAR: Object Status and Contextual Awareness for Recipes to Support Non-Visual Cooking

Mar 07, 2025
Via arXiv

HD-EPIC: A Highly-Detailed Egocentric Video Dataset

Feb 06, 2025
Via arXiv

Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation

Jan 31, 2025
Via arXiv

Next Patch Prediction for Autoregressive Visual Generation

Dec 19, 2024
Via arXiv