Wang Lin

Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning

Apr 22, 2025

Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens

Apr 20, 2025

Online Controller Synthesis for Robot Collision Avoidance: A Case Study

Feb 08, 2025

Low-rank Prompt Interaction for Continual Vision-Language Retrieval

Jan 24, 2025

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Dec 13, 2024

Bridging the Gap for Test-Time Multimodal Sentiment Analysis

Dec 10, 2024

Semantic Alignment for Multimodal Large Language Models

Aug 23, 2024

Instruction Tuning-free Visual Token Complement for Multimodal LLMs

Aug 09, 2024

EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration

Jun 20, 2024

Non-confusing Generation of Customized Concepts in Diffusion Models

May 11, 2024