Jiasen Lu

UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation

May 20, 2025
Via arXiv

GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing

May 16, 2025
Via arXiv

SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding

Mar 27, 2025
Via arXiv

STIV: Scalable Text and Image Conditioned Video Generation

Dec 10, 2024
Via arXiv

One Diffusion to Generate Them All

Nov 25, 2024
Via arXiv

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities

Nov 07, 2024
Via arXiv

MM-Ego: Towards Building Egocentric Multimodal LLMs

Oct 09, 2024
Via arXiv

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Sep 25, 2024
Via arXiv

SoupLM: Model Integration in Large Language and Multi-Modal Models

Jul 11, 2024
Via arXiv

Preserving Identity with Variational Score for General-purpose 3D Editing

Jun 13, 2024
Via arXiv