Guang Chen

Vidi: Large Multimodal Models for Video Understanding and Editing

Apr 22, 2025

Beyond Intermediate States: Explaining Visual Redundancy through Language

Mar 26, 2025

ChatBEV: A Visual Language Model that Understands BEV Maps

Mar 21, 2025

EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models

Mar 14, 2025

Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition

Feb 17, 2025

Generative Multi-Agent Collaboration in Embodied AI: A Systematic Review

Feb 17, 2025

GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos

Dec 03, 2024

Towards Low-Resource Harmful Meme Detection with LMM Agents

Nov 08, 2024

AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow

Sep 27, 2024

WPN: An Unlearning Method Based on N-pair Contrastive Learning in Language Models

Aug 18, 2024