Yansong Tang

Fully Aligned Network for Referring Image Segmentation

Sep 29, 2024
Via arXiv

Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model

Aug 01, 2024
Via arXiv

Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

Jul 15, 2024
Via arXiv

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

Jul 11, 2024
Via arXiv

Hierarchical Memory for Long Video QA

Jun 30, 2024
Via arXiv

LIPE: Learning Personalized Identity Prior for Non-rigid Image Editing

Jun 25, 2024
Via arXiv

GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation

Jun 21, 2024
Via arXiv

VoCo-LLaMA: Towards Vision Compression with Large Language Models

Jun 18, 2024
Via arXiv

Localizing Events in Videos with Multimodal Queries

Jun 14, 2024
Via arXiv

Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams

Jun 12, 2024
Via arXiv