Rui Qian

Rethinking Image-to-Video Adaptation: An Object-centric Perspective

Jul 09, 2024
Via arXiv

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Jul 03, 2024
Via arXiv

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

May 27, 2024
Via arXiv

Streaming Long Video Understanding with Large Language Models

May 25, 2024
Via arXiv

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation

Feb 27, 2024
Via arXiv

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024
Via arXiv

Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation

Nov 29, 2023
Via arXiv

Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos

Aug 19, 2023
Via arXiv

Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation

Aug 08, 2023
Via arXiv

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

Mar 18, 2023
Via arXiv