Chen Sun

Learning Visual Grounding from Generative Vision and Language Model

Jul 18, 2024
Via arXiv

Potential Based Diffusion Motion Planning

Jul 08, 2024
Via arXiv

Text-Aware Diffusion for Policy Learning

Jul 02, 2024
Via arXiv

Multi-Beam Integrated Sensing and Communication: State-of-the-Art, Challenges and Opportunities

May 31, 2024
Via arXiv

Pre-trained Vision-Language Models Learn Discoverable Visual Concepts

Apr 19, 2024
Via arXiv

Precoder Design for User-Centric Network Massive MIMO with Matrix Manifold Optimization

Apr 11, 2024
Via arXiv

Self-Correcting Self-Consuming Loops for Generative Model Training

Feb 11, 2024
Via arXiv

Pixel Aligned Language Models

Dec 14, 2023
Via arXiv

Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding in Novel Domains

Nov 30, 2023
Via arXiv

Vamos: Versatile Action Models for Video Understanding

Nov 22, 2023
Via arXiv