Yu-Gang Jiang

Fake Alignment: Are LLMs Really Aligned Well?

Nov 14, 2023
Yixu Wang, Yan Teng, Kexin Huang, Chengqi Lyu, Songyang Zhang, Wenwei Zhang, Xingjun Ma, Yu-Gang Jiang, Yu Qiao, Yingchun Wang

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

Nov 13, 2023
Junke Wang, Lingchen Meng, Zejia Weng, Bo He, Zuxuan Wu, Yu-Gang Jiang

Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection

Oct 18, 2023
Lingchen Meng, Xiyang Dai, Jianwei Yang, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Yi-Ling Chen, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang

A Survey on Video Diffusion Models

Oct 16, 2023
Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang

Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data

Oct 08, 2023
Zuxuan Wu, Zejia Weng, Wujian Peng, Xitong Yang, Ang Li, Larry S. Davis, Yu-Gang Jiang

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation

Sep 07, 2023
Jiaxi Gu, Shicong Wang, Haoyu Zhao, Tianyi Lu, Xing Zhang, Zuxuan Wu, Songcen Xu, Wei Zhang, Yu-Gang Jiang, Hang Xu

SimDA: Simple Diffusion Adapter for Efficient Video Generation

Aug 18, 2023
Zhen Xing, Qi Dai, Han Hu, Zuxuan Wu, Yu-Gang Jiang

On the Importance of Spatial Relations for Few-shot Action Recognition

Aug 14, 2023
Yilun Zhang, Yuqian Fu, Xingjun Ma, Lizhe Qi, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang

Context Perception Parallel Decoder for Scene Text Recognition

Jul 23, 2023
Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang

Prompting Large Language Models to Reformulate Queries for Moment Localization

Jun 06, 2023
Wenfeng Yan, Shaoxiang Chen, Zuxuan Wu, Yu-Gang Jiang
