Songyang Zhang

Learning Referring Video Object Segmentation from Weak Annotation

Aug 04, 2023
Wangbo Zhao, Kepan Nan, Songyang Zhang, Kai Chen, Dahua Lin, Yang You

Via arXiv

Improving Pixel-based MIM by Reducing Wasted Modeling Capability

Aug 01, 2023
Yuan Liu, Songyang Zhang, Jiacheng Chen, Zhaohui Yu, Kai Chen, Dahua Lin

Via arXiv

MMBench: Is Your Multi-modal Model an All-around Player?

Jul 26, 2023
Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin

Via arXiv

Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA

May 31, 2023
Ali Vosoughi, Shijian Deng, Songyang Zhang, Yapeng Tian, Chenliang Xu, Jiebo Luo

Via arXiv

Radiomap Inpainting for Restricted Areas based on Propagation Priority and Depth Map

May 24, 2023
Songyang Zhang, Tianhang Yu, Brian Choi, Feng Ouyang, Zhi Ding

Via arXiv

PS-FedGAN: An Efficient Federated Learning Framework Based on Partially Shared Generative Adversarial Networks For Data Privacy

May 19, 2023
Achintha Wijesinghe, Songyang Zhang, Zhi Ding

Via arXiv

TG-VQA: Ternary Game of Video Question Answering

May 18, 2023
Hao Li, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, Jie Chen

Via arXiv

Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation

Apr 18, 2023
Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin

Via arXiv

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

Apr 12, 2023
Jiahao Wang, Songyang Zhang, Yong Liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai Chen, Ping Luo, Dahua Lin

Via arXiv

PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling

Mar 04, 2023
Yuan Liu, Songyang Zhang, Jiacheng Chen, Kai Chen, Dahua Lin

Via arXiv