
Longteng Guo

VL-Mamba: Exploring State Space Models for Multimodal Learning

Mar 20, 2024
Yanyuan Qiao, Zheng Yu, Longteng Guo, Sihan Chen, Zijia Zhao, Mingzhen Sun, Qi Wu, Jing Liu

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Mar 20, 2024
Tongtian Yue, Jie Cheng, Longteng Guo, Xingyuan Dai, Zijia Zhao, Xingjian He, Gang Xiong, Yisheng Lv, Jing Liu

Knowledge Condensation and Reasoning for Knowledge-based VQA

Mar 15, 2024
Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo Wang, Quan Chen, Han Li, Jing Liu

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Dec 13, 2023
Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

Aug 23, 2023
Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, Zehuan Yuan, Liang Lin, Dongyu Zhang

ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst

May 25, 2023
Zijia Zhao, Longteng Guo, Tongtian Yue, Sihan Chen, Shuai Shao, Xinxin Zhu, Zehuan Yuan, Jing Liu

Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner

May 19, 2023
Zikang Liu, Sihan Chen, Longteng Guo, Handong Li, Xingjian He, Jing Liu

VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

Apr 17, 2023
Sihan Chen, Xingjian He, Longteng Guo, Xinxin Zhu, Weining Wang, Jinhui Tang, Jing Liu

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning

Oct 09, 2022
Zijia Zhao, Longteng Guo, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu

OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation

Jul 06, 2021
Jing Liu, Xinxin Zhu, Fei Liu, Longteng Guo, Zijia Zhao, Mingzhen Sun, Weining Wang, Hanqing Lu, Shiyu Zhou, Jiajun Zhang, Jinqiao Wang
