Shijie Geng

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models

Feb 08, 2024
Peng Gao, Renrui Zhang, Chris Liu, Longtian Qiu, Siyuan Huang, Weifeng Lin, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li, Yu Qiao

Via arXiv

Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs

Sep 27, 2023
Haonan Chang, Kowndinya Boyalakuntla, Shiyang Lu, Siwei Cai, Eric Jing, Shreesh Keskar, Shijie Geng, Adeeb Abbas, Lifeng Zhou, Kostas Bekris, Abdeslam Boularias

Via arXiv

VIP5: Towards Multimodal Foundation Models for Recommendation

May 23, 2023
Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang

Via arXiv

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

Apr 28, 2023
Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei Zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao

Via arXiv

Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens

Mar 27, 2023
Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N. Metaxas, Hongxia Yang

Via arXiv

HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention

Mar 06, 2023
Shijie Geng, Jianbo Yuan, Yu Tian, Yuxiao Chen, Yongfeng Zhang

Via arXiv

Mono-STAR: Mono-camera Scene-level Tracking and Reconstruction

Jan 30, 2023
Haonan Chang, Dhruv Metha Ramesh, Shijie Geng, Yuqiu Gan, Abdeslam Boularias

Via arXiv

Frozen CLIP Models are Efficient Video Learners

Aug 06, 2022
Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li

Via arXiv

Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning

Jul 20, 2022
Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas

Via arXiv

Explainable Fairness in Recommendation

Apr 24, 2022
Yingqiang Ge, Juntao Tan, Yan Zhu, Yinglong Xia, Jiebo Luo, Shuchang Liu, Zuohui Fu, Shijie Geng, Zelong Li, Yongfeng Zhang

Via arXiv