Sucheng Ren

Causal Image Modeling for Efficient Visual Understanding

Oct 10, 2024

What If We Recaption Billions of Web Images with LLaMA-3?

Jun 12, 2024

Autoregressive Pretraining with Mamba in Vision

Jun 11, 2024

Medical Vision Generalist: Unifying Medical Imaging Tasks in Context

Jun 08, 2024

ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning

May 24, 2024

Mamba-R: Vision Mamba ALSO Needs Registers

May 23, 2024

Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation

Mar 11, 2024

Compress & Align: Curating Image-Text Data with Human Knowledge

Dec 13, 2023

Rejuvenating image-GPT as Strong Visual Representation Learners

Dec 04, 2023

NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos

Aug 23, 2023