Shoufa Chen

MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents

Jun 12, 2024

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Jun 10, 2024

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

Feb 25, 2024

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation

Dec 07, 2023

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

Oct 09, 2023

Enhancing Your Trained DETRs with Box Refinement

Jul 21, 2023

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Jul 07, 2023

Going Denser with Open-Vocabulary Part Segmentation

May 18, 2023

InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

May 11, 2023

Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning

Mar 30, 2023