Picture for Renrui Zhang

Renrui Zhang

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Add code
Jan 23, 2025
Figure 1 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Figure 2 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Figure 3 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Figure 4 for Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Viaarxiv icon

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

Add code
Dec 22, 2024
Figure 1 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Figure 2 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Figure 3 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Figure 4 for TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Viaarxiv icon

Chimera: Improving Generalist Model with Domain-Specific Experts

Add code
Dec 08, 2024
Viaarxiv icon

Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation

Add code
Nov 27, 2024
Figure 1 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Figure 2 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Figure 3 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Figure 4 for Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Viaarxiv icon

Point Cloud Understanding via Attention-Driven Contrastive Learning

Add code
Nov 22, 2024
Figure 1 for Point Cloud Understanding via Attention-Driven Contrastive Learning
Figure 2 for Point Cloud Understanding via Attention-Driven Contrastive Learning
Figure 3 for Point Cloud Understanding via Attention-Driven Contrastive Learning
Figure 4 for Point Cloud Understanding via Attention-Driven Contrastive Learning
Viaarxiv icon

Training-free Regional Prompting for Diffusion Transformers

Add code
Nov 04, 2024
Viaarxiv icon

CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection

Add code
Oct 10, 2024
Figure 1 for CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection
Figure 2 for CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection
Figure 3 for CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection
Figure 4 for CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection
Viaarxiv icon

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

Add code
Aug 29, 2024
Figure 1 for SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners
Figure 2 for SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners
Figure 3 for SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners
Figure 4 for SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners
Viaarxiv icon

LLaVA-OneVision: Easy Visual Task Transfer

Add code
Aug 06, 2024
Figure 1 for LLaVA-OneVision: Easy Visual Task Transfer
Figure 2 for LLaVA-OneVision: Easy Visual Task Transfer
Figure 3 for LLaVA-OneVision: Easy Visual Task Transfer
Figure 4 for LLaVA-OneVision: Easy Visual Task Transfer
Viaarxiv icon

MAVIS: Mathematical Visual Instruction Tuning

Add code
Jul 11, 2024
Figure 1 for MAVIS: Mathematical Visual Instruction Tuning
Figure 2 for MAVIS: Mathematical Visual Instruction Tuning
Figure 3 for MAVIS: Mathematical Visual Instruction Tuning
Figure 4 for MAVIS: Mathematical Visual Instruction Tuning
Viaarxiv icon