
Jiwen Lu

Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models

Mar 21, 2024
Via arXiv

Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution

Mar 16, 2024
Via arXiv

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

Mar 13, 2024
Via arXiv

Memory-based Adapters for Online 3D Scene Perception

Mar 11, 2024
Via arXiv

MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer

Mar 05, 2024
Via arXiv

Path Choice Matters for Clear Attribution in Path Methods

Jan 19, 2024
Via arXiv

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Jan 18, 2024
Via arXiv

ThinkBot: Embodied Instruction Following with Thought Chain Reasoning

Dec 14, 2023
Via arXiv

OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields

Dec 14, 2023
Via arXiv

Segment and Caption Anything

Dec 01, 2023
Via arXiv