Yue Wu

M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking

Dec 11, 2023
Via arXiv

Drag-A-Video: Non-rigid Video Editing with Point-based Interaction

Dec 05, 2023
Via arXiv

Language Grounded QFormer for Efficient Vision Language Understanding

Nov 13, 2023
Via arXiv

Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models

Oct 23, 2023
Via arXiv

PixArt-$α$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Oct 16, 2023
Via arXiv

SmartPlay: A Benchmark for LLMs as Intelligent Agents

Oct 04, 2023
Via arXiv

Large Language Models Can Be Good Privacy Protection Learners

Oct 03, 2023
Via arXiv

Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits

Oct 02, 2023
Via arXiv

AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections

Sep 05, 2023
Via arXiv

FashionNTM: Multi-turn Fashion Image Retrieval via Cascaded Memory

Aug 20, 2023
Via arXiv