Xiang Chen

Adobe Research

RoboGrasp: A Universal Grasping Policy for Robust Robotic Control

Feb 05, 2025

HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding

Jan 25, 2025

Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness

Jan 14, 2025

Data and System Perspectives of Sustainable Artificial Intelligence

Jan 13, 2025

LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding

Jan 09, 2025

Less is More: Towards Green Code Large Language Models via Unified Structural Pruning

Dec 20, 2024

Threshold Neuron: A Brain-inspired Artificial Neuron for Efficient On-device Inference

Dec 18, 2024

Numerical Pruning for Efficient Autoregressive Models

Dec 17, 2024

Depth-Centric Dehazing and Depth-Estimation from Real-World Hazy Driving Video

Dec 16, 2024

LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation

Dec 05, 2024