Hongsheng Li

MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More

Oct 08, 2024

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models

Sep 30, 2024

MedViLaM: A multimodal large language model with advanced generalizability and explainability for medical data understanding and generation

Sep 29, 2024

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation

Sep 26, 2024

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

Aug 28, 2024

GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars

Aug 24, 2024

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Aug 05, 2024

MAVIS: Mathematical Visual Instruction Tuning

Jul 11, 2024

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

Jul 06, 2024

Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning

Jul 02, 2024