Wei Zhang

Alibaba Group

Real-time Stereo-based 3D Object Detection for Streaming Perception

Oct 16, 2024

Dual-AEB: Synergizing Rule-Based and Multimodal Large Language Models for Effective Emergency Braking

Oct 11, 2024

The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models

Oct 09, 2024

ERCache: An Efficient and Reliable Caching Framework for Large-Scale User Representations in Meta's Ads System

Oct 09, 2024

As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss

Oct 07, 2024

X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation

Sep 28, 2024

Evaluation of OpenAI o1: Opportunities and Challenges of AGI

Sep 27, 2024

General Compression Framework for Efficient Transformer Object Tracking

Sep 26, 2024

Dataset Distillation-based Hybrid Federated Learning on Non-IID Data

Sep 26, 2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024