
Bo Zhang

An Event-Oriented Diffusion-Refinement Method for Sparse Events Completion

Jan 06, 2024
Via arXiv

MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices

Dec 30, 2023
Via arXiv

Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction

Dec 21, 2023
Via arXiv

Towards Knowledge-driven Autonomous Driving

Dec 12, 2023
Via arXiv

Transferring CLIP's Knowledge into Zero-Shot Point Cloud Semantic Segmentation

Dec 12, 2023
Via arXiv

Lenna: Language Enhanced Reasoning Detection Assistant

Dec 05, 2023
Via arXiv

Masked Autoencoders Are Robust Neural Architecture Search Learners

Nov 20, 2023
Via arXiv

A Speed Odyssey for Deployable Quantization of LLMs

Nov 16, 2023
Via arXiv

MultiSPANS: A Multi-range Spatial-Temporal Transformer Network for Traffic Forecast via Structural Entropy Optimization

Nov 06, 2023
Via arXiv

A Transformer-Based Model With Self-Distillation for Multimodal Emotion Recognition in Conversations

Oct 31, 2023
Via arXiv