Wei Dai

ViDove: A Translation Agent System with Multimodal Context and Memory-Augmented Reasoning

Jul 09, 2025

Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding

Jun 24, 2025

Compositional Attribute Imbalance in Vision Datasets

Jun 17, 2025

PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts

Jun 06, 2025

CogMath: Assessing LLMs' Authentic Mathematical Ability from a Human Cognitive Perspective

Jun 04, 2025

CAPE: Context-Aware Prompt Perturbation Mechanism with Differential Privacy

May 09, 2025

NuExo: A Wearable Exoskeleton Covering all Upper Limb ROM for Outdoor Data Collection and Teleoperation of Humanoid Robots

Mar 13, 2025

Towards Fine-Grained Video Question Answering

Mar 10, 2025

Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning

Mar 09, 2025

Data Foundations for Large Scale Multimodal Clinical Foundation Models

Mar 09, 2025