Zhiqiang Shen

Northeastern University, Shenyang, China, Key Laboratory of Intelligent Computing in Medical Image, Shenyang, China

Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization

Jun 06, 2025

VideoMolmo: Spatio-Temporal Grounding Meets Pointing

Jun 05, 2025

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

May 30, 2025

Time Blindness: Why Video-Language Models Can't See What Humans Can?

May 30, 2025

Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos

Apr 07, 2025

Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark

Mar 26, 2025

A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1

Mar 13, 2025

Style Content Decomposition-based Data Augmentation for Domain Generalizable Medical Image Segmentation

Feb 28, 2025

KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

Feb 20, 2025

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Feb 11, 2025