Jinqiao Wang

Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, objecteye.Inc

VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?

Jun 13, 2025
Via arXiv

Understand, Think, and Answer: Advancing Visual Reasoning with Large Multimodal Models

May 27, 2025
Via arXiv

MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection

Apr 17, 2025
Via arXiv

PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

Mar 13, 2025
Via arXiv

LightPlanner: Unleashing the Reasoning Capabilities of Lightweight Large Language Models in Task Planning

Mar 11, 2025
Via arXiv

Synthetic Data is an Elegant GIFT for Continual Vision-Language Models

Mar 06, 2025
Via arXiv

FLARE: A Framework for Stellar Flare Forecasting using Stellar Physical Properties and Historical Records

Feb 25, 2025
Via arXiv

A Benchmark for Crime Surveillance Video Analysis with Large Models

Feb 13, 2025
Via arXiv

Systematic Outliers in Large Language Models

Feb 10, 2025
Via arXiv

MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark

Jan 28, 2025
Via arXiv