Hongjie Zhang

Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings

May 29, 2025 (via arXiv)

Weakly Supervised Temporal Sentence Grounding via Positive Sample Mining

May 10, 2025 (via arXiv)

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Apr 15, 2025 (via arXiv)

ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting

Apr 02, 2025 (via arXiv)

Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning

Aug 27, 2024 (via arXiv)

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

Mar 24, 2024 (via arXiv)

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Mar 22, 2024 (via arXiv)

MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding

Dec 08, 2023 (via arXiv)

Multi-view Feature Extraction based on Triple Contrastive Heads

Mar 22, 2023 (via arXiv)

Multi-view Feature Extraction based on Dual Contrastive Head

Feb 08, 2023 (via arXiv)