Teng Wang

ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge

Dec 23, 2025
Via arXiv

TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs

Dec 16, 2025
Via arXiv

ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

Nov 18, 2025
Via arXiv

FLYINGTRUST: A Benchmark for Quadrotor Navigation Across Scenarios and Vehicles

Oct 30, 2025
Via arXiv

UltraHiT: A Hierarchical Transformer Architecture for Generalizable Internal Carotid Artery Robotic Ultrasonography

Sep 17, 2025
Via arXiv

Predicting person-level injury severity using crash narratives: A balanced approach with roadway classification and natural language process techniques

Sep 09, 2025
Via arXiv

CVBench: Evaluating Cross-Video Synergies for Complex Multimodal Understanding and Reasoning

Aug 28, 2025
Via arXiv

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Aug 27, 2025
Via arXiv

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

Jul 28, 2025
Via arXiv

SAGE: Strategy-Adaptive Generation Engine for Query Rewriting

Jun 24, 2025
Via arXiv