Chen Chen

University of Central Florida, Institute of Artificial Intelligence, Orlando, FL, USA

AToken: A Unified Tokenizer for Vision

Sep 19, 2025

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Sep 19, 2025

EvoEmpirBench: Dynamic Spatial Reasoning with Agent-ExpVer

Sep 16, 2025

Lethe: Purifying Backdoored Large Language Models with Knowledge Dilution

Aug 28, 2025

UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models

Aug 27, 2025

Seeing Further on the Shoulders of Giants: Knowledge Inheritance for Vision Foundation Models

Aug 20, 2025

FakeHunter: Multimodal Step-by-Step Reasoning for Explainable Video Forensics

Aug 20, 2025

UniSTFormer: Unified Spatio-Temporal Lightweight Transformer for Efficient Skeleton-Based Action Recognition

Aug 12, 2025

X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning

Aug 11, 2025

ATR-UMMIM: A Benchmark Dataset for UAV-Based Multimodal Image Registration under Complex Imaging Conditions

Jul 28, 2025