Haotian Zhang

Tiny-WiFo: A Lightweight Wireless Foundation Model for Channel Prediction via Multi-Component Adaptive Knowledge Distillation

Nov 06, 2025

Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents

Sep 30, 2025

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Sep 19, 2025

FoBa: A Foreground-Background co-Guided Method and New Benchmark for Remote Sensing Semantic Change Detection

Sep 19, 2025

Scaling Learned Image Compression Models up to 1 Billion

Aug 12, 2025

LARGO: Low-Rank Regulated Gradient Projection for Robust Parameter Efficient Fine-Tuning

Jun 14, 2025

Synesthesia of Machines (SoM)-Aided Online FDD Precoding via Heterogeneous Multi-Modal Sensing: A Vertical Federated Learning Approach

Jun 09, 2025

SIGMA: Refining Large Language Model Reasoning via Sibling-Guided Monte Carlo Augmentation

Jun 06, 2025

Rendering-Aware Reinforcement Learning for Vector Graphics Generation

May 27, 2025

GENMO: A GENeralist Model for Human MOtion

May 02, 2025