Yitian Zhang

Stephen

The Indra Representation Hypothesis for Multimodal Alignment

Apr 06, 2026
Via arXiv

Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks

Feb 27, 2026
Via arXiv

Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning

Feb 10, 2026
Via arXiv

CompSRT: Quantization and Pruning for Image Super Resolution Transformers

Jan 28, 2026
Via arXiv

MTS-DMAE: Dual-Masked Autoencoder for Unsupervised Multivariate Time Series Representation Learning

Sep 19, 2025
Via arXiv

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

Jun 17, 2025
Via arXiv

S-Crescendo: A Nested Transformer Weaving Framework for Scalable Nonlinear System in S-Domain Representation

May 17, 2025
Via arXiv

Fusing Global and Local: Transformer-CNN Synergy for Next-Gen Current Estimation

Apr 08, 2025
Via arXiv

GmNet: Revisiting Gating Mechanisms From A Frequency View

Mar 28, 2025
Via arXiv

REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder

Mar 11, 2025
Via arXiv