Ziyang Chen

Enjoying Information Dividend: Gaze Track-based Medical Weakly Supervised Segmentation

May 28, 2025
Via arXiv

VISTA: Enhancing Vision-Text Alignment in MLLMs via Cross-Modal Mutual Information Maximization

May 19, 2025
Via arXiv

Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding

Mar 17, 2025
Via arXiv

A Survey of fMRI to Image Reconstruction

Feb 24, 2025
Via arXiv

From Visuals to Vocabulary: Establishing Equivalence Between Image and Text Token Through Autoregressive Pre-training in MLLMs

Feb 13, 2025
Via arXiv

Test Time Training for 4D Medical Image Interpolation

Feb 04, 2025
Via arXiv

GPS as a Control Signal for Image Generation

Jan 21, 2025
Via arXiv

Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding

Jan 19, 2025
Via arXiv

Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer

Jan 02, 2025
Via arXiv

Meta Curvature-Aware Minimization for Domain Generalization

Dec 16, 2024
Via arXiv