Yan Sun

Foundations of Top-$k$ Decoding For Language Models

May 25, 2025

Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines

May 21, 2025

An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models

May 13, 2025

Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning

Apr 22, 2025

A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models

Feb 22, 2025

Trustworthy Evaluation of Generative AI Models

Jan 31, 2025

TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs

Jan 31, 2025

MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability

Jan 30, 2025

Unsupervised Domain Adaptation with Dynamic Clustering and Contrastive Refinement for Gait Recognition

Jan 28, 2025

FGATT: A Robust Framework for Wireless Data Imputation Using Fuzzy Graph Attention Networks and Transformer Encoders

Dec 02, 2024