Yue Yu

Transformer learns the cross-task prior and regularization for in-context learning

May 17, 2025
Via arXiv

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

May 12, 2025
Via arXiv

Monotone Peridynamic Neural Operator for Nonlinear Material Modeling with Conditionally Unique Solutions

May 02, 2025
Via arXiv

Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration

Apr 07, 2025
Via arXiv

RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation

Apr 04, 2025
Via arXiv

Revealing the Implicit Noise-based Imprint of Generative Models

Mar 12, 2025
Via arXiv

An optimal Petrov-Galerkin framework for operator networks

Mar 06, 2025
Via arXiv

Accurate Expert Predictions in MoE Inference via Cross-Layer Gate

Feb 17, 2025
Via arXiv

Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline

Feb 09, 2025
Via arXiv

Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data

Jan 19, 2025
Via arXiv