Jingze Shi

TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding

Jun 12, 2025

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting

May 26, 2025

Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

Dec 16, 2024

Cheems: Wonderful Matrices More Efficient and More Effective Architecture

Jul 25, 2024

OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser

Jun 25, 2024