Da-shan Shiu

Revisiting the Shape Convention of Transformer Language Models

Feb 06, 2026

Rethinking the shape convention of an MLP

Oct 02, 2025

Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

May 29, 2025

Towards a Foundation Model for Communication Systems

May 20, 2025

Latent Flow Transformer

May 20, 2025

Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity

May 16, 2025

Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation

Dec 02, 2024

Exact, Tractable Gauss-Newton Optimization in Deep Reversible Architectures Reveal Poor Generalization

Nov 13, 2024

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition

May 23, 2024

Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite

Oct 02, 2023