Ang Lv

The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

May 28, 2025

Divide-Fuse-Conquer: Eliciting "Aha Moments" in Multi-Scenario Games

May 22, 2025

More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives

Jan 07, 2025

More Expressive Attention with Negative Weights

Nov 14, 2024

HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation

Oct 28, 2024

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

Sep 29, 2024

Language Models "Grok" to Copy

Sep 14, 2024

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules

Jul 09, 2024

Mixture of In-Context Experts Enhance LLMs' Long Context Awareness

Jun 28, 2024

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

Apr 09, 2024