Kun Yuan

Breaking Memory Limits: Gradient Wavelet Transform Enhances LLMs Training

Jan 13, 2025
Via arXiv

OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining

Nov 23, 2024
Via arXiv

SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization

Nov 21, 2024
Via arXiv

Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results

Oct 21, 2024
Via arXiv

Subspace Optimization for Large Language Models with Convergence Guarantees

Oct 15, 2024
Via arXiv

Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures

Oct 10, 2024
Via arXiv

S$^3$Attention: Improving Long Sequence Attention with Smoothed Skeleton Sketching

Aug 16, 2024
Via arXiv

QPT V2: Masked Image Modeling Advances Visual Scoring

Jul 23, 2024
Via arXiv

On the Trade-off between Flatness and Optimization in Distributed Learning

Jun 28, 2024
Via arXiv

PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild

May 28, 2024
Via arXiv