Hideaki Iiduka

Optimal Growth Schedules for Batch Size and Learning Rate in SGD that Reduce SFO Complexity

Aug 07, 2025

Adaptive Batch Size and Learning Rate Scheduler for Stochastic Gradient Descent Based on Minimization of Stochastic First-order Oracle Complexity

Aug 07, 2025
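
The two entries above both take stochastic first-order oracle (SFO) complexity as the quantity to minimize. For reference, the standard notion (assumed notation here; the papers' exact definitions may differ in details) counts the total number of stochastic gradient evaluations over a run:

```latex
% SFO complexity of mini-batch SGD (assumed notation):
% T = number of iterations, b_t = batch size used at iteration t.
\mathrm{SFO}(T) = \sum_{t=1}^{T} b_t
\qquad \text{(constant batch size } b:\ \mathrm{SFO}(T) = T b\text{)}
```

Larger batches reduce gradient noise and can cut the number of iterations T, but raise the per-iteration cost b_t; the schedules studied in these papers aim to balance the two so that the total count stays small.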

Analysis of Muon's Convergence and Critical Batch Size

Jul 02, 2025

Faster Convergence of Riemannian Stochastic Gradient Descent with Increasing Batch Size

Jan 30, 2025

Increasing Batch Size Improves Convergence of Stochastic Gradient Descent with Momentum

Jan 15, 2025

Explicit and Implicit Graduated Optimization in Deep Neural Networks

Dec 16, 2024

Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks

Dec 16, 2024

Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate

Sep 16, 2024

Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent

Sep 13, 2024

Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates

Feb 23, 2024
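
A recurring theme across these entries is growing the batch size, sometimes together with the learning rate, as SGD training progresses. The sketch below illustrates that general idea on a toy least-squares problem; the doubling schedule, the constants, and the objective are assumptions made purely for illustration, not a reproduction of any paper's method.

```python
# Illustrative sketch only: a generic SGD loop in which the batch size is
# doubled at fixed intervals (and the learning rate grown alongside it).
# All constants below (double_every, initial b and lr, the caps) are
# assumptions for this toy example.
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares objective: f(w) = (1/2n) * ||X w - y||^2
n, d = 10_000, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
b, lr = 16, 0.01          # initial batch size and learning rate (assumed)
double_every = 200        # steps between batch-size doublings (assumed)

for step in range(1, 1001):
    idx = rng.choice(n, size=b, replace=False)    # sample a mini-batch
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / b   # mini-batch gradient
    w -= lr * grad                                # SGD update
    if step % double_every == 0:
        b = min(2 * b, n)       # grow the batch size...
        lr = min(2 * lr, 0.1)   # ...and (optionally) the learning rate
        print(f"step {step}: batch size -> {b}, lr -> {lr:.3f}")

print("final loss:", 0.5 * np.mean((X @ w - y) ** 2))
```

In this toy setting, the early small batches give cheap but noisy steps, while the later large batches (with a proportionally larger learning rate) reduce gradient noise near the solution; the papers above analyze when such schedules provably accelerate convergence or reduce SFO complexity.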