Tuo Zhao

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs

Feb 26, 2025

Discriminative Finetuning of Generative Large Language Models without Reward Models and Preference Data

Feb 25, 2025

Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks

Oct 12, 2024

Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering

Sep 16, 2024

Robust Reinforcement Learning from Corrupted Human Feedback

Jun 21, 2024

RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning

Jun 16, 2024

Adaptive Preference Scaling for Reinforcement Learning with Human Feedback

Jun 04, 2024

To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO

Apr 06, 2024

Stochastic Constrained Decentralized Optimization for Machine Learning with Fewer Data Oracles: a Gradient Sliding Approach

Apr 03, 2024

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

Mar 11, 2024