Xiaodong Zeng

MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

Nov 20, 2023
Yiming Wang, Yu Lin, Xiaodong Zeng, Guannan Zhang

LoRA achieves remarkable resource efficiency and comparable performance when adapting LLMs for specific tasks. Since ChatGPT demonstrated superior performance on various tasks, there has been a growing desire to adapt one model for all tasks. However, the explicit low rank of LoRA limits adaptation performance in complex multi-task scenarios. LoRA is dominated by a small number of top singular vectors, while fine-tuning decomposes into a set of less important unitary transforms. In this paper, we propose MultiLoRA for better multi-task adaptation by reducing the dominance of top singular vectors observed in LoRA. MultiLoRA scales LoRA modules horizontally and changes the parameter initialization of the adaptation matrices to reduce parameter dependency, thus yielding more balanced unitary subspaces. For the first time, we construct specialized training data by mixing datasets for instruction following, natural language understanding, and world knowledge, to cover semantically and syntactically different samples. With only 2.5% additional parameters, MultiLoRA outperforms single LoRA counterparts and fine-tuning on multiple benchmarks and model scales. Further investigation into the weight update matrices of MultiLoRA shows reduced dependency on top singular vectors and more democratic unitary transform contributions.
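
As a rough illustration of the idea in this abstract, the sketch below reads "scaling horizontally" as running several low-rank modules in parallel and summing their updates; that reading, the random initialization of both factors, and the function name `multilora_delta` are assumptions made for illustration, not details taken from the paper. It only shows that the summed update can reach rank n·r where a single module is limited to rank r.

```python
import numpy as np

def multilora_delta(d_out, d_in, rank, n_modules, seed=0):
    """Sum the weight updates of n parallel low-rank modules (illustrative only)."""
    rng = np.random.default_rng(seed)
    delta = np.zeros((d_out, d_in))
    for _ in range(n_modules):
        # Assumed initialization: both factors small and random.
        # The abstract does not specify the actual scheme.
        A = rng.normal(scale=0.02, size=(rank, d_in))
        B = rng.normal(scale=0.02, size=(d_out, rank))
        delta += B @ A  # each module contributes a rank-`rank` update
    return delta

single = multilora_delta(16, 16, rank=2, n_modules=1)
multi = multilora_delta(16, 16, rank=2, n_modules=3)

# Numerical rank of each update matrix.
rank_single = np.linalg.matrix_rank(single)
rank_multi = np.linalg.matrix_rank(multi)
print(rank_single, rank_multi)
```

Comparing the singular values of `single` and `multi` (for example with `np.linalg.svd(..., compute_uv=False)`) is one way to inspect how evenly the update is spread across directions.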

On the Opportunities of Green Computing: A Survey

Nov 09, 2023
You Zhou, Xiujing Lin, Xiang Zhang, Maolin Wang, Gangwei Jiang, Huakang Lu, Yupeng Wu, Kai Zhang, Zhe Yang, Kehang Wang, Yongduo Sui, Fengwei Jia, Zuoli Tang, Yao Zhao, Hongxuan Zhang, Tiannuo Yang, Weibo Chen, Yunong Mao, Yi Li, De Bao, Yu Li, Hongrui Liao, Ting Liu, Jingwen Liu, Jinchi Guo, Xiangyu Zhao, Ying Wei, Hong Qian, Qi Liu, Xiang Wang, Wai Kin Chan, Chenliang Li, Yusen Li, Shiyu Yang, Jining Yan, Chao Mou, Shuai Han, Wuxia Jin, Guannan Zhang, Xiaodong Zeng

Artificial Intelligence (AI) has advanced significantly in both technology and research over several decades of development, and is widely used in many areas, including computer vision, natural language processing, time-series analysis, and speech synthesis. In the age of deep learning, and especially with the rise of Large Language Models, most researchers' attention has gone to pursuing new state-of-the-art (SOTA) results, leading to ever-increasing model size and computational complexity. The need for high computing power brings higher carbon emissions and undermines research fairness by preventing small or medium-sized research institutions and companies with limited funding from participating in research. To tackle the challenges of computing resources and the environmental impact of AI, Green Computing has become a hot research topic. In this survey, we give a systematic overview of the technologies used in Green Computing. We propose a framework for Green Computing and divide it into four key components: (1) Measures of Greenness, (2) Energy-Efficient AI, (3) Energy-Efficient Computing Systems, and (4) AI Use Cases for Sustainability. For each component, we discuss the research progress made and the techniques commonly used to optimize AI efficiency. We conclude that this new research direction has the potential to address the conflicts between resource constraints and AI development. We encourage more researchers to pay attention to this direction and make AI more environmentally friendly.

* 113 pages, 18 figures 

Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning

Sep 06, 2023
Tianchi Cai, Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Xierui Song, Li Yu, Lihong Gu, Xiaodong Zeng, Jinjie Gu, Guannan Zhang

We study the budget allocation problem in online marketing campaigns that utilize previously collected offline data. We first discuss the long-term effect of optimizing marketing budget allocation decisions in the offline setting. To overcome the challenge, we propose a novel game-theoretic offline value-based reinforcement learning method using mixed policies. Whereas previous methods need to store infinitely many policies, the proposed method stores only a constant number, which achieves nearly optimal policy efficiency and makes it practical and favorable for industrial usage. We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation. Our experiments on a large-scale marketing campaign with tens of millions of users and a budget of more than one billion verify the theoretical results and show that the proposed method outperforms various baseline methods. The proposed method has been successfully deployed to serve all the traffic of this marketing campaign.

* WSDM 23, Best Paper Candidate 

Adversarial Learning for Incentive Optimization in Mobile Payment Marketing

Dec 28, 2021
Xuanying Chen, Zhining Liu, Li Yu, Sen Li, Lihong Gu, Xiaodong Zeng, Yize Tan, Jinjie Gu

Many payment platforms hold large-scale marketing campaigns, which allocate incentives to encourage users to pay through their applications. To maximize the return on investment, incentive allocations are commonly solved in a two-stage procedure. After training a response estimation model to estimate the users' mobile payment probabilities (MPP), a linear programming process is applied to obtain the optimal incentive allocation. However, the large amount of biased data in the training set, generated by the previous biased allocation policy, causes a biased estimation. This bias deteriorates the performance of the response model and misleads the linear programming process, dramatically degrading the performance of the resulting allocation policy. To overcome this obstacle, we propose a bias correction adversarial network. Our method leverages the small set of unbiased data obtained under a full-randomized allocation policy to train an unbiased model and then uses it to reduce the bias with adversarial learning. Offline and online experimental results demonstrate that our method outperforms state-of-the-art approaches and significantly improves the performance of the resulting allocation policy in a real-world marketing campaign.
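
The second stage of the two-stage procedure described above can be sketched under strong simplifying assumptions. The abstract does not give the actual linear program, so the example assumes a single budget constraint and at most one incentive per user, with fractional allocations allowed. For that simplified LP (a fractional knapsack), a greedy pass in order of uplift per unit cost gives the exact optimum, so no LP solver is used here. The function name, the uplift values, and the costs are all hypothetical.

```python
def allocate_incentives(uplift, cost, budget):
    """Greedy optimum of the simplified LP:
    maximize sum(uplift[i] * x[i])
    subject to sum(cost[i] * x[i]) <= budget and 0 <= x[i] <= 1.
    Assumes every cost is positive and every uplift is non-negative.
    """
    # Visit users in decreasing order of estimated uplift per unit cost.
    order = sorted(range(len(uplift)),
                   key=lambda i: uplift[i] / cost[i],
                   reverse=True)
    x = [0.0] * len(uplift)
    remaining = budget
    for i in order:
        if remaining <= 0:
            break
        # Fund the user fully if possible, otherwise fractionally.
        x[i] = min(1.0, remaining / cost[i])
        remaining -= x[i] * cost[i]
    return x

# Hypothetical stage-one outputs: estimated gain in payment probability
# if the user receives the incentive, and the cost of that incentive.
uplift = [0.30, 0.10, 0.20]
cost = [2.0, 1.0, 4.0]
allocation = allocate_incentives(uplift, cost, budget=4.0)
print(allocation)
```

Because the allocation depends directly on the estimated uplifts, any bias in the stage-one response model carries over into the ranking, which is the failure mode the abstract sets out to address.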

* Accepted by the 30th ACM International Conference on Information & Knowledge Management (CIKM 2021) 

A Policy Efficient Reduction Approach to Convex Constrained Deep Reinforcement Learning

Aug 29, 2021
Tianchi Cai, Wenpeng Zhang, Lihong Gu, Xiaodong Zeng, Jinjie Gu

Although well-established in general reinforcement learning (RL), value-based methods are rarely explored in constrained RL (CRL) because they cannot find policies that randomize among multiple actions. To apply value-based methods to CRL, a recent groundbreaking line of game-theoretic approaches uses a mixed policy that randomizes among a set of carefully generated policies to converge to the desired constraint-satisfying policy. However, these approaches require storing a large set of policies, which is not policy efficient and may incur prohibitive memory costs in constrained deep RL. To address this problem, we propose an alternative approach. Our approach first reformulates the CRL problem as an equivalent distance optimization problem. With a specially designed linear optimization oracle, we derive a meta-algorithm that solves it using any off-the-shelf RL algorithm and any conditional gradient (CG) type algorithm as subroutines. We then propose a new variant of the CG-type algorithm, which generalizes the minimum norm point (MNP) method. The proposed method matches the convergence rate of the existing game-theoretic approaches and achieves the worst-case optimal policy efficiency. Experiments on a navigation task show that our method reduces memory costs by an order of magnitude while achieving better performance, demonstrating both its effectiveness and efficiency.
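
To make the distance-optimization view more concrete, here is a minimal sketch of a plain conditional gradient (Frank-Wolfe) loop, not the paper's MNP-style variant. It assumes each candidate policy is summarized by a fixed measurement vector (a row of `vertices`), that the constraint set is a single `target` point, and that the linear optimization oracle is a simple argmin over those rows. In the setting of the abstract, the oracle would instead be an RL algorithm. All names are hypothetical.

```python
import numpy as np

def frank_wolfe_mixture(vertices, target, n_iters=2000):
    """Minimize ||w @ V - target||^2 over mixture weights w on the simplex."""
    V = np.asarray(vertices, dtype=float)
    target = np.asarray(target, dtype=float)
    weights = np.zeros(len(V))
    weights[0] = 1.0           # start from the first candidate policy
    x = V[0].copy()            # measurement vector of the current mixture
    for t in range(n_iters):
        grad = 2.0 * (x - target)
        # Linear optimization oracle: the candidate most aligned with -grad.
        idx = int(np.argmin(V @ grad))
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        weights *= (1.0 - gamma)
        weights[idx] += gamma
        x = (1.0 - gamma) * x + gamma * V[idx]
    return weights, x

# Three hypothetical policies, each described by a 2-D measurement vector.
V = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [0.25, 0.25]
weights, x = frank_wolfe_mixture(V, target)
print(weights, x)
```

The returned `weights` play the role of a mixed policy: sampling policy `i` with probability `weights[i]` yields the mixture's measurement vector in expectation. This sketch keeps every candidate in memory, so it does not demonstrate the policy efficiency that the abstract claims for its own method.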

* Reinforcement Learning for Real Life (RL4RealLife) Workshop in the 38th International Conference on Machine Learning, 2021 

LinkLouvain: Link-Aware A/B Testing and Its Application on Online Marketing Campaign

Feb 03, 2021
Tianchi Cai, Daxi Cheng, Chen Liang, Ziqi Liu, Lihong Gu, Huizhi Xie, Zhiqiang Zhang, Xiaodong Zeng, Jinjie Gu

Many online marketing campaigns aim to promote user interaction. The average treatment effect (ATE) of campaign strategies needs to be monitored throughout the campaign. A/B testing is usually conducted for this purpose, but user interaction can introduce interference into standard A/B testing. With the help of link prediction, we design a network A/B testing method, LinkLouvain, that minimizes graph interference and gives an accurate and sound estimate of the campaign's ATE. In this paper, we analyze the network A/B testing problem in a real-world online marketing campaign, describe our proposed LinkLouvain method, and evaluate it on real-world data. Our method significantly outperforms the compared methods and is deployed in the online marketing campaign.
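
The general idea behind network A/B testing can be sketched as cluster-level randomization: users who are likely to interact are grouped, and whole groups are assigned to one arm so that treated and control users rarely influence each other. The sketch below assumes the clusters are already given and does not implement LinkLouvain's link prediction or clustering. The difference-in-means estimator and all names are likewise assumptions made for illustration.

```python
import random

def assign_by_cluster(cluster_of, seed=0):
    """Assign every user in a cluster to the same arm (True = treated)."""
    rng = random.Random(seed)
    arms = {c: rng.random() < 0.5 for c in sorted(set(cluster_of.values()))}
    return {user: arms[c] for user, c in cluster_of.items()}

def estimate_ate(outcome, treated):
    """Difference in mean outcome between treated and control users."""
    t = [outcome[u] for u in outcome if treated[u]]
    c = [outcome[u] for u in outcome if not treated[u]]
    if not t or not c:
        raise ValueError("both arms need at least one user")
    return sum(t) / len(t) - sum(c) / len(c)

# Hypothetical clusters: users a and b interact, as do users c and d.
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}
treated = assign_by_cluster(cluster_of)

# Hypothetical outcomes under a fixed assignment, to show the estimator.
fixed_assignment = {"a": True, "b": True, "c": False, "d": False}
outcome = {"a": 1.0, "b": 3.0, "c": 0.0, "d": 2.0}
ate = estimate_ate(outcome, fixed_assignment)
print(treated, ate)
```

With only a few clusters, a random draw can place every cluster in the same arm, in which case `estimate_ate` raises an error. A real experiment would use many clusters or a balanced assignment.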

* Accepted by the Industrial & Practitioner Track of the 26th International Conference on Database Systems for Advanced Applications (DASFAA 2021) 