Zhaoran Liu

Mixture of Low Rank Adaptation with Partial Parameter Sharing for Time Series Forecasting

May 23, 2025

Licheng Pan, Zhichao Chen, Haoxuan Li, Guangyi Liu, Zhijian Xu, Zhaoran Liu, Hao Wang, Ying Wei

Abstract: Multi-task forecasting has become the standard approach for time-series forecasting (TSF). However, we show that it suffers from an Expressiveness Bottleneck, where predictions at different time steps share the same representation, leading to unavoidable errors even with optimal representations. To address this issue, we propose a two-stage framework: first, pre-train a foundation model for one-step-ahead prediction; then, adapt it using step-specific LoRA modules. This design enables the foundation model to handle any number of forecast steps while avoiding the expressiveness bottleneck. We further introduce the Mixture-of-LoRA (MoLA) model, which employs adaptively weighted LoRA experts to achieve partial parameter sharing across steps. This approach enhances both efficiency and forecasting performance by exploiting interdependencies between forecast steps. Experiments show that MoLA significantly improves model expressiveness and outperforms state-of-the-art time-series forecasting methods. Code is available at https://anonymous.4open.science/r/MoLA-BC92.
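
The idea of adaptively weighted LoRA experts can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax gate, and the toy matrices are all assumptions made for illustration. Each forecast step mixes the same pool of low-rank updates B_k A_k with its own gate weights, so steps share expert parameters while still receiving different effective weights.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mola_weight(base, experts, gate_logits):
    """Return base + sum_k g_k * (B_k @ A_k) for one forecast step.

    base: frozen weight matrix (hypothetical pre-trained layer).
    experts: list of (B, A) low-rank factor pairs shared by all steps.
    gate_logits: step-specific logits (assumed gate parameterization).
    """
    gates = softmax(gate_logits)
    out = [row[:] for row in base]
    for g, (B, A) in zip(gates, experts):
        delta = matmul(B, A)
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                out[i][j] += g * v
    return out

# Toy setup: a 2x2 frozen weight and two rank-1 experts.
base = [[1.0, 0.0], [0.0, 1.0]]
experts = [
    ([[1.0], [0.0]], [[0.0, 2.0]]),  # B is 2x1, A is 1x2
    ([[0.0], [1.0]], [[4.0, 0.0]]),
]
# Hypothetical gate logits for forecast steps 1 and 2.
step_gate_logits = {1: [0.0, 0.0], 2: [5.0, -5.0]}
weights_per_step = {h: mola_weight(base, experts, g)
                    for h, g in step_gate_logits.items()}
```

Here step 1 weights both experts equally, while step 2 relies almost entirely on the first expert; only the gate logits differ between steps.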

Proximity Matters: Local Proximity Preserved Balancing for Treatment Effect Estimation

Jul 01, 2024

Hao Wang, Zhichao Chen, Yuan Shen, Jiajun Fan, Zhaoran Liu, Degui Yang, Xinggao Liu, Haoxuan Li

Abstract:Heterogeneous treatment effect (HTE) estimation from observational data poses significant challenges due to treatment selection bias. Existing methods address this bias by minimizing distribution discrepancies between treatment groups in latent space, focusing on global alignment. However, the fruitful aspect of local proximity, where similar units exhibit similar outcomes, is often overlooked. In this study, we propose Proximity-aware Counterfactual Regression (PCR) to exploit proximity for representation balancing within the HTE estimation context. Specifically, we introduce a local proximity preservation regularizer based on optimal transport to depict the local proximity in discrepancy calculation. Furthermore, to overcome the curse of dimensionality that renders the estimation of discrepancy ineffective, exacerbated by limited data availability for HTE estimation, we develop an informative subspace projector, which trades off minimal distance precision for improved sample complexity. Extensive experiments demonstrate that PCR accurately matches units across different treatment groups, effectively mitigates treatment selection bias, and significantly outperforms competitors. Code is available at https://anonymous.4open.science/status/ncr-B697.
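
To make the notion of an optimal-transport discrepancy between treatment groups concrete, the sketch below computes an entropy-regularized transport cost with Sinkhorn iterations. Sinkhorn is a stand-in chosen here for illustration; the abstract does not say which solver PCR uses, and the local proximity regularizer and subspace projector are not reproduced. The function name, the squared Euclidean cost, and the toy representations are assumptions.

```python
import math

def sinkhorn_cost(xs, ys, eps=0.5, iters=200):
    """Entropic OT between two point sets with uniform weights.

    Returns (transport cost, transport plan). Uses squared Euclidean
    distance as the ground cost (an assumption for this sketch).
    """
    n, m = len(xs), len(ys)
    cost = [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    a = [1.0 / n] * n  # uniform mass on the first group
    b = [1.0 / m] * m  # uniform mass on the second group
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan

# Hypothetical latent representations of treated and control units.
treated = [[0.0, 0.0], [1.0, 0.0]]
control_near = [[0.0, 0.0], [1.0, 0.0]]
control_far = [[3.0, 0.0], [4.0, 0.0]]

near_cost, near_plan = sinkhorn_cost(treated, control_near)
far_cost, far_plan = sinkhorn_cost(treated, control_far)
```

A balanced representation would drive this cost down; the well-aligned groups above give a smaller cost than the shifted ones.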

Modeling Task Relationships in Multi-variate Soft Sensor with Balanced Mixture-of-Experts

May 25, 2023

Yuxin Huang, Hao Wang, Zhaoran Liu, Licheng Pan, Haozhe Li, Xinggao Liu

Abstract: Accurate estimation of multiple quality variables is critical for building industrial soft sensor models, which have long been confronted with data efficiency and negative transfer issues. Methods sharing backbone parameters among tasks address the data efficiency issue; however, they still fail to mitigate the negative transfer problem. To address this issue, a balanced Mixture-of-Experts (BMoE) is proposed in this work, which consists of a multi-gate mixture of experts (MMoE) module and a task gradient balancing (TGB) module. The MMoE module aims to portray task relationships, while the TGB module balances the gradients among tasks dynamically. Together, they mitigate the negative transfer problem. Experiments on a typical sulfur recovery unit demonstrate that BMoE models task relationships and balances the training process effectively, and achieves significantly better performance than baseline models.
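
The multi-gate mixture-of-experts structure mentioned above can be sketched in a few lines. This is a generic MMoE forward pass under simplifying assumptions (linear experts, linear gates, hypothetical weights), not the BMoE model itself; the TGB module is omitted because the abstract does not describe how it balances gradients.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, x):
    """Apply a weight matrix (list of rows) to an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mmoe_forward(x, expert_weights, gate_weights):
    """Return one mixed representation per task.

    expert_weights: one matrix per shared expert.
    gate_weights: one matrix per task, mapping x to a logit per expert.
    """
    expert_outs = [linear(w, x) for w in expert_weights]
    dim = len(expert_outs[0])
    task_outputs = []
    for gate in gate_weights:
        gates = softmax(linear(gate, x))  # task-specific expert weights
        mixed = [sum(g * out[d] for g, out in zip(gates, expert_outs))
                 for d in range(dim)]
        task_outputs.append(mixed)
    return task_outputs

# Toy input with two shared experts and two tasks (all values hypothetical).
x = [1.0, 2.0]
expert_weights = [[[1.0, 0.0]], [[0.0, 1.0]]]  # expert outputs: [1.0], [2.0]
gate_weights = [
    [[0.0, 0.0], [0.0, 0.0]],    # task A: equal weight on both experts
    [[0.0, 0.0], [10.0, 10.0]],  # task B: strongly prefers expert 2
]
outputs = mmoe_forward(x, expert_weights, gate_weights)
```

Because each task has its own gate, related tasks can lean on the same experts while conflicting tasks route to different ones.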

AttentionMixer: An Accurate and Interpretable Framework for Process Monitoring

Feb 21, 2023

Hao Wang, Zhiyu Wang, Yunlong Niu, Zhaoran Liu, Haozhe Li, Yilin Liao, Yuxin Huang, Xinggao Liu

Abstract: An accurate and explainable automatic monitoring system is critical for the safety of high-efficiency energy conversion plants that operate under extreme working conditions. Nonetheless, currently available data-driven monitoring systems often fall short of the requirements for either high accuracy or interpretability, which hinders their application in practice. To overcome this limitation, a data-driven approach, AttentionMixer, is proposed under a generalized message passing framework, with the goal of establishing an accurate and interpretable radiation monitoring framework for energy conversion plants. To improve the model accuracy, the first technical contribution involves the development of spatial and temporal adaptive message passing blocks, which enable the capture of spatial and temporal correlations, respectively; the two blocks are cascaded through a mixing operator. To enhance the model interpretability, the second technical contribution involves the implementation of a sparse message passing regularizer, which eliminates spurious and noisy message passing routes. The effectiveness of the AttentionMixer approach is validated through extensive evaluations on a monitoring benchmark collected from the national radiation monitoring network for nuclear power plants, resulting in enhanced monitoring accuracy and interpretability in practice.

Towards Relation Extraction From Speech

Oct 17, 2022

Tongtong Wu, Guitao Wang, Jinming Zhao, Zhaoran Liu, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari

Abstract: Relation extraction typically aims to extract semantic relationships between entities from unstructured text. One of the most essential data sources for relation extraction is spoken language, such as interviews and dialogues. However, the error propagation introduced in automatic speech recognition (ASR) has been ignored in relation extraction, and the end-to-end speech-based relation extraction method has rarely been explored. In this paper, we propose a new listening information extraction task, i.e., speech relation extraction. We construct the training dataset for speech relation extraction via text-to-speech systems, and we construct the testing dataset via crowd-sourcing with native English speakers. We explore speech relation extraction via two approaches: the pipeline approach, which conducts text-based extraction with a pretrained ASR module, and the end-to-end approach via a newly proposed encoder-decoder model, which we call SpeechRE. We conduct comprehensive experiments to distinguish the challenges in speech relation extraction, which may shed light on future explorations. We share the code and data at https://github.com/wutong8023/SpeechRE.

* Accepted by EMNLP 2022

Analyze and Design Network Architectures by Recursion Formulas

Aug 18, 2021

Yilin Liao, Hao Wang, Zhaoran Liu, Haozhe Li, Xinggao Liu

Abstract: The effectiveness of shortcut/skip-connection has been widely verified, which has inspired extensive exploration of neural architecture design. This work attempts to find an effective way to design new network architectures. It is discovered that the main difference between network architectures can be reflected in their recursion formulas. Based on this, a methodology is proposed to design novel network architectures from the perspective of mathematical formulas. Afterwards, a case study is provided to generate an improved architecture based on ResNet. Furthermore, the new architecture is compared with ResNet and then tested on ResNet-based networks. Extensive experiments on CIFAR and ImageNet show significant performance improvements from the new architecture.
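
As a point of reference for the recursion-formula view, the standard recursions for a plain feed-forward stack and for ResNet can be written as below. The symbols x_k (the output of layer k) and F_k (that layer's transformation) are notation assumed here, and the improved architecture proposed in the paper is not reproduced.

```latex
\begin{align}
  \text{plain network:} \quad & x_{k+1} = F_k(x_k) \\
  \text{ResNet:} \quad & x_{k+1} = x_k + F_k(x_k)
  \quad\Longrightarrow\quad
  x_K = x_0 + \sum_{k=0}^{K-1} F_k(x_k)
\end{align}
```

Unrolling the ResNet recursion shows that the final output is the input plus the sum of all residual branches, which is the kind of structural difference a recursion formula exposes.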

* It is hoped that the new network architecture is derived according to a specific purpose
