Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gang Wang

the State Key Lab of Intelligent Control and Decision of Complex Systems and the School of Automation, Beijing Institute of Technology, Beijing, China, Beijing Institute of Technology Chongqing Innovation Center, Chongqing, China

GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation

Feb 06, 2024

Ali Khajegili Mirabadi, Graham Archibald, Amirali Darbandsari, Alberto Contreras-Sanz, Ramin Ebrahim Nakhli, Maryam Asadi, Allen Zhang, C. Blake Gilks, Peter Black, Gang Wang(+2 more)

Figure 1 for GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation

Figure 2 for GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation

Figure 3 for GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation

Figure 4 for GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation

Abstract:Cancer subtyping is one of the most challenging tasks in digital pathology, where Multiple Instance Learning (MIL) by processing gigapixel whole slide images (WSIs) has been in the spotlight of recent research. However, MIL approaches do not take advantage of inter- and intra-magnification information contained in WSIs. In this work, we present GRASP, a novel graph-structured multi-magnification framework for processing WSIs in digital pathology. Our approach is designed to dynamically emulate the pathologist's behavior in handling WSIs and benefits from the hierarchical structure of WSIs. GRASP, which introduces a convergence-based node aggregation instead of traditional pooling mechanisms, outperforms state-of-the-art methods over two distinct cancer datasets by a margin of up to 10% balanced accuracy, while being 7 times smaller than the closest-performing state-of-the-art model in terms of the number of parameters. Our results show that GRASP is dynamic in finding and consulting with different magnifications for subtyping cancers and is reliable and stable across different hyperparameters. The model's behavior has been evaluated by two expert pathologists confirming the interpretability of the model's dynamic. We also provide a theoretical foundation, along with empirical evidence, for our work, explaining how GRASP interacts with different magnifications and nodes in the graph to make predictions. We believe that the strong characteristics yet simple structure of GRASP will encourage the development of interpretable, structure-based designs for WSI representation in digital pathology. Furthermore, we publish two large graph datasets of rare Ovarian and Bladder cancers to contribute to the field.

* Early version: To be updated

Via

Access Paper or Ask Questions

Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

Feb 01, 2024

Qun Ma, Xiao Xue, Deyu Zhou, Xiangning Yu, Donghua Liu, Xuwen Zhang, Zihan Zhao, Yifan Shen, Peilin Ji, Juanjuan Li(+2 more)

Figure 1 for Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

Figure 2 for Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

Figure 3 for Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

Figure 4 for Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

Abstract:Computational experiments have emerged as a valuable method for studying complex systems, involving the algorithmization of counterfactuals. However, accurately representing real social systems in Agent-based Modeling (ABM) is challenging due to the diverse and intricate characteristics of humans, including bounded rationality and heterogeneity. To address this limitation, the integration of Large Language Models (LLMs) has been proposed, enabling agents to possess anthropomorphic abilities such as complex reasoning and autonomous learning. These agents, known as LLM-based Agent, offer the potential to enhance the anthropomorphism lacking in ABM. Nonetheless, the absence of explicit explainability in LLMs significantly hinders their application in the social sciences. Conversely, computational experiments excel in providing causal analysis of individual behaviors and complex phenomena. Thus, combining computational experiments with LLM-based Agent holds substantial research potential. This paper aims to present a comprehensive exploration of this fusion. Primarily, it outlines the historical development of agent structures and their evolution into artificial societies, emphasizing their importance in computational experiments. Then it elucidates the advantages that computational experiments and LLM-based Agents offer each other, considering the perspectives of LLM-based Agent for computational experiments and vice versa. Finally, this paper addresses the challenges and future trends in this research domain, offering guidance for subsequent related studies.

Via

Access Paper or Ask Questions

Learning Hybrid Policies for MPC with Application to Drone Flight in Unknown Dynamic Environments

Jan 25, 2024

Zhaohan Feng, Jie Chen, Wei Xiao, Jian Sun, Bin Xin, Gang Wang

Abstract:In recent years, drones have found increased applications in a wide array of real-world tasks. Model predictive control (MPC) has emerged as a practical method for drone flight control, owing to its robustness against modeling errors/uncertainties and external disturbances. However, MPC's sensitivity to manually tuned parameters can lead to rapid performance degradation when faced with unknown environmental dynamics. This paper addresses the challenge of controlling a drone as it traverses a swinging gate characterized by unknown dynamics. This paper introduces a parameterized MPC approach named hyMPC that leverages high-level decision variables to adapt to uncertain environmental conditions. To derive these decision variables, a novel policy search framework aimed at training a high-level Gaussian policy is presented. Subsequently, we harness the power of neural network policies, trained on data gathered through the repeated execution of the Gaussian policy, to provide real-time decision variables. The effectiveness of hyMPC is validated through numerical simulations, achieving a 100\% success rate in 20 drone flight tests traversing a swinging gate, demonstrating its capability to achieve safe and precise flight with limited prior knowledge of environmental dynamics.

* To be published in Unmanned Systems

Via

Access Paper or Ask Questions

DDistill-SR: Reparameterized Dynamic Distillation Network for Lightweight Image Super-Resolution

Dec 22, 2023

Yan Wang, Tongtong Su, Yusen Li, Jiuwen Cao, Gang Wang, Xiaoguang Liu

Figure 1 for DDistill-SR: Reparameterized Dynamic Distillation Network for Lightweight Image Super-Resolution

Figure 2 for DDistill-SR: Reparameterized Dynamic Distillation Network for Lightweight Image Super-Resolution

Figure 3 for DDistill-SR: Reparameterized Dynamic Distillation Network for Lightweight Image Super-Resolution

Figure 4 for DDistill-SR: Reparameterized Dynamic Distillation Network for Lightweight Image Super-Resolution

Abstract:Recent research on deep convolutional neural networks (CNNs) has provided a significant performance boost on efficient super-resolution (SR) tasks by trading off the performance and applicability. However, most existing methods focus on subtracting feature processing consumption to reduce the parameters and calculations without refining the immediate features, which leads to inadequate information in the restoration. In this paper, we propose a lightweight network termed DDistill-SR, which significantly improves the SR quality by capturing and reusing more helpful information in a static-dynamic feature distillation manner. Specifically, we propose a plug-in reparameterized dynamic unit (RDU) to promote the performance and inference cost trade-off. During the training phase, the RDU learns to linearly combine multiple reparameterizable blocks by analyzing varied input statistics to enhance layer-level representation. In the inference phase, the RDU is equally converted to simple dynamic convolutions that explicitly capture robust dynamic and static feature maps. Then, the information distillation block is constructed by several RDUs to enforce hierarchical refinement and selective fusion of spatial context information. Furthermore, we propose a dynamic distillation fusion (DDF) module to enable dynamic signals aggregation and communication between hierarchical modules to further improve performance. Empirical results show that our DDistill-SR outperforms the baselines and achieves state-of-the-art results on most super-resolution domains with much fewer parameters and less computational overhead. We have released the code of DDistill-SR at https://github.com/icandle/DDistill-SR.

* IEEE Transactions on Multimedia, 25, 7222-7234 (2023)
* Accepted by IEEE Transactions on Multimedia (TMM)

Via

Access Paper or Ask Questions

FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

Dec 14, 2023

Liping Yi, Han Yu, Zhuan Shi, Gang Wang, Xiaoguang Liu

Figure 1 for FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

Figure 2 for FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

Figure 3 for FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

Figure 4 for FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

Abstract:Federated learning (FL) is a privacy-preserving collaboratively machine learning paradigm. Traditional FL requires all data owners (a.k.a. FL clients) to train the same local model. This design is not well-suited for scenarios involving data and/or system heterogeneity. Model-Heterogeneous Personalized FL (MHPFL) has emerged to address this challenge. Existing MHPFL approaches often rely on having a public dataset with the same nature of the learning task, or incur high computation and communication costs. To address these limitations, we propose the Federated Semantic Similarity Aggregation (FedSSA) approach, which splits each client's model into a heterogeneous (structure-different) feature extractor and a homogeneous (structure-same) classification header. It performs local-to-global knowledge transfer via semantic similarity-based header parameter aggregation. In addition, global-to-local knowledge transfer is achieved via an adaptive parameter stabilization strategy which fuses the seen-class parameters of historical local headers with that of the latest global header for each client. In this way, FedSSA does not rely on public datasets, while only requiring partial header parameter transmission (thereby saving costs). Theoretical analysis proves the convergence of FedSSA. Extensive experiments demonstrate that FedSSA achieves up to $3.62 \times\%$ higher accuracy, $15.54$ times higher communication efficiency, and $15.52 \times$ higher computational efficiency compared to 7 state-of-the-art MHPFL baselines.

* arXiv admin note: text overlap with arXiv:2311.06879

Via

Access Paper or Ask Questions

On a Foundation Model for Operating Systems

Dec 13, 2023

Divyanshu Saxena, Nihal Sharma, Donghyun Kim, Rohit Dwivedula, Jiayi Chen, Chenxi Yang, Sriram Ravula, Zichao Hu, Aditya Akella, Sebastian Angel(+8 more)

Figure 1 for On a Foundation Model for Operating Systems

Figure 2 for On a Foundation Model for Operating Systems

Abstract:This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and creating the next generation of operating systems for the evolving computing landscape.

* Machine Learning for Systems Workshop at 37th NeurIPS Conference, 2023, New Orleans, LA, USA

Via

Access Paper or Ask Questions

pFedES: Model Heterogeneous Personalized Federated Learning with Feature Extractor Sharing

Nov 12, 2023

Liping Yi, Han Yu, Gang Wang, Xiaoguang Liu

Figure 1 for pFedES: Model Heterogeneous Personalized Federated Learning with Feature Extractor Sharing

Figure 2 for pFedES: Model Heterogeneous Personalized Federated Learning with Feature Extractor Sharing

Figure 3 for pFedES: Model Heterogeneous Personalized Federated Learning with Feature Extractor Sharing

Figure 4 for pFedES: Model Heterogeneous Personalized Federated Learning with Feature Extractor Sharing

Abstract:As a privacy-preserving collaborative machine learning paradigm, federated learning (FL) has attracted significant interest from academia and the industry alike. To allow each data owner (a.k.a., FL clients) to train a heterogeneous and personalized local model based on its local data distribution, system resources and requirements on model structure, the field of model-heterogeneous personalized federated learning (MHPFL) has emerged. Existing MHPFL approaches either rely on the availability of a public dataset with special characteristics to facilitate knowledge transfer, incur high computation and communication costs, or face potential model leakage risks. To address these limitations, we propose a model-heterogeneous personalized Federated learning approach based on feature Extractor Sharing (pFedES). It incorporates a small homogeneous feature extractor into each client's heterogeneous local model. Clients train them via the proposed iterative learning method to enable the exchange of global generalized knowledge and local personalized knowledge. The small local homogeneous extractors produced after local training are uploaded to the FL server and for aggregation to facilitate easy knowledge sharing among clients. We theoretically prove that pFedES can converge over wall-to-wall time. Extensive experiments on two real-world datasets against six state-of-the-art methods demonstrate that pFedES builds the most accurate model, while incurring low communication and computation costs. Compared with the best-performing baseline, it achieves 1.61% higher test accuracy, while reducing communication and computation costs by 99.6% and 82.9%, respectively.

* 12 pages, 10 figures. arXiv admin note: text overlap with arXiv:2310.13283

Via

Access Paper or Ask Questions

Learning Agility and Adaptive Legged Locomotion via Curricular Hindsight Reinforcement Learning

Oct 24, 2023

Sicen Li, Yiming Pang, Panju Bai, Zhaojin Liu, Jiawei Li, Shihao Hu, Liquan Wang, Gang Wang

Figure 1 for Learning Agility and Adaptive Legged Locomotion via Curricular Hindsight Reinforcement Learning

Figure 2 for Learning Agility and Adaptive Legged Locomotion via Curricular Hindsight Reinforcement Learning

Figure 3 for Learning Agility and Adaptive Legged Locomotion via Curricular Hindsight Reinforcement Learning

Figure 4 for Learning Agility and Adaptive Legged Locomotion via Curricular Hindsight Reinforcement Learning

Abstract:Agile and adaptive maneuvers such as fall recovery, high-speed turning, and sprinting in the wild are challenging for legged systems. We propose a Curricular Hindsight Reinforcement Learning (CHRL) that learns an end-to-end tracking controller that achieves powerful agility and adaptation for the legged robot. The two key components are (I) a novel automatic curriculum strategy on task difficulty and (ii) a Hindsight Experience Replay strategy adapted to legged locomotion tasks. We demonstrated successful agile and adaptive locomotion on a real quadruped robot that performed fall recovery autonomously, coherent trotting, sustained outdoor speeds up to 3.45 m/s, and tuning speeds up to 3.2 rad/s. This system produces adaptive behaviours responding to changing situations and unexpected disturbances on natural terrains like grass and dirt.

Via

Access Paper or Ask Questions

FedLoRA: Model-Heterogeneous Personalized Federated Learning with LoRA Tuning

Oct 20, 2023

Liping Yi, Han Yu, Gang Wang, Xiaoguang Liu

Figure 1 for FedLoRA: Model-Heterogeneous Personalized Federated Learning with LoRA Tuning

Figure 2 for FedLoRA: Model-Heterogeneous Personalized Federated Learning with LoRA Tuning

Figure 3 for FedLoRA: Model-Heterogeneous Personalized Federated Learning with LoRA Tuning

Figure 4 for FedLoRA: Model-Heterogeneous Personalized Federated Learning with LoRA Tuning

Abstract:Federated learning (FL) is an emerging machine learning paradigm in which a central server coordinates multiple participants (a.k.a. FL clients) to train a model collaboratively on decentralized data with privacy protection. This paradigm constrains that all clients have to train models with the same structures (homogeneous). In practice, FL often faces statistical heterogeneity, system heterogeneity and model heterogeneity challenges. These challenging issues inspire the field of Model-Heterogeneous Personalized Federated Learning (MHPFL) which aims to train a personalized and heterogeneous local model for each FL client. Existing MHPFL approaches cannot achieve satisfactory model performance, acceptable computational overhead and efficient communication simultaneously. To bridge this gap, we propose a novel computation- and communication-efficient model-heterogeneous personalized Federated learning framework based on LoRA tuning (FedLoRA). It is designed to incorporate a homogeneous small adapter for each client's heterogeneous local model. Both models are trained following the proposed iterative training for global-local knowledge exchange. The homogeneous small local adapters are sent to the FL server to be aggregated into a global adapter. In this way, FL clients can train heterogeneous local models without incurring high computation and communication costs. We theoretically prove the non-convex convergence rate of FedLoRA. Extensive experiments on two real-world datasets demonstrate that FedLoRA outperforms six state-of-the-art baselines, beating the best approach by 1.35% in terms of test accuracy, 11.81 times computation overhead reduction and 7.41 times communication cost saving.

* 11 pages, 11 figures

Via

Access Paper or Ask Questions

ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking

Oct 16, 2023

Yutong Kou, Jin Gao, Bing Li, Gang Wang, Weiming Hu, Yizheng Wang, Liang Li

Figure 1 for ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking

Figure 2 for ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking

Figure 3 for ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking

Figure 4 for ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking

Abstract:Recently, the transformer has enabled the speed-oriented trackers to approach state-of-the-art (SOTA) performance with high-speed thanks to the smaller input size or the lighter feature extraction backbone, though they still substantially lag behind their corresponding performance-oriented versions. In this paper, we demonstrate that it is possible to narrow or even close this gap while achieving high tracking speed based on the smaller input size. To this end, we non-uniformly resize the cropped image to have a smaller input size while the resolution of the area where the target is more likely to appear is higher and vice versa. This enables us to solve the dilemma of attending to a larger visual field while retaining more raw information for the target despite a smaller input size. Our formulation for the non-uniform resizing can be efficiently solved through quadratic programming (QP) and naturally integrated into most of the crop-based local trackers. Comprehensive experiments on five challenging datasets based on two kinds of transformer trackers, \ie, OSTrack and TransT, demonstrate consistent improvements over them. In particular, applying our method to the speed-oriented version of OSTrack even outperforms its performance-oriented counterpart by 0.6% AUC on TNL2K, while running 50% faster and saving over 55% MACs. Codes and models are available at https://github.com/Kou-99/ZoomTrack.

* 19 pages, 7 figures, Accepted by NeurIPS 2023 as a Spotlight

Via

Access Paper or Ask Questions