Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiao Li

A Generalized Version of Chung's Lemma and its Applications

Jun 09, 2024

Li Jiang, Xiao Li, Andre Milzarek, Junwen Qiu

Figure 1 for A Generalized Version of Chung's Lemma and its Applications

Figure 2 for A Generalized Version of Chung's Lemma and its Applications

Figure 3 for A Generalized Version of Chung's Lemma and its Applications

Abstract:Chung's lemma is a classical tool for establishing asymptotic convergence rates of (stochastic) optimization methods under strong convexity-type assumptions and appropriate polynomial diminishing step sizes. In this work, we develop a generalized version of Chung's lemma, which provides a simple non-asymptotic convergence framework for a more general family of step size rules. We demonstrate broad applicability of the proposed generalized Chung's lemma by deriving tight non-asymptotic convergence rates for a large variety of stochastic methods. In particular, we obtain partially new non-asymptotic complexity results for stochastic optimization methods, such as stochastic gradient descent and random reshuffling, under a general $(\theta,\mu)$-Polyak-Lojasiewicz (PL) condition and for various step sizes strategies, including polynomial, constant, exponential, and cosine step sizes rules. Notably, as a by-product of our analysis, we observe that exponential step sizes can adapt to the objective function's geometry, achieving the optimal convergence rate without requiring exact knowledge of the underlying landscape. Our results demonstrate that the developed variant of Chung's lemma offers a versatile, systematic, and streamlined approach to establish non-asymptotic convergence rates under general step size rules.

* 43 pages, 5 figures

Via

Access Paper or Ask Questions

A KL-based Analysis Framework with Applications to Non-Descent Optimization Methods

Jun 04, 2024

Junwen Qiu, Bohao Ma, Xiao Li, Andre Milzarek

Abstract:We propose a novel analysis framework for non-descent-type optimization methodologies in nonconvex scenarios based on the Kurdyka-Lojasiewicz property. Our framework allows covering a broad class of algorithms, including those commonly employed in stochastic and distributed optimization. Specifically, it enables the analysis of first-order methods that lack a sufficient descent property and do not require access to full (deterministic) gradient information. We leverage this framework to establish, for the first time, iterate convergence and the corresponding rates for the decentralized gradient method and federated averaging under mild assumptions. Furthermore, based on the new analysis techniques, we show the convergence of the random reshuffling and stochastic gradient descent method without necessitating typical a priori bounded iterates assumptions.

* 29 pages

Via

Access Paper or Ask Questions

Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model

Jun 01, 2024

Jinyin Chen, Xiaoming Zhao, Haibin Zheng, Xiao Li, Sheng Xiang, Haifeng Guo

Figure 1 for Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model

Figure 2 for Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model

Figure 3 for Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model

Figure 4 for Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model

Abstract:Benefiting from well-trained deep neural networks (DNNs), model compression have captured special attention for computing resource limited equipment, especially edge devices. Knowledge distillation (KD) is one of the widely used compression techniques for edge deployment, by obtaining a lightweight student model from a well-trained teacher model released on public platforms. However, it has been empirically noticed that the backdoor in the teacher model will be transferred to the student model during the process of KD. Although numerous KD methods have been proposed, most of them focus on the distillation of a high-performing student model without robustness consideration. Besides, some research adopts KD techniques as effective backdoor mitigation tools, but they fail to perform model compression at the same time. Consequently, it is still an open problem to well achieve two objectives of robust KD, i.e., student model's performance and backdoor mitigation. To address these issues, we propose RobustKD, a robust knowledge distillation that compresses the model while mitigating backdoor based on feature variance. Specifically, RobustKD distinguishes the previous works in three key aspects: (1) effectiveness: by distilling the feature map of the teacher model after detoxification, the main task performance of the student model is comparable to that of the teacher model; (2) robustness: by reducing the characteristic variance between the teacher model and the student model, it mitigates the backdoor of the student model under backdoored teacher model scenario; (3) generic: RobustKD still has good performance in the face of multiple data models (e.g., WRN 28-4, Pyramid-200) and diverse DNNs (e.g., ResNet50, MobileNet).

Via

Access Paper or Ask Questions

Leveraging Generative AI for Smart City Digital Twins: A Survey on the Autonomous Generation of Data, Scenarios, 3D City Models, and Urban Designs

May 29, 2024

Haowen Xu, Femi Omitaomu, Soheil Sabri, Xiao Li, Yongze Song

Abstract:The digital transformation of modern cities by integrating advanced information, communication, and computing technologies has marked the epoch of data-driven smart city applications for efficient and sustainable urban management. Despite their effectiveness, these applications often rely on massive amounts of high-dimensional and multi-domain data for monitoring and characterizing different urban sub-systems, presenting challenges in application areas that are limited by data quality and availability, as well as costly efforts for generating urban scenarios and design alternatives. As an emerging research area in deep learning, Generative Artificial Intelligence (AI) models have demonstrated their unique values in data and code generation. This survey paper aims to explore the innovative integration of generative AI techniques and urban digital twins to address challenges in the realm of smart cities in various urban sectors, such as transportation and mobility management, energy system operations, building and infrastructure management, and urban design. The survey starts with the introduction of popular generative AI models with their application areas, followed by a structured review of the existing urban science applications that leverage the autonomous capability of the generative AI techniques to facilitate (a) data augmentation for promoting urban monitoring and predictive analytics, (b) synthetic data and scenario generation, (c) automated 3D city modeling, and (d) generative urban design and optimization. Based on the review, this survey discusses potential opportunities and technical strategies that integrate generative AI models into the next-generation urban digital twins for more reliable, scalable, and automated management of smart cities.

Via

Access Paper or Ask Questions

RaFe: Ranking Feedback Improves Query Rewriting for RAG

May 23, 2024

Shengyu Mao, Yong Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

Figure 1 for RaFe: Ranking Feedback Improves Query Rewriting for RAG

Figure 2 for RaFe: Ranking Feedback Improves Query Rewriting for RAG

Figure 3 for RaFe: Ranking Feedback Improves Query Rewriting for RAG

Figure 4 for RaFe: Ranking Feedback Improves Query Rewriting for RAG

Abstract:As Large Language Models (LLMs) and Retrieval Augmentation Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into the RAG system for downstream tasks like open-domain QA. Many works have attempted to utilize small models with reinforcement learning rather than costly LLMs to improve query rewriting. However, current methods require annotations (e.g., labeled relevant documents or downstream answers) or predesigned rewards for feedback, which lack generalization, and fail to utilize signals tailored for query rewriting. In this paper, we propose ours, a framework for training query rewriting models free of annotations. By leveraging a publicly available reranker, ours~provides feedback aligned well with the rewriting objectives. Experimental results demonstrate that ours~can obtain better performance than baselines.

* 16 pages

Via

Access Paper or Ask Questions

Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance

May 20, 2024

Rameez Raja Kureshi, Bhupesh Kumar Mishra, Dhavalkumar Thakker, Suvodeep Mazumdar, Xiao Li

Figure 1 for Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance

Figure 2 for Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance

Figure 3 for Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance

Figure 4 for Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance

Abstract:The detrimental effects of air pollutants on human health have prompted increasing concerns regarding indoor air quality (IAQ). The emergence of digital health interventions and citizen science initiatives has provided new avenues for raising awareness, improving IAQ, and promoting behavioural changes. The Technology Acceptance Model (TAM) offers a theoretical framework to understand user acceptance and adoption of IAQ technology. This paper presents a case study using the COM-B model and Internet of Things (IoT) technology to design a human-centred digital visualisation platform, leading to behavioural changes and improved IAQ. The study also investigates users' acceptance and adoption of the technology, focusing on their experiences, expectations, and the impact on IAQ. Integrating IAQ sensing, digital health-related interventions, citizen science, and the TAM model offers opportunities to address IAQ challenges, enhance public health, and foster sustainable indoor environments. The analytical results show that factors such as human behaviour, indoor activities, and awareness play crucial roles in shaping IAQ.

* 10 pages

Via

Access Paper or Ask Questions

A Minimal Set of Parameters Based Depth-Dependent Distortion Model and Its Calibration Method for Stereo Vision Systems

May 01, 2024

Xin Ma, Puchen Zhu, Xiao Li, Xiaoyin Zheng, Jianshu Zhou, Xuchen Wang, Kwok Wai Samuel Au

Figure 1 for A Minimal Set of Parameters Based Depth-Dependent Distortion Model and Its Calibration Method for Stereo Vision Systems

Figure 2 for A Minimal Set of Parameters Based Depth-Dependent Distortion Model and Its Calibration Method for Stereo Vision Systems

Figure 3 for A Minimal Set of Parameters Based Depth-Dependent Distortion Model and Its Calibration Method for Stereo Vision Systems

Figure 4 for A Minimal Set of Parameters Based Depth-Dependent Distortion Model and Its Calibration Method for Stereo Vision Systems

Abstract:Depth position highly affects lens distortion, especially in close-range photography, which limits the measurement accuracy of existing stereo vision systems. Moreover, traditional depth-dependent distortion models and their calibration methods have remained complicated. In this work, we propose a minimal set of parameters based depth-dependent distortion model (MDM), which considers the radial and decentering distortions of the lens to improve the accuracy of stereo vision systems and simplify their calibration process. In addition, we present an easy and flexible calibration method for the MDM of stereo vision systems with a commonly used planar pattern, which requires cameras to observe the planar pattern in different orientations. The proposed technique is easy to use and flexible compared with classical calibration techniques for depth-dependent distortion models in which the lens must be perpendicular to the planar pattern. The experimental validation of the MDM and its calibration method showed that the MDM improved the calibration accuracy by 56.55% and 74.15% compared with the Li's distortion model and traditional Brown's distortion model. Besides, an iteration-based reconstruction method is proposed to iteratively estimate the depth information in the MDM during three-dimensional reconstruction. The results showed that the accuracy of the iteration-based reconstruction method was improved by 9.08% compared with that of the non-iteration reconstruction method.

* This paper has been accepted for publication in IEEE Transactions on Instrumentation and Measurement

Via

Access Paper or Ask Questions

Machine Learning Techniques for Data Reduction of Climate Applications

May 01, 2024

Xiao Li, Qian Gong, Jaemoon Lee, Scott Klasky, Anand Rangarajan, Sanjay Ranka

Abstract:Scientists conduct large-scale simulations to compute derived quantities-of-interest (QoI) from primary data. Often, QoI are linked to specific features, regions, or time intervals, such that data can be adaptively reduced without compromising the integrity of QoI. For many spatiotemporal applications, these QoI are binary in nature and represent presence or absence of a physical phenomenon. We present a pipelined compression approach that first uses neural-network-based techniques to derive regions where QoI are highly likely to be present. Then, we employ a Guaranteed Autoencoder (GAE) to compress data with differential error bounds. GAE uses QoI information to apply low-error compression to only these regions. This results in overall high compression ratios while still achieving downstream goals of simulation or data collections. Experimental results are presented for climate data generated from the E3SM Simulation model for downstream quantities such as tropical cyclone and atmospheric river detection and tracking. These results show that our approach is superior to comparable methods in the literature.

* 7 pages. arXiv admin note: text overlap with arXiv:2404.18063

Via

Access Paper or Ask Questions

Semantic Satellite Communications Based on Generative Foundation Model

Apr 18, 2024

Peiwen Jiang, Chao-Kai Wen, Xiao Li, Shi Jin, Geoffrey Ye Li

Figure 1 for Semantic Satellite Communications Based on Generative Foundation Model

Figure 2 for Semantic Satellite Communications Based on Generative Foundation Model

Figure 3 for Semantic Satellite Communications Based on Generative Foundation Model

Figure 4 for Semantic Satellite Communications Based on Generative Foundation Model

Abstract:Satellite communications can provide massive connections and seamless coverage, but they also face several challenges, such as rain attenuation, long propagation delays, and co-channel interference. To improve transmission efficiency and address severe scenarios, semantic communication has become a popular choice, particularly when equipped with foundation models (FMs). In this study, we introduce an FM-based semantic satellite communication framework, termed FMSAT. This framework leverages FM-based segmentation and reconstruction to significantly reduce bandwidth requirements and accurately recover semantic features under high noise and interference. Considering the high speed of satellites, an adaptive encoder-decoder is proposed to protect important features and avoid frequent retransmissions. Meanwhile, a well-received image can provide a reference for repairing damaged images under sudden attenuation. Since acknowledgment feedback is subject to long propagation delays when retransmission is unavoidable, a novel error detection method is proposed to roughly detect semantic errors at the regenerative satellite. With the proposed detectors at both the satellite and the gateway, the quality of the received images can be ensured. The simulation results demonstrate that the proposed method can significantly reduce bandwidth requirements, adapt to complex satellite scenarios, and protect semantic information with an acceptable transmission delay.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Via

Access Paper or Ask Questions

Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning

Apr 17, 2024

Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang

Abstract:Key Point Analysis (KPA), the summarization of multiple arguments into a concise collection of key points, continues to be a significant and unresolved issue within the field of argument mining. Existing models adapt a two-stage pipeline of clustering arguments or generating key points for argument clusters. This approach rely on semantic similarity instead of measuring the existence of shared key points among arguments. Additionally, it only models the intra-cluster relationship among arguments, disregarding the inter-cluster relationship between arguments that do not share key points. To address these limitations, we propose a novel approach for KPA with pairwise generation and graph partitioning. Our objective is to train a generative model that can simultaneously provide a score indicating the presence of shared key point between a pair of arguments and generate the shared key point. Subsequently, to map generated redundant key points to a concise set of key points, we proceed to construct an arguments graph by considering the arguments as vertices, the generated key points as edges, and the scores as edge weights. We then propose a graph partitioning algorithm to partition all arguments sharing the same key points to the same subgraph. Notably, our experimental findings demonstrate that our proposed model surpasses previous models when evaluated on both the ArgKP and QAM datasets.

* 11 pages, 4 figures, 4 tables. Accepted to NAACL 2024

Via

Access Paper or Ask Questions