Rui Wang

Jack

Trajectory Volatility for Out-of-Distribution Detection in Mathematical Reasoning

May 22, 2024

Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Zhuosheng Zhang, Rui Wang

Abstract: Real-world data that deviate from the independent and identically distributed (i.i.d.) assumption of in-distribution training data pose security threats to deep networks, which has motivated work on out-of-distribution (OOD) detection algorithms. Detection methods for generative language models (GLMs) mainly focus on uncertainty estimation and embedding distance measurement, with the latter proven to be most effective in traditional linguistic tasks like summarization and translation. However, another complex generative scenario, mathematical reasoning, poses significant challenges to embedding-based methods because its output space is high-density; at the same time, this feature causes larger discrepancies in the embedding shift trajectory between different samples in latent spaces. Hence, we propose a trajectory-based method, TV score, which uses trajectory volatility for OOD detection in mathematical reasoning. Experiments show that our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios and can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.

* 27 pages, 6 figures, 12 tables

Ultra-Fast Adaptive Track Detection Network

May 22, 2024

Hai Ni, Rui Wang, Scarlett Liu

Abstract: Railway detection is critical for the automation of railway systems. Existing models often prioritize either speed or accuracy, but achieving both remains a challenge. To address the limitations of presetting anchor groups that struggle with varying track proportions from different camera angles, an ultra-fast adaptive track detection network is proposed in this paper. This network comprises a backbone network and two specialized branches (Horizontal Coordinate Locator and Perspective Identifier). The Perspective Identifier selects the suitable anchor group from preset anchor groups, thereby determining the row coordinates of the railway track. Subsequently, the Horizontal Coordinate Locator provides row classification results based on multiple preset anchor groups. Then, utilizing the results from the Perspective Identifier, it generates the column coordinates of the railway track. This network is evaluated on multiple datasets, with the lightweight version achieving an F1 score of 98.68% on the SRail dataset and a detection rate of up to 473 FPS. Compared with state-of-the-art (SOTA) methods, the proposed model is competitive in both speed and accuracy. The dataset and code are available at https://github.com/idnihai/UFATD
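
As a reading aid, the two-branch decoding described in this abstract might be sketched as follows. This is a minimal illustration, not the authors' implementation (see their repository for that): the function name `decode_track`, the input layout, and the use of a plain argmax for both the group selection and the per-row column choice are all assumptions made here.

```python
from typing import List, Tuple


def decode_track(
    group_scores: List[float],
    anchor_rows: List[List[int]],
    col_logits: List[List[List[float]]],
) -> List[Tuple[int, int]]:
    """Return (row, column) points for one rail.

    group_scores: one score per preset anchor group (hypothetical
        Perspective Identifier output).
    anchor_rows: row coordinates of each preset anchor group.
    col_logits: per group, per row anchor, one logit per column cell
        (hypothetical Horizontal Coordinate Locator output).
    """
    # Assumed selection rule: take the highest-scoring anchor group.
    best = max(range(len(group_scores)), key=group_scores.__getitem__)
    points = []
    # The selected group fixes the row coordinates; the column for each
    # row is read from that group's row-classification logits.
    for row, logits in zip(anchor_rows[best], col_logits[best]):
        col = max(range(len(logits)), key=logits.__getitem__)
        points.append((row, col))
    return points


points = decode_track(
    group_scores=[0.1, 0.9],
    anchor_rows=[[10, 20], [30, 40]],
    col_logits=[
        [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
        [[0.0, 0.0, 5.0], [2.0, 1.0, 0.0]],
    ],
)
```

Here the second anchor group wins, so only its rows and logits contribute to the decoded points.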

Floor-Plan-aided Indoor Localization: Zero-Shot Learning Framework, Data Sets, and Prototype

May 22, 2024

Haiyao Yu, Changyang She, Yunkai Hu, Geng Wang, Rui Wang, Branka Vucetic, Yonghui Li

Abstract: Machine learning has been considered a promising approach for indoor localization. Nevertheless, sample efficiency, scalability, and generalization ability remain open issues in implementing learning-based algorithms in practical systems. In this paper, we establish a zero-shot learning framework that does not need real-world measurements in a new communication environment. Specifically, a graph neural network that is scalable to the number of access points (APs) and mobile devices (MDs) is used for obtaining coarse locations of MDs. Based on the coarse locations, the floor-plan image between an MD and an AP is exploited to improve localization accuracy in a floor-plan-aided deep neural network. To further improve the generalization ability, we develop a synthetic data generator that provides synthetic data samples in different scenarios, where real-world samples are not available. We implement the framework in a prototype that estimates the locations of MDs. Experimental results show that our zero-shot learning method can reduce localization errors by around 30% to 55% compared with three baselines from the existing literature.

Multiple-Choice Questions are Efficient and Robust LLM Evaluators

May 21, 2024

Ziyin Zhang, Lizhen Xu, Zhaokun Jiang, Hongkun Hao, Rui Wang

Abstract: We present GSM-MC and MATH-MC, two multiple-choice (MC) datasets constructed by collecting answers and incorrect predictions on GSM8K and MATH from over 50 open-source models. Through extensive experiments, we show that LLMs' performance on the MC versions of these two popular benchmarks is strongly correlated with their performance on the original versions, and is quite robust to distractor choices and option orders, while the evaluation time is reduced by a factor of up to 30. Following a similar procedure, we also introduce PythonIO, a new program output prediction MC dataset constructed from two other popular LLM evaluation benchmarks, HumanEval and MBPP. Our data and code are available at https://github.com/Geralt-Targaryen/MC-Evaluation.

* data at https://github.com/Geralt-Targaryen/MC-Evaluation

When Large Language Model Meets Optimization

May 16, 2024

Sen Huang, Kaixiang Yang, Sheng Qi, Rui Wang

Abstract: Optimization algorithms and large language models (LLMs) enhance decision-making in dynamic environments by integrating artificial intelligence with traditional techniques. LLMs, with extensive domain knowledge, facilitate intelligent modeling and strategic decision-making in optimization, while optimization algorithms refine LLM architectures and output quality. This synergy offers novel approaches for advancing general AI, addressing both the computational challenges of complex problems and the application of LLMs in practical scenarios. This review outlines the progress and potential of combining LLMs with optimization algorithms, providing insights for future research directions.

Dual-Segment Clustering Strategy for Federated Learning in Heterogeneous Environments

May 15, 2024

Pengcheng Sun, Erwu Liu, Wei Ni, Kanglei Yu, Rui Wang, Abbas Jamalipour

Abstract: Federated learning (FL) is a distributed machine learning paradigm with high efficiency and low communication load, transmitting only the parameters or gradients of the network. However, non-independent and identically distributed (Non-IID) data have a negative impact on this paradigm. Furthermore, the heterogeneity of communication quality significantly affects the accuracy of parameter transmission, degrading the performance of the FL system or even preventing its convergence. This letter proposes a dual-segment clustering (DSC) strategy, which first clusters the clients according to the heterogeneous communication conditions and then performs a second clustering by sample size and label distribution, so as to address both data and communication heterogeneity. Experimental results show that the proposed DSC strategy improves the convergence rate of FL and achieves higher accuracy in a heterogeneous environment than the classical clustering algorithm.
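
To make the two-stage idea in this abstract concrete, a deliberately simplified sketch follows. It is not the DSC algorithm from the letter: the abstract does not say how either clustering step is computed, so fixed thresholds and the dominant label stand in for real clustering here. The field names (`snr_db`, `label_counts`), the thresholds, and the function name are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def dual_segment_groups(
    clients: List[dict],
    snr_threshold: float = 10.0,   # assumed cut-off for "good" links
    size_threshold: int = 100,     # assumed cut-off for "large" datasets
) -> Dict[Tuple[str, str, int], List[str]]:
    """Group client ids by link quality first, then by data profile."""
    groups: Dict[Tuple[str, str, int], List[str]] = defaultdict(list)
    for client in clients:
        # Segment 1: communication condition (threshold instead of clustering).
        comm = "good" if client["snr_db"] >= snr_threshold else "poor"
        # Segment 2: sample size and label distribution, reduced here to a
        # size bucket plus the most frequent label.
        counts = client["label_counts"]
        size = "large" if sum(counts.values()) >= size_threshold else "small"
        dominant = max(counts, key=counts.get)
        groups[(comm, size, dominant)].append(client["id"])
    return dict(groups)


clients = [
    {"id": "a", "snr_db": 15.0, "label_counts": {0: 80, 1: 40}},
    {"id": "b", "snr_db": 5.0, "label_counts": {0: 10, 1: 20}},
    {"id": "c", "snr_db": 12.0, "label_counts": {0: 90, 1: 30}},
]
groups = dual_segment_groups(clients)
```

Clients `a` and `c` share a good link and a similar data profile, so they land in the same group, while `b` is separated on both segments.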

Sensing-Assisted Adaptive Channel Contention for Mobile Delay-Sensitive Communications

May 10, 2024

Bojie Lv, Qianren Li, Rui Wang

Abstract: This paper proposes an adaptive channel contention mechanism to optimize the queuing performance of a distributed millimeter wave (mmWave) uplink system with the capability of environment and mobility sensing. The mobile agents determine their back-off timer parameters according to their local knowledge of the uplink queue lengths, channel quality, and future channel statistics, where the channel prediction relies on the environment and mobility sensing. The optimization of queuing performance with this adaptive channel contention mechanism is formulated as a decentralized multi-agent Markov decision process (MDP). Although the channel contention actions are determined locally at the mobile agents, the optimization of local channel contention policies of all mobile agents is conducted in a centralized manner according to the system statistics before the scheduling. In the solution, the local policies are approximated by analytical models, and the optimization of their parameters becomes a stochastic optimization problem along an adaptive Markov chain. An unbiased gradient estimation is proposed so that the local policies can be optimized efficiently via the stochastic gradient descent method. It is demonstrated by simulation that the proposed gradient estimation is significantly more efficient in optimization than the existing methods, e.g., simultaneous perturbation stochastic approximation (SPSA).

ReinWiFi: A Reinforcement-Learning-Based Framework for the Application-Layer QoS Optimization of WiFi Networks

May 06, 2024

Qianren Li, Bojie Lv, Yuncong Hong, Rui Wang

Abstract: In this paper, a reinforcement-learning-based scheduling framework is proposed and implemented to optimize the application-layer quality-of-service (QoS) of a practical wireless local area network (WLAN) suffering from unknown interference. In particular, application-layer tasks of file delivery and delay-sensitive communication, e.g., screen projection, in a WLAN with the enhanced distributed channel access (EDCA) mechanism are jointly scheduled by adjusting the contention window sizes and the application-layer throughput limitation, such that their QoS, including the throughput of file delivery and the round-trip time of the delay-sensitive communication, can be optimized. Due to the unknown interference and the vendor-dependent implementation of the network interface card, the relation between the scheduling policy and the system QoS is unknown. Hence, a reinforcement learning method is proposed, in which a novel Q-network is trained to map from the historical scheduling parameters and QoS observations to the current scheduling action. It is demonstrated on a testbed that the proposed framework can achieve a significantly better QoS than the conventional EDCA mechanism.

A Comprehensive Survey of Dynamic Graph Neural Networks: Models, Frameworks, Benchmarks, Experiments and Challenges

May 01, 2024

ZhengZhao Feng, Rui Wang, TianXing Wang, Mingli Song, Sai Wu, Shuibing He

Abstract: Dynamic Graph Neural Networks (GNNs) combine temporal information with GNNs to capture structural, temporal, and contextual relationships in dynamic graphs simultaneously, leading to enhanced performance in various applications. As the demand for dynamic GNNs continues to grow, numerous models and frameworks have emerged to cater to different application needs. There is a pressing need for a comprehensive survey that evaluates the performance, strengths, and limitations of various approaches in this domain. This paper aims to fill this gap by offering a thorough comparative analysis and experimental evaluation of dynamic GNNs. It covers 81 dynamic GNN models with a novel taxonomy, 12 dynamic GNN training frameworks, and commonly used benchmarks. We also report experimental results from testing nine representative dynamic GNN models and three frameworks on six standard graph datasets. Evaluation metrics focus on convergence accuracy, training efficiency, and GPU memory usage, enabling a thorough comparison of performance across various models and frameworks. From the analysis and evaluation results, we identify key challenges and offer principles for future research to enhance the design of models and frameworks in the dynamic GNN field.

* Under review at PVLDB 2025

PAD: Patch-Agnostic Defense against Adversarial Patch Attacks

Apr 25, 2024

Lihua Jing, Rui Wang, Wenqi Ren, Xin Dong, Cong Zou

Abstract: Adversarial patch attacks present a significant threat to real-world object detectors due to their practical feasibility. Existing defense methods, which rely on attack data or prior knowledge, struggle to effectively address a wide range of adversarial patches. In this paper, we show two inherent characteristics of adversarial patches, semantic independence and spatial heterogeneity, that hold regardless of their appearance, shape, size, quantity, and location. Semantic independence indicates that adversarial patches operate autonomously within their semantic context, while spatial heterogeneity manifests as an image quality in the patch area that differs from that of the original clean image, owing to the independent generation process. Based on these observations, we propose PAD, a novel adversarial patch localization and removal method that does not require prior knowledge or additional training. PAD offers patch-agnostic defense against various adversarial patches and is compatible with any pre-trained object detector. Our comprehensive digital and physical experiments involving diverse patch types, such as localized noise, printable, and naturalistic patches, exhibit notable improvements over state-of-the-art works. Our code is available at https://github.com/Lihua-Jing/PAD.

* Accepted by CVPR 2024
