"Time": models, code, and papers

A Reinforcement Learning Approach for Dynamic Rebalancing in Bike-Sharing System

Feb 05, 2024
Jiaqi Liang, Sanjay Dominik Jena, Defeng Liu, Andrea Lodi

Bike-Sharing Systems provide eco-friendly urban mobility, contributing to the alleviation of traffic congestion and to healthier lifestyles. Efficiently operating such systems and maintaining high customer satisfaction is challenging due to the stochastic nature of trip demand, which leads to full or empty stations. Devising effective rebalancing strategies using vehicles to redistribute bikes among stations is therefore of utmost importance for operators. As a promising alternative to classical mathematical optimization, reinforcement learning is gaining ground for solving sequential decision-making problems. This paper introduces a spatio-temporal reinforcement learning algorithm for the dynamic rebalancing problem with multiple vehicles. We first formulate the problem as a Multi-agent Markov Decision Process in a continuous time framework. This allows for independent and cooperative vehicle rebalancing, eliminating the impractical restriction of time-discretized models where vehicle departures are synchronized. A comprehensive simulator under the first-arrive-first-serve rule is then developed to facilitate the learning process by computing immediate rewards under diverse demand scenarios. To estimate the value function and learn the rebalancing policy, various Deep Q-Network configurations are tested, minimizing the lost demand. Experiments are carried out on various datasets generated from historical data, affected by both temporal and weather factors. The proposed algorithms outperform benchmarks, including a multi-period Mixed-Integer Programming model, in terms of lost demand. Once trained, the policy yields immediate decisions, making it suitable for real-time applications. Our work offers practical insights for operators and enriches the integration of reinforcement learning into dynamic rebalancing problems, paving the way for more intelligent and robust urban mobility solutions.
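
To make the idea of a first-arrive-first-serve simulator concrete, the following is a minimal sketch, not the authors' simulator. It processes trip events in time order and counts lost demand. The names `Station` and `simulate_lost_demand`, the event format, and the choice to count both failed rentals (empty station) and failed returns (full station) as lost demand are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Station:
    capacity: int  # number of docks (hypothetical field)
    bikes: int     # bikes currently available (hypothetical field)


def simulate_lost_demand(stations, events):
    """Serve events in order of arrival time and count unserved requests.

    events: iterable of (time, station_id, kind), kind is 'rent' or 'return'.
    Assumption: a rental at an empty station or a return at a full station
    each count as one unit of lost demand.
    """
    lost = 0
    # sorted() is stable, so events with equal times keep their input order.
    for _, station_id, kind in sorted(events, key=lambda e: e[0]):
        station = stations[station_id]
        if kind == "rent":
            if station.bikes > 0:
                station.bikes -= 1
            else:
                lost += 1
        elif kind == "return":
            if station.bikes < station.capacity:
                station.bikes += 1
            else:
                lost += 1
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return lost


stations = {0: Station(capacity=2, bikes=1)}
events = [
    (0.5, 0, "rent"),    # served, station becomes empty
    (1.0, 0, "rent"),    # lost: no bike available
    (2.0, 0, "return"),  # served
    (2.5, 0, "return"),  # served, station becomes full
    (3.0, 0, "return"),  # lost: no free dock
]
print(simulate_lost_demand(stations, events))  # prints 2
```

In a learning setup, the negative of such a count over the interval between two vehicle decisions could serve as an immediate reward, which is one way to read the abstract's description.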

Accelerating PDE Data Generation via Differential Operator Action in Solution Space

Feb 04, 2024
Huanshuo Dong, Hong Wang, Haoyang Liu, Jian Luo, Jie Wang

Recent advancements in data-driven approaches, such as Neural Operator (NO), have demonstrated their effectiveness in reducing the solving time of Partial Differential Equations (PDEs). However, one major challenge faced by these approaches is the requirement for a large amount of high-precision training data, which incurs significant computational cost during generation. To address this challenge, we propose a novel PDE dataset generation algorithm, namely Differential Operator Action in Solution space (DiffOAS), which speeds up the data generation process and enhances the precision of the generated data simultaneously. Specifically, DiffOAS obtains a few basic PDE solutions and then combines them to obtain new solutions. It applies differential operators to these solutions, a process we call 'operator action', to efficiently generate precise PDE data points. Theoretical analysis shows that the time complexity of the DiffOAS method is one order lower than that of the existing generation method. Experimental results show that DiffOAS accelerates the generation of large-scale datasets with 10,000 instances by 300 times. Even with just 5% of the generation time, NO trained on the data generated by DiffOAS exhibits comparable performance to that using the existing generation method, which highlights the efficiency of DiffOAS.
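
The 'operator action' idea can be illustrated with a small sketch: instead of solving the PDE for each right-hand side, start from known solutions, combine them, and apply the differential operator to obtain the matching right-hand side. The example below is an illustration under stated assumptions, not the DiffOAS implementation: it uses the 1D Poisson problem -u'' = f with zero boundary values, analytic sine functions as stand-ins for the "basic solutions", and a second-order finite-difference operator. All function names are hypothetical.

```python
import math
import random


def basis_solution(k, n):
    """Samples of sin(k*pi*x) at n interior grid points of (0, 1)."""
    h = 1.0 / (n + 1)
    return [math.sin(k * math.pi * (i + 1) * h) for i in range(n)]


def apply_neg_laplacian(u):
    """Apply the finite-difference operator for -u'' with zero boundary values."""
    n = len(u)
    h = 1.0 / (n + 1)
    f = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        f.append((2.0 * u[i] - left - right) / h**2)
    return f


def generate_pairs(num_pairs, n=63, num_basis=4, seed=0):
    """Return (f, u) training pairs without solving any linear system."""
    rng = random.Random(seed)
    basis = [basis_solution(k, n) for k in range(1, num_basis + 1)]
    pairs = []
    for _ in range(num_pairs):
        coeffs = [rng.gauss(0.0, 1.0) for _ in basis]
        # Combine basic solutions into a new solution ...
        u = [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n)]
        # ... then let the operator act on it to get the right-hand side.
        f = apply_neg_laplacian(u)
        pairs.append((f, u))
    return pairs


pairs = generate_pairs(num_pairs=3)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))  # prints 3 63 63
```

Each pair costs one operator application, which is linear in the grid size here, whereas the conventional route would solve a linear system per sample.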

Obstacle Avoidance Deep Reinforcement Learning-Based Trajectory Planner with Robust Low-Level Control for Robotic Manipulators

Feb 06, 2024
Mehdi Heydari Shahna, Seyed Adel Alizadeh Kolagar, Jouni Mattila

In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability, which may pose challenges in ensuring stability and safety. To address these issues, we propose integrating an obstacle-free deep reinforcement learning (DRL) trajectory planner with a novel auto-tuning low- and joint-level control strategy, all while actively engaging in the learning phase through interactions with the environment. This approach circumvents the complexities associated with computations while also addressing nonrepetitive and random obstacle avoidance tasks. First, a model-free DRL agent is employed to plan velocity-bounded and obstacle-free motion for a manipulator with 'n' degrees of freedom (DoF) in task space through joint-level reasoning. This plan is then input into a robust subsystem-based adaptive controller, which produces the necessary torques, while the Cuckoo Search Optimization (CSO) algorithm tunes the control gains to minimize the time required to reach the target, the time taken to stabilize, the maximum deviation from the desired value, and the persistent tracking error in the steady state. This approach guarantees that position and velocity errors exponentially converge to zero in an unfamiliar environment, despite unknown robotic manipulator modeling. The theoretical assertions are validated through simulation results.

* This work has been submitted for possible publication in the IEEE

Collaborative Deep Reinforcement Learning for Resource Optimization in Non-Terrestrial Networks

Feb 06, 2024
Yang Cao, Shao-Yu Lien, Ying-Chang Liang, Dusit Niyato, Xuemin Shen

Non-terrestrial networks (NTNs) with low-earth orbit (LEO) satellites have been regarded as promising remedies to support global ubiquitous wireless services. Due to the rapid mobility of LEO satellites, inter-beam/satellite handovers happen frequently for a specific user equipment (UE). To tackle this issue, earth-fixed cell scenarios have been studied, in which the LEO satellite adjusts its beam direction towards a fixed area within its dwell duration to maintain stable transmission performance for the UE. This requires the LEO satellite to perform real-time resource allocation, which, however, is unaffordable for a LEO satellite with limited computing capability. To address this issue, we propose a two-time-scale collaborative deep reinforcement learning (DRL) scheme for beam management and resource allocation in NTNs, in which the LEO satellite and the UE, operating with different control cycles, update their decision-making policies in a sequential manner. Specifically, the UE updates its policy subject to improving the value functions of both agents. Furthermore, the LEO satellite only makes decisions through finite-step rollouts with a reference decision trajectory received from the UE. Simulation results show that the proposed scheme balances throughput performance and computational complexity more effectively than traditional greedy-searching schemes.

Combination of frequency- and time-domain characteristics of the fibrillatory waves for enhanced prediction of persistent atrial fibrillation recurrence after catheter ablation

Feb 04, 2024
Pilar Escribano, Juan Rodenas, Manuel Garcia, Miguel A. Arias, Victor M. Hidalgo, Sofia Calero, Jose J. Rieta, Raul Alcaraz

Catheter ablation (CA) remains the cornerstone alternative to cardioversion for sinus rhythm (SR) restoration in patients with atrial fibrillation (AF). Unfortunately, despite the latest methodological and technological advances, this procedure is not consistently effective in treating persistent AF.

* Heliyon 2024(3)

Efficient Invariant Kalman Filter for Inertial-based Odometry with Large-sample Environmental Measurements

Feb 07, 2024
Xinghan Li, Haoying Li, Guangyang Zeng, Qingcheng Zeng, Xiaoqiang Ren, Chao Yang, Junfeng Wu

A filter for inertial-based odometry is a recursive method used to estimate the pose from measurements of ego-motion and relative pose. Currently, there is no known filter that guarantees the computation of a globally optimal solution for the non-linear measurement model. In this paper, we demonstrate that an innovative filter, with the state being $SE_2(3)$ and the $\sqrt{n}$-consistent pose as the initialization, efficiently achieves asymptotic optimality in terms of minimum mean square error. This approach is tailored for real-time SLAM and inertial-based odometry applications. Our first contribution is an iterative filtering method based on the Gauss-Newton method on Lie groups, which numerically solves for the state estimate from the a priori and non-linear measurements. The filter stands out due to its iterative mechanism and adaptive initialization. Second, when dealing with environmental measurements of the surroundings, we utilize a $\sqrt{n}$-consistent pose as the initial value for the update step in a single iteration. The solution is closed in form and has computational complexity $O(n)$. Third, we theoretically show that the approach can achieve asymptotic optimality in the sense of minimum mean square error from the a priori and virtual relative pose measurements. Finally, to validate our method, we carry out extensive numerical and experimental evaluations. Our results consistently demonstrate that our approach outperforms other state-of-the-art filter-based methods, including the iterated extended Kalman filter and the invariant extended Kalman filter, in terms of accuracy and running time.

CREAD: A Classification-Restoration Framework with Error Adaptive Discretization for Watch Time Prediction in Video Recommender Systems

Jan 15, 2024
Jie Sun, Zhaoying Ding, Xiaoshuang Chen, Qi Chen, Yincheng Wang, Kaiqiao Zhan, Ben Wang

Watch time is a significant indicator of user satisfaction in video recommender systems. However, the prediction of watch time as a target variable is often hindered by its highly imbalanced distribution, with a scarcity of observations for larger target values and over-populated samples for small values. State-of-the-art watch time prediction models discretize the continuous watch time into a set of buckets in order to account for the distribution of watch time. However, how these discrete buckets should be created from the continuous watch time distribution remains largely uninvestigated, and existing discretization approaches suffer from either a large learning error or a large restoration error. To address this challenge, we propose a Classification-Restoration framework with Error-Adaptive-Discretization (CREAD) to accurately predict the watch time. The proposed framework contains a discretization module, a classification module, and a restoration module. It predicts the watch time through multiple classification problems. The discretization process is a key contribution of the CREAD framework. We theoretically analyze the impacts of the discretization on the learning error and the restoration error, and then propose the error-adaptive discretization (EAD) technique to better balance the two errors, which achieves better performance than traditional discretization approaches. We conduct detailed offline evaluations on a public dataset and an industrial dataset, both showing performance gains through the proposed approach. Moreover, we have fully launched our framework on the Kwai App, an online video platform, which resulted in a significant 0.29% increase in users' video watch time in A/B testing. These results highlight the effectiveness of the CREAD framework in watch time prediction in video recommender systems.
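
A small sketch can show what "discretize, classify, restore" may look like and where the restoration error comes from. It is an illustration under assumptions, not the CREAD implementation: it assumes each bucket threshold defines a binary task "is the watch time greater than this threshold?" and that restoration sums the predicted probabilities weighted by bucket widths, which approximates the identity E[y] = ∫ P(y > t) dt for non-negative y. The function names and thresholds are hypothetical.

```python
def make_labels(watch_time, thresholds):
    """One binary label per threshold: 1 if the watch time exceeds it."""
    return [1 if watch_time > t else 0 for t in thresholds]


def restore(probs, thresholds):
    """Turn per-threshold probabilities P(y > t_m) back into a watch time.

    Assumption: thresholds are increasing and the first bucket starts at 0.
    """
    estimate = 0.0
    previous = 0.0
    for p, t in zip(probs, thresholds):
        estimate += p * (t - previous)
        previous = t
    return estimate


thresholds = [5, 10, 20, 40]  # hypothetical bucket edges, in seconds
labels = make_labels(12, thresholds)
print(labels)                       # prints [1, 1, 0, 0]
print(restore(labels, thresholds))  # prints 10.0
```

Even with perfect classifiers, a true watch time of 12 is restored as 10.0 here, which is a restoration error caused purely by the bucket layout. Finer buckets shrink that error but leave fewer positive samples per classifier, which is the trade-off with learning error that the abstract describes.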

* 13 pages, 9 figures

An approach to automated videogame beta testing

Feb 07, 2024
Jennifer Hernández-Bécares, Luis Costero, Pedro Pablo Gómez-Martín

Videogames developed in the 1970s and 1980s were modest programs created in a couple of months by a single person, who played the roles of designer, artist and programmer. Since then, videogames have evolved to become a multi-million dollar industry. Today, AAA game development involves hundreds of people working together over several years. Management and engineering requirements have changed at the same pace. Although many of the processes have been adapted over time, this is not quite true for quality assurance tasks, which are still done mainly manually by human beta testers due to the specific peculiarities of videogames. This paper presents an approach to automate this beta testing.

* Entertainment Computing, Elsevier. 18. pp 79 to 92. (2017)

A Video-Aware FEC-Based Unequal Loss Protection System for Video Streaming over RTP

Feb 07, 2024
César Díaz, Julián Cabrera, Fernando Jaureguizar, Narciso García

A video-aware unequal loss protection (ULP) system for protecting RTP video streaming in bursty packet loss networks is proposed. Considering the relevance of the frame, the state of the channel, and the bitrate constraints of the protection bitstream, our algorithm selects in real time the most suitable frames to be protected through forward error correction (FEC) techniques. It benefits from a careful RTP encapsulation that allows working at the frame level without requiring any processing beyond parsing RTP headers. This makes our system straightforward and fast, and well suited for inclusion in commercial video streaming servers. Simulation results show that our technique outperforms other proposed ULP schemes.
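
The frame selection step can be pictured with a simple greedy sketch. It is not the paper's algorithm, whose actual criterion is not given in the abstract. The sketch assumes that relevance depends only on frame type (I over P over B), that protecting a frame costs a fixed fraction of its size in FEC bytes, and it ignores the channel state that the real system also considers. `Frame`, `RELEVANCE`, and `select_frames` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int  # position in the stream
    kind: str   # 'I', 'P' or 'B'
    size: int   # payload size in bytes


# Hypothetical relevance weights: losing an I frame hurts the most.
RELEVANCE = {"I": 3, "P": 2, "B": 1}


def select_frames(frames, budget_bytes, overhead=0.5):
    """Greedily pick frames to protect within the FEC byte budget.

    overhead: assumed FEC bytes needed per protected payload byte.
    Returns the indices of the protected frames in stream order.
    """
    chosen = []
    remaining = budget_bytes
    # Most relevant first; earlier frames win ties.
    for frame in sorted(frames, key=lambda f: (-RELEVANCE[f.kind], f.index)):
        cost = frame.size * overhead
        if cost <= remaining:
            chosen.append(frame.index)
            remaining -= cost
    return sorted(chosen)


frames = [
    Frame(0, "I", 1000),
    Frame(1, "B", 200),
    Frame(2, "P", 400),
    Frame(3, "B", 200),
]
print(select_frames(frames, budget_bytes=750))  # prints [0, 2]
```

Because the decision needs only the frame type and size, it can in principle be made from header information alone, which matches the abstract's emphasis on avoiding deeper bitstream parsing.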

* IEEE Transactions on Consumer Electronics, vol. 57, no. 2, pp. 523-531, May 2011

Integrated Sensing and Communication Driven Digital Twin for Intelligent Machine Network

Feb 08, 2024
Zhiqing Wei, Yucong Du, Qixun Zhang, Wangjun Jiang, Yanpeng Cui, Zeyang Meng, Huici Wu, Zhiyong Feng

Intelligent machines (IMs), including industrial machines, unmanned aerial vehicles (UAVs), and unmanned vehicles, can cooperate effectively in complex environments when they form an IM network. Efficient environment sensing and communication are crucial for the IM network, enabling real-time and stable control of IMs. With the emergence of integrated sensing and communication (ISAC) technology, the IM network is empowered with ubiquitous sensing capabilities, which helps improve the efficiency of both communication and sensing through their mutual benefit. However, the massive amount of sensing information brings challenges for its processing, storage, and application. In this article, an ISAC-driven digital twin (DT) is proposed for the IM network, and its architecture and enabling technologies are presented. The ISAC-driven DT stores the sensing information in a structured way, and this information is further applied to optimize the communication, networking, and control schemes of IMs, promoting the widespread application of IMs.

* 9 pages, 5 figures, 1 Table
