This work reports on the development of a deep inverse reinforcement learning method for legged-robot terrain traversability modeling that incorporates both exteroceptive and proprioceptive sensory data. Existing works use robot-agnostic exteroceptive environmental features or handcrafted kinematic features; instead, we propose to also learn robot-specific inertial features from proprioceptive sensory data for reward approximation in a single deep neural network. Incorporating the inertial features can improve the model fidelity and provide a reward that depends on the robot's state during deployment. We train the reward network using the Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) algorithm and propose simultaneously minimizing a trajectory ranking loss to deal with the suboptimality of legged robot demonstrations. The demonstrated trajectories are ranked by locomotion energy consumption in order to learn an energy-aware reward function and a policy that is more energy-efficient than the demonstrations. We evaluate our method using a dataset collected by an MIT Mini-Cheetah robot and a Mini-Cheetah simulator. The code is publicly available at https://github.com/ganlumomo/minicheetah-traversability-irl.
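As a rough illustration of the energy-based trajectory ranking idea in the abstract above, the sketch below computes a pairwise logistic ranking loss: whenever one trajectory consumed less energy than another, the loss is small if its predicted return is higher. The logistic (softplus) form, the function name `pairwise_ranking_loss`, and the use of a scalar return per trajectory are assumptions made for illustration; the abstract does not specify the exact loss.

```python
import numpy as np

def pairwise_ranking_loss(returns, energies):
    """Mean logistic ranking loss over trajectory pairs (illustrative assumption).

    returns[i]  : predicted cumulative reward of trajectory i
    energies[i] : measured locomotion energy of trajectory i (lower is better)
    """
    returns = np.asarray(returns, dtype=float)
    energies = np.asarray(energies, dtype=float)
    losses = []
    for i in range(len(returns)):
        for j in range(len(returns)):
            if energies[i] < energies[j]:  # trajectory i is preferred over j
                # softplus(R_j - R_i) equals -log(sigmoid(R_i - R_j))
                losses.append(np.logaddexp(0.0, returns[j] - returns[i]))
    return float(np.mean(losses)) if losses else 0.0

# Returns that agree with the energy ranking give a smaller loss than
# returns that contradict it.
agree = pairwise_ranking_loss([3.0, 1.0], [1.0, 2.0])
disagree = pairwise_ranking_loss([1.0, 3.0], [1.0, 2.0])
```

In a training loop, such a term would presumably be added to the MEDIRL objective with a weighting factor; that weighting is not given in the abstract.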
The reconfigurable intelligent surface (RIS) has recently gained popularity as a promising solution for improving the signal transmission quality of wireless communications with lower hardware cost and energy consumption. This letter proposes a novel deep reinforcement learning (DRL) algorithm based on a location-aware imitation environment for the joint beamforming design in an RIS-aided mmWave multiple-input multiple-output system. Specifically, we design a neural network to imitate the transmission environment based on the geometric relationship between the user's location and the mmWave channel. Following this, a novel DRL-based method is developed that interacts with the imitation environment using the easily available location information. Finally, simulation results demonstrate that the proposed DRL-based algorithm provides more robust performance without excessive interaction overhead compared to the existing DRL-based approaches.
Road network graphs provide critical information for autonomous vehicle applications, such as motion planning on drivable areas. However, manually annotating road network graphs is inefficient and labor-intensive. Automatically detecting road network graphs could alleviate this issue, but existing works are either segmentation-based approaches that cannot ensure satisfactory topology correctness, or graph-based approaches that cannot produce sufficiently precise detection results. To address these problems, we propose in this paper a novel approach based on transformer and imitation learning named RNGDet (\underline{R}oad \underline{N}etwork \underline{G}raph \underline{Det}ection by Transformer). Since high-resolution aerial images are now easily accessible all over the world, we make use of aerial images in our approach. Taking an aerial image as input, our approach iteratively generates road network graphs vertex-by-vertex. Our approach can handle complicated intersection points with various numbers of road segments. We evaluate our approach on a publicly available dataset. The superiority of our approach is demonstrated through comparative experiments.
Reconfigurable intelligent surface (RIS) has become a promising technology to improve wireless communication in recent years. It steers the incident signals to create a favorable propagation environment by controlling the reconfigurable passive elements with lower hardware cost and lower power consumption. In this paper, we consider an RIS-aided multiuser multiple-input single-output downlink communication system. We aim to maximize the weighted sum-rate of all users by jointly optimizing the active beamforming at the access point and the passive beamforming vector of the RIS elements. Unlike most existing works, we consider the more practical situation with discrete phase shifts and imperfect channel state information (CSI). Specifically, for the case of discrete phase shifts and perfect CSI, we first develop a deep quantization neural network (DQNN) to design the active and passive beamforming simultaneously, whereas most reported works design them alternately. Then, we propose an improved structure (I-DQNN) based on DQNN to simplify the parameter decision process when each RIS element has more than 1 control bit. Finally, we extend the two proposed DQNN-based algorithms to the case where discrete phase shifts and imperfect CSI are considered simultaneously. Our simulation results show that the two DQNN-based algorithms have better performance than traditional algorithms in the perfect CSI case, and are also more robust in the imperfect CSI case.
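To make the notion of discrete phase shifts with a given number of control bits concrete, the following minimal sketch maps continuous phases to the nearest of 2^b uniformly spaced levels in [0, 2π). The uniform spacing and the helper name `quantize_phase` are assumptions for illustration; this is plain nearest-level rounding, not the DQNN described in the abstract above.

```python
import numpy as np

def quantize_phase(theta, bits):
    """Round phases (radians) to the nearest of 2**bits uniform levels.

    Returns the integer level index per element and the quantized phase.
    Uniformly spaced levels are an assumption, not taken from the abstract.
    """
    levels = 2 ** bits
    step = 2 * np.pi / levels
    idx = np.round(np.asarray(theta, dtype=float) / step).astype(int) % levels
    return idx, idx * step

# Three continuous phases quantized with 1 control bit (levels 0 and pi).
idx_1bit, phase_1bit = quantize_phase([0.1, 3.0, 6.2], bits=1)

# Unit-modulus reflection coefficients built from the quantized phases.
coeffs = np.exp(1j * phase_1bit)
```

With more control bits the number of candidate levels per element grows as 2^b, which is the regime the I-DQNN structure is said to target.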
High-Definition (HD) maps can provide precise geometric and semantic information about static traffic environments for autonomous driving. Road boundaries are among the most important pieces of information contained in HD maps, since they distinguish road areas from off-road areas and can thus guide vehicles to drive within road areas. However, annotating road boundaries for HD maps at the city scale is labor-intensive. To enable automatic HD map annotation, current work uses semantic segmentation or iterative graph growing for road-boundary detection. However, the former cannot ensure topological correctness since it works at the pixel level, while the latter suffers from inefficiency and drifting issues. To address these problems, in this letter we propose a novel system termed csBoundary to automatically detect road boundaries at the city scale for HD map annotation. Our network takes an aerial image patch as input and directly infers the continuous road-boundary graph (i.e., vertices and edges) from this image. To generate the city-scale road-boundary graph, we stitch together the graphs obtained from all the image patches. Our csBoundary is evaluated and compared on a public benchmark dataset. The results demonstrate the superiority of our approach. The accompanying demonstration video is available at our project page \url{https://sites.google.com/view/csboundary/}.
To mitigate the effects of shadow fading and obstacle blocking, reconfigurable intelligent surface (RIS) has become a promising technology to improve the signal transmission quality of wireless communications by controlling the reconfigurable passive elements with lower hardware cost and lower power consumption. However, accurate, low-latency and low-pilot-overhead channel state information (CSI) acquisition remains a considerable challenge in RIS-assisted systems due to the large number of RIS passive elements. In this paper, we propose a three-stage joint channel decomposition and prediction framework to acquire CSI. The proposed framework exploits the two-timescale property that the base station (BS)-RIS channel is quasi-static and the RIS-user equipment (UE) channel is fast time-varying. Specifically, in the first stage, we use the full-duplex technique to estimate the channel between a specific BS antenna and the RIS, addressing the critical scaling ambiguity problem in the channel decomposition. We then design a novel deep neural network, namely, the sparse-connected long short-term memory (SCLSTM), and propose an SCLSTM-based algorithm for the second and third stages. The algorithm can simultaneously decompose the BS-RIS channel and RIS-UE channel from the cascaded channel and capture the temporal relationship of the RIS-UE channel for prediction. Simulation results show that our proposed framework has lower pilot overhead than the traditional channel estimation algorithms, and that the proposed SCLSTM-based algorithm can also achieve more accurate CSI acquisition robustly and effectively.
Intelligent reflecting surface (IRS) is a promising technology that enables the precise control of the electromagnetic environment in future wireless communication networks. To leverage the IRS effectively, the acquisition of channel state information (CSI) is crucial in IRS-assisted communication systems, which, however, is challenging. In this paper, we propose an optimal pilot power allocation strategy for the channel estimation of IRS-assisted communication systems, which is capable of further improving the achievable rate performance with imperfect CSI. Specifically, we first introduce a multi-IRS-assisted communication system subject to practical channel estimation errors, and derive the ergodic capacity with imperfect CSI in an explicit closed-form expression for the single-input single-output (SISO) case. Secondly, we formulate the optimization problem of maximizing the ergodic capacity with imperfect CSI, subject to the constraint of the average uplink pilot power. Thirdly, the method of Lagrange multipliers is invoked to solve the ergodic rate maximization problem and thus to obtain the optimal pilot power allocation strategy. The resultant pilot power allocation solution suggests allocating more power to the pilots used for estimating the weak reflection channels. In addition, we quantify the cost of the proposed pilot power allocation strategy by analyzing the resulting increase in the peak-to-average-power ratio (PAPR). Finally, extensive simulation results verify our analysis and reveal some interesting results. For example, for a user in the vicinity of a large IRS, it is suggested to switch off the other IRSs and only switch on the IRS nearest the user; for a user near a small IRS, it is better to switch on all IRSs and perform the optimal pilot power allocation to enhance the achievable rate performance.
Reconfigurable intelligent surface (RIS) is a promising technology for establishing spectral- and energy-efficient wireless networks. In this paper, we study RIS-enhanced orthogonal frequency division multiplexing (OFDM) communications, which generalize the existing RIS-driven context focusing only on frequency-flat channels. Firstly, we introduce the delay adjustable metasurface (DAM) relying on varactor diodes. In contrast to existing reflecting elements, each element in the DAM is capable of storing and retrieving the impinging electromagnetic waves by dynamically controlling its electromagnetically induced transparency (EIT) properties, thus imposing an extra delay on the reflected signals. Secondly, we formulate the rate-maximization problem by jointly optimizing the transmit power allocation and the RIS reflection coefficients as well as the RIS delays. Furthermore, to address the coupling among optimization variables, we propose an efficient algorithm to achieve a high-quality solution for the formulated non-convex design problem by alternately optimizing the transmit power allocation and the RIS reflection pattern, including the reflection coefficients and the delays. Thirdly, to circumvent the high complexity of optimizing the RIS reflection coefficients, we conceive a low-complexity scheme that aligns the strongest taps of all reflected channels, while ensuring that the maximum delay spread after introducing extra RIS delays does not exceed the length of the cyclic prefix (CP). Finally, simulation results demonstrate that the proposed design significantly improves the OFDM rate performance as well as the RIS's adaptability to wideband signals compared to baseline schemes without employing DAM.
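The strongest-tap alignment idea in the abstract above can be sketched as follows, under several assumptions that are not from the source: each reflected channel is given as a sampled impulse response, the extra RIS delays are whole numbers of samples, and the delay spread is measured as the index of the last non-zero tap. The function name `align_strongest_taps` is hypothetical.

```python
import numpy as np

def align_strongest_taps(h, cp_len):
    """Choose per-element extra delays so all strongest taps coincide.

    h      : (N, L) array, row n is the sampled impulse response of the
             channel reflected by element n (illustrative representation)
    cp_len : cyclic prefix length in samples
    Returns (extra_delays, delay_spread, fits_in_cp).
    """
    h = np.asarray(h)
    strongest = np.argmax(np.abs(h), axis=1)   # strongest tap index per row
    extra = strongest.max() - strongest        # non-negative sample delays
    n_taps = h.shape[1]
    # Index of the last non-zero tap in each row, found by scanning backwards.
    last = n_taps - 1 - np.argmax(np.abs(h[:, ::-1]) > 0, axis=1)
    spread = int((last + extra).max())         # latest tap after delaying
    return extra, spread, spread <= cp_len

h_demo = np.array([[0.0, 1.0, 0.2, 0.0],
                   [1.0, 0.5, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
extra_demo, spread_demo, ok_demo = align_strongest_taps(h_demo, cp_len=4)
```

If the check fails, a practical scheme would need to reduce some delays; how the authors handle that case is not stated in the abstract.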
This paper reports on a dynamic semantic mapping framework that incorporates 3D scene flow measurements into a closed-form Bayesian inference model. The existence of dynamic objects in the environment causes artifacts and traces in current mapping algorithms, leading to an inconsistent map posterior. We leverage state-of-the-art semantic segmentation and 3D flow estimation using deep learning to provide measurements for map inference. We develop a continuous (i.e., can be queried at arbitrary resolution) Bayesian model that propagates the scene with flow and infers a 3D semantic occupancy map with better performance than its static counterpart. Experimental results using publicly available data sets show that the proposed framework generalizes its predecessors and consistently improves over direct measurements from deep neural networks.
Low-complexity improved-throughput generalised spatial modulation (LCIT-GSM) is proposed. More explicitly, in GSM, extra information bits are conveyed implicitly by activating a fixed number $N_{a}$ out of $N_{t}$ transmit antennas (TAs) at a time. As a result, GSM has the advantage of a reduced number of radio-frequency (RF) chains and reduced inter-antenna interference (IAI) at the cost of a lower throughput than its multiplexing-oriented full-RF based counterparts. Variable-${N_a}$ GSM mitigates this throughput reduction by incorporating all possible TA activation patterns associated with a variable value $N_{a}$ ranging from $1$ to $N_{t}$ during a single channel-use, which maximises the throughput of GSM but suffers from a high complexity of both the mapping book design and the demodulation. In order to mitigate the complexity, \emph{first of all}, we propose two efficient schemes for mapping the information bits to the TA activation patterns, which can be readily scaled to massive MIMO setups. \emph{Secondly}, in the absence of IAI, we derive a pair of low-complexity near-optimal detectors, one of which has a reduced search scope, while the other benefits from a decoupled single-stream based signal detection algorithm. \emph{Finally}, the performance of the proposed LCIT-GSM system is characterised by the error probability upper bound (UB). Our Monte Carlo based simulation results confirm the improved error performance of our proposed scheme, despite its reduced signal detection complexity.
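A short worked count may help show why variable-$N_a$ activation raises the number of implicitly conveyed bits. The sketch below counts only the bits carried by the TA activation pattern and assumes, as is common for spatial modulation, that only a power-of-two subset of the available patterns is used; the function names are hypothetical and the classic modulated symbol bits are left out.

```python
from math import comb

def fixed_na_pattern_bits(n_t, n_a):
    """Bits per channel use carried by choosing n_a active TAs out of n_t."""
    patterns = comb(n_t, n_a)
    return patterns.bit_length() - 1   # floor(log2(patterns))

def variable_na_pattern_bits(n_t):
    """Bits carried when any non-empty subset of the n_t TAs may be active."""
    patterns = sum(comb(n_t, k) for k in range(1, n_t + 1))  # 2**n_t - 1
    return patterns.bit_length() - 1   # floor(log2(patterns))

# N_t = 8: fixed N_a = 2 offers 28 patterns, variable N_a offers 255.
bits_fixed = fixed_na_pattern_bits(8, 2)      # 4 bits
bits_variable = variable_na_pattern_bits(8)   # 7 bits
```

The larger pattern set is also what drives the mapping-book and demodulation complexity that the proposed schemes aim to reduce.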