Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes them difficult to understand and to deploy with limited computational resources. Employing compact symbolic expressions as symbolic policies is a promising strategy for obtaining simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which are inefficient and limit the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method named Efficient Symbolic Policy Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end way. We introduce a symbolic network as the search space and employ a path selector to find the compact symbolic policy. By doing so, we represent the policy with a differentiable symbolic expression and train it in an off-policy manner, which further improves the efficiency. In addition, in contrast with previous symbolic policies, which only work in single-task RL because of complexity, we extend ESPL to meta-RL to generate symbolic policies for unseen tasks. Experimentally, we show that our approach generates symbolic policies with higher performance and greatly improves data efficiency for single-task RL. In meta-RL, we demonstrate that, compared with neural network policies, the proposed symbolic policy achieves higher performance and efficiency and shows the potential to be interpretable.
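To make the idea of a symbolic network with a path selector more concrete, the following is a minimal forward-pass sketch, not the ESPL implementation. The operator set, the single-layer layout, the names `symbolic_layer` and `symbolic_policy`, and the fixed binary masks are all assumptions made for illustration; the abstract does not specify them, and the training of the selector is omitted.

```python
import numpy as np

# Hypothetical operator library; the actual operator set is not given in the abstract.
UNARY_OPS = {
    "identity": lambda z: z,
    "sin": np.sin,
    "cos": np.cos,
    "square": lambda z: z ** 2,
}

def symbolic_layer(x, weights, mask):
    """Apply each operator to its own masked linear combination of the inputs.

    x:       (n_inputs,) state vector
    weights: (n_ops, n_inputs) real-valued weights
    mask:    (n_ops, n_inputs) binary path selector; a 0 prunes that path
    """
    z = (weights * mask) @ x  # pruned paths contribute nothing
    return np.array([op(z_i) for op, z_i in zip(UNARY_OPS.values(), z)])

def symbolic_policy(x, weights, mask, out_weights, out_mask):
    """Action = masked linear readout of the operator outputs."""
    hidden = symbolic_layer(x, weights, mask)
    return float((out_weights * out_mask) @ hidden)

# Assumed example: keep only the path x[0] -> sin -> output.
x = np.array([0.5, -1.0])
weights = np.full((4, 2), 3.0)
mask = np.zeros((4, 2))
mask[1, 0] = 1.0                      # row 1 is the "sin" operator
out_weights = np.full(4, 2.0)
out_mask = np.array([0.0, 1.0, 0.0, 0.0])

# With these masks the policy reduces to the expression 2 * sin(3 * x[0]).
action = symbolic_policy(x, weights, mask, out_weights, out_mask)
```

Because every pruned path drops out of the computation, the surviving paths can be read off directly as a short closed-form expression, which is the sense in which such a policy is compact.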
The automatic extraction of biomedical entities and their interactions from unstructured data remains a challenging task due to the limited availability of expert-labeled standard datasets. In this paper, we introduce PETAILOR, a retrieval-based language framework that is augmented by a tailored chunk scorer. Unlike previous retrieval-augmented language models (LMs) that retrieve relevant documents by calculating the similarity between the input sentence and the candidate document set, PETAILOR segments the sentence into chunks and retrieves the relevant chunks from our pre-computed chunk-based relational key-value memory. Moreover, in order to comprehend the specific requirements of the LM, PETAILOR adapts the tailored chunk scorer to the LM. We also introduce GM-CIHT, an expert-annotated biomedical triple extraction dataset with more relation types. This dataset is centered on non-drug treatment and the general biomedical domain. Additionally, we investigate the efficacy of triple extraction models trained on general domains when applied to the biomedical domain. Our experiments reveal that PETAILOR achieves state-of-the-art performance on GM-CIHT.
Reinforcement learning (RL) has emerged as a powerful approach for tackling complex medical decision-making problems such as treatment planning, personalized medicine, and optimizing the scheduling of surgeries and appointments. It has gained significant attention in the field of Natural Language Processing (NLP) due to its ability to learn optimal strategies for tasks such as dialogue systems, machine translation, and question-answering. This paper presents a review of RL techniques in NLP, highlighting key advancements, challenges, and applications in healthcare. The review begins by visualizing a roadmap of machine learning and its applications in healthcare. It then explores the integration of RL with NLP tasks. We examine dialogue systems where RL enables the learning of conversational strategies, RL-based machine translation models, question-answering systems, text summarization, and information extraction. Additionally, ethical considerations and biases in RL-NLP systems are addressed.
In this paper, we explore the application of large language models (LLMs) for generating code-tracing questions in introductory programming courses. We designed targeted prompts for GPT4, guiding it to generate code-tracing questions based on code snippets and descriptions. We established a set of human evaluation metrics to assess the quality of questions produced by the model compared to those created by human experts. Our analysis provides insights into the capabilities and potential of LLMs in generating diverse code-tracing questions. Additionally, we present a unique dataset of human and LLM-generated tracing questions, serving as a valuable resource for both the education and NLP research communities. This work contributes to the ongoing dialogue on the potential uses of LLMs in educational settings.
Extremely large-scale multiple-input multiple-output (XL-MIMO) is a promising technology for the sixth-generation (6G) mobile communication networks. By boosting the antenna number or size to at least an order of magnitude beyond current massive MIMO systems, XL-MIMO is expected to greatly enhance the spectral efficiency and spatial resolution of wireless communication. The evolution from massive MIMO to XL-MIMO is not simply an increase in the array size, but brings new design challenges in terms of near-field channel modelling, performance analysis, channel estimation, and practical implementation. In this article, we give a comprehensive tutorial overview of near-field XL-MIMO communications, aiming to provide useful guidance for tackling the above challenges. First, the basic near-field modelling for XL-MIMO is established by considering the new characteristics of non-uniform spherical wave (NUSW) and spatial non-stationarity. Next, based on the near-field modelling, the performance analysis of XL-MIMO is presented, including the near-field signal-to-noise ratio (SNR) scaling laws, beam focusing pattern, achievable rate, and degrees-of-freedom (DoF). Furthermore, various XL-MIMO design issues such as near-field beam codebook, beam training, channel estimation, and delay alignment modulation (DAM) transmission are elaborated. Finally, we point out promising directions to inspire future research on near-field XL-MIMO communications.
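As a rough illustration of why near-field modelling matters for very large arrays, the sketch below compares an exact-distance (spherical wave) array response, with per-element amplitude variation, against the usual planar-wave response for a uniform linear array. It uses the classical Rayleigh distance 2D^2/λ as the far-field boundary. The function names, the array geometry, and the numerical values are assumptions chosen for illustration and are not taken from the article.

```python
import numpy as np

def rayleigh_distance(aperture_m, wavelength_m):
    """Classical far-field boundary 2 * D^2 / lambda."""
    return 2.0 * aperture_m ** 2 / wavelength_m

def element_positions(n_elements, spacing_m):
    """Element coordinates along the y-axis, centred at the origin."""
    return (np.arange(n_elements) - (n_elements - 1) / 2.0) * spacing_m

def near_field_response(n_elements, spacing_m, wavelength_m, r_m, theta_rad):
    """Exact-distance response for a point source at range r_m and angle theta_rad
    (measured from the array boresight), normalised to the array centre."""
    y = element_positions(n_elements, spacing_m)
    src_x, src_y = r_m * np.cos(theta_rad), r_m * np.sin(theta_rad)
    d = np.sqrt(src_x ** 2 + (src_y - y) ** 2)      # per-element distance
    amplitude = r_m / d                              # free-space amplitude variation
    phase = -2.0 * np.pi * (d - r_m) / wavelength_m  # phase relative to the centre
    return amplitude * np.exp(1j * phase)

def far_field_response(n_elements, spacing_m, wavelength_m, theta_rad):
    """Planar-wave response: unit amplitude and linear phase across the array."""
    y = element_positions(n_elements, spacing_m)
    return np.exp(1j * 2.0 * np.pi * y * np.sin(theta_rad) / wavelength_m)

# Assumed example: 128 half-wavelength-spaced elements at a 1 cm wavelength.
wavelength = 0.01
n, spacing = 128, wavelength / 2
aperture = (n - 1) * spacing
z_r = rayleigh_distance(aperture, wavelength)

near = near_field_response(n, spacing, wavelength, r_m=1.0, theta_rad=0.3)
far = far_field_response(n, spacing, wavelength, theta_rad=0.3)
max_mismatch = np.max(np.abs(near - far))
```

For a source well inside the Rayleigh distance, `max_mismatch` is large, so the planar-wave model no longer describes the array response; as the range grows, the two responses converge.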
An intelligent reflecting surface (IRS) can bring significant performance enhancement to wireless communication systems by reconfiguring wireless channels via passive signal reflection. However, such performance improvement generally relies on knowledge of the channel state information (CSI) for IRS-associated links. Prior IRS channel estimation strategies mainly estimate IRS-cascaded channels based on excessive pilot signals received at the users/base station (BS) with time-varying IRS reflections, which are not compatible with the existing channel training/estimation protocol for cellular networks. To address this issue, we propose in this paper a new channel estimation scheme for IRS-assisted communication systems based on the received signal power measured at the user, which is practically attainable without changing the current protocol. Specifically, due to the lack of signal phase information in power measurements, the autocorrelation matrix of the BS-IRS-user cascaded channel is estimated by solving equivalent matrix-rank-minimization problems. Simulation results are provided to verify the effectiveness of the proposed channel estimation algorithm as well as the IRS passive reflection design based on the estimated channel autocorrelation matrix.
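The following sketch illustrates, under a simplified model, why the channel autocorrelation matrix is a natural target when only power measurements are available. It assumes a single cascaded channel vector `h` and a unit-modulus reflection vector `v`, with received power |v^H h|^2; these symbols and the scalar link model are assumptions, not the paper's notation. It only checks the underlying identity numerically and does not implement the rank-minimization estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # hypothetical number of IRS elements

# Assumed cascaded channel vector (unknown to the receiver in practice).
h = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
R = np.outer(h, h.conj())  # autocorrelation matrix h h^H (rank one, Hermitian)

def received_power(h, v):
    """Power measured for reflection vector v: |v^H h|^2."""
    return float(np.abs(np.vdot(v, h)) ** 2)  # vdot conjugates its first argument

def power_from_autocorrelation(R, v):
    """Same power written as the quadratic form v^H R v, which is linear in R."""
    return float(np.real(np.vdot(v, R @ v)))

# Random unit-modulus IRS reflection coefficients.
v = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))
p_direct = received_power(h, v)
p_from_R = power_from_autocorrelation(R, v)

# A global phase rotation of h changes neither the powers nor R.
h_rot = h * np.exp(1j * 0.7)
R_rot = np.outer(h_rot, h_rot.conj())
```

Since each power measurement is linear in `R` and `R` is unaffected by the unobservable global phase of `h`, knowing `R` is enough to predict the received power for any candidate reflection vector, which is what a reflection design needs.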
With the extremely large-scale array (XL-array) deployed in future wireless systems, wireless communication and sensing are expected to operate in the radiative near-field region, which needs to be characterized by spherical rather than planar wavefronts. Unlike most existing works that considered far-field integrated sensing and communication (ISAC), we study in this article the new near-field ISAC, which integrates both functions of sensing and communication in the near-field region. To this end, we first discuss the appealing advantages of near-field communication and sensing over their far-field counterparts, respectively. Then, we introduce three approaches for near-field ISAC, including joint near-field communication and sensing, sensing-assisted near-field communication, and communication-assisted near-field sensing. We discuss their individual research opportunities and new design issues, and propose promising solutions. Finally, several important directions in near-field ISAC are highlighted to motivate future work.
In this paper, we present Hermes, an end-to-end framework to automatically generate formal representations from natural language cellular specifications. We first develop a neural constituency parser, NEUTREX, to process transition-relevant texts and extract transition components (i.e., states, conditions, and actions). We also design a domain-specific language to translate these transition components to logical formulas by leveraging dependency parse trees. Finally, we compile these logical formulas to generate transitions and create the formal model as finite state machines. To demonstrate the effectiveness of Hermes, we evaluate it on 4G NAS, 5G NAS, and 5G RRC specifications and obtain an overall accuracy of 81-87%, which is a substantial improvement over the state-of-the-art. Our security analysis of the extracted models uncovers 3 new vulnerabilities and identifies 19 previous attacks in 4G and 5G specifications, and 7 deviations in commercial 4G basebands.
In this paper, we propose and study a multi-functional reconfigurable intelligent surface (MF-RIS) architecture. In contrast to conventional single-functional RIS (SF-RIS) that only reflects signals, the proposed MF-RIS simultaneously supports multiple functions with one surface, including reflection, refraction, amplification, and energy harvesting of wireless signals. As such, the proposed MF-RIS is capable of significantly enhancing RIS signal coverage by amplifying the signal reflected/refracted by the RIS with the energy harvested. We present the signal model of the proposed MF-RIS, and formulate an optimization problem to maximize the sum-rate of multiple users in an MF-RIS-aided non-orthogonal multiple access network. We jointly optimize the transmit beamforming, power allocations as well as the operating modes and parameters for different elements of the MF-RIS and its deployment location, via an efficient iterative algorithm. Simulation results are provided which show significant performance gains of the MF-RIS over SF-RISs with only some of its functions available. Moreover, we demonstrate that there exists a fundamental trade-off between sum-rate maximization and harvested energy maximization. In contrast to SF-RISs which can be deployed near either the transmitter or receiver, the proposed MF-RIS should be deployed closer to the transmitter for maximizing its communication throughput with more energy harvested.