Recently, the rapid development of metasurfaces has facilitated the growth of extremely large-scale antenna arrays, making ultra-massive MIMO possible. In this paper, we study codebook design and beam training for an intelligent omni-surface (IOS) aided multi-user system, where the IOS is a novel metasurface enabling simultaneous signal reflection and refraction. To deal with the near-field expansion caused by the large dimension of the IOS, we design a near-far field codebook to serve users in both the near and far fields without prior knowledge of the user distribution. Moreover, to fully exploit the dual functionality of the IOS, the coupling between the reflective and refractive signals is analyzed theoretically and utilized in the codebook design, thereby reducing the training overhead. On this basis, multi-user beam training is adopted, where each codeword covers multiple areas so that all users can be trained simultaneously. Simulation results verify our theoretical analysis of the reflective-refractive coupling. Compared to state-of-the-art schemes, the proposed scheme improves the sum rate and throughput.
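The near-field expansion mentioned above can be illustrated with the commonly used Rayleigh distance, 2D^2/lambda, as the boundary between the near and far fields. The following is a minimal sketch under that assumption; the function names and the example aperture and carrier values are hypothetical and are not taken from the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def rayleigh_distance(aperture_m: float, carrier_hz: float) -> float:
    """Return the Rayleigh distance 2 * D^2 / wavelength in metres.

    Assumption: the near/far-field boundary is modelled by the classical
    Rayleigh distance; the paper may use a different criterion.
    """
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return 2.0 * aperture_m ** 2 / wavelength


def field_region(user_distance_m: float, aperture_m: float, carrier_hz: float) -> str:
    """Classify a user as 'near' or 'far' relative to the Rayleigh distance."""
    boundary = rayleigh_distance(aperture_m, carrier_hz)
    return "near" if user_distance_m < boundary else "far"


if __name__ == "__main__":
    # Hypothetical example: a 1 m surface at a 30 GHz carrier.
    print(rayleigh_distance(1.0, 30e9))
    print(field_region(50.0, 1.0, 30e9))
```

Because the boundary grows with the square of the aperture, doubling the surface dimension quadruples the near-field range, which is why a large surface cannot assume that all users lie in the far field.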
The introduction of ChatGPT has led to a significant increase in the use of Large Language Models (LLMs) for downstream tasks, along with a growing focus on cost-efficient training and deployment. Low-cost training and deployment of LLMs represent the future development trend. This paper reviews the evolution of LLM training techniques and inference deployment technologies aligned with this emerging trend. The discussion of training covers data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning. On the inference side, the paper covers model compression, parallel computation, memory scheduling, and structural optimization. It also explores the utilization of LLMs and provides insights into their future development.
In recent years, pre-trained large language models (LLMs) have achieved tremendous success in the field of Natural Language Processing (NLP). Prior studies have primarily focused on general domains, with relatively less research on specialized LLMs in the medical field. The specialization and high accuracy requirements for diagnosis in the medical field, as well as the challenges in collecting large-scale data, have constrained the application and development of LLMs in medical scenarios. In the field of ophthalmology, clinical diagnosis mainly relies on doctors interpreting reports and making diagnostic decisions. To take advantage of LLMs in providing decision support for doctors, we collected three modalities of ophthalmic report data and fine-tuned the LLaMA2 model, constructing an LLM termed "Ophtha-LLaMA2" that is specifically tailored for ophthalmic disease diagnosis. Inference test results show that even with a smaller fine-tuning dataset, Ophtha-LLaMA2 performs significantly better in ophthalmic diagnosis than other LLMs. These results demonstrate that Ophtha-LLaMA2 exhibits satisfactory accuracy and efficiency in ophthalmic disease diagnosis, making it a valuable tool that helps ophthalmologists provide improved diagnostic support for patients. This research provides a useful reference for the application of LLMs in the field of ophthalmology and showcases their potential in this domain.
Radiology report generation, as a key step in medical image analysis, is critical to quantitative analysis for clinically informed decision-making. However, complex and diverse radiology reports with cross-source heterogeneity pose a huge generalizability challenge to current methods under massive data volume, mainly because the style and normativity of radiology reports differ markedly among institutions, body regions inspected, and radiologists. Recently, the advent of large language models (LLMs) has offered great potential for recognizing signs of health conditions. To resolve the above problem, we collaborate with the Second Xiangya Hospital in China and propose ChatRadio-Valuer, an LLM-based model tailored for automatic radiology report generation that learns generalizable representations and provides a basis pattern for model adaptation in sophisticated analysts' cases. Specifically, ChatRadio-Valuer is trained on the radiology reports from a single institution by means of supervised fine-tuning, and then adapted to disease diagnosis tasks for human multi-system evaluation (i.e., chest, abdomen, muscle-skeleton, head, and maxillofacial & neck) from six different institutions in clinical-level events. The clinical dataset utilized in this study encompasses a total of 332,673 observations. Comprehensive results on engineering indicators, clinical efficacy, and deployment cost metrics show that ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4, in disease diagnosis from radiology reports. ChatRadio-Valuer provides an effective avenue to boost model generalization performance and alleviate the annotation workload of experts, enabling the promotion of clinical AI applications in radiology reports.
Accurate deformable object manipulation (DOM) is essential for achieving autonomy in robotic surgery, where soft tissues are displaced, stretched, and dissected. Many DOM methods can be powered by simulation, which ensures realistic deformation by adhering to the governing physical constraints and allows for model prediction and control. However, real soft objects in robotic surgery, such as membranes and soft tissues, have complex, anisotropic physical parameters that a simulation with simple initialization from cameras may not fully capture. To use simulation techniques in real surgical tasks, the "real-to-sim" gap needs to be properly compensated. In this work, we propose an online, adaptive parameter tuning approach for simulation optimization that (1) bridges the real-to-sim gap between a physics simulation and observations obtained from 3D perception by estimating a residual mapping and (2) optimizes the simulation's stiffness parameters online. Our method ensures a small residual gap between the simulation and observation and improves the simulation's predictive capabilities. The effectiveness of the proposed mechanism is evaluated in the manipulation of both thin-shell and volumetric tissue, representative of most tissue scenarios. This work contributes to the advancement of simulation-based deformable tissue manipulation and holds potential for improving surgical autonomy.
Cloth manipulation is a category of deformable object manipulation of great interest to the robotics community, with applications ranging from automated laundry folding and home organizing and cleaning to textiles and flexible manufacturing. Despite the desire for automated cloth manipulation, the thin-shell dynamics and underactuated nature of cloth present significant challenges for robots to interact with it effectively. Many recent works omit explicit modeling in favor of learning-based methods that may yield control policies directly. However, these methods require large training sets that must be collected and curated. In this regard, we create a framework for differentiable modeling of cloth dynamics leveraging an Extended Position-based Dynamics (XPBD) algorithm. Together with the desired control objective, physics-aware regularization terms are designed for better results, including trajectory smoothness and elastic potential energy. In addition, safety constraints, such as avoiding obstacles, can be specified using signed distance functions (SDFs). We formulate the cloth manipulation task with safety constraints as a constrained optimization problem, which can be effectively solved by mainstream gradient-based optimizers thanks to the end-to-end differentiability of our framework. Finally, we assess the proposed framework on manipulation tasks with various safety thresholds and demonstrate the feasibility of the resulting trajectories on a surgical robot. The effects of the regularization terms are analyzed in an additional ablation study.
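To make the XPBD building block concrete, the sketch below projects a single distance constraint C = |x1 - x2| - d between two cloth particles using the standard XPBD multiplier update, delta_lambda = (-C - alpha_tilde * lambda) / (w1 + w2 + alpha_tilde) with alpha_tilde = compliance / dt^2. It is an illustrative, non-differentiable toy and not the framework described above; the function name and argument layout are assumptions.

```python
import math


def xpbd_distance_step(x1, x2, w1, w2, rest_length, compliance, dt, lam=0.0):
    """Apply one XPBD projection of a distance constraint.

    x1, x2      : particle positions as lists of floats
    w1, w2      : inverse masses (0.0 pins a particle in place)
    rest_length : target distance d between the particles
    compliance  : inverse stiffness (0.0 gives a rigid constraint)
    lam         : accumulated Lagrange multiplier for this constraint
    Returns (new_x1, new_x2, new_lam).
    """
    diff = [a - b for a, b in zip(x1, x2)]
    length = math.sqrt(sum(d * d for d in diff))
    if length == 0.0 or (w1 + w2) == 0.0:
        # Degenerate case: no well-defined direction or nothing can move.
        return list(x1), list(x2), lam
    n = [d / length for d in diff]  # gradient of C w.r.t. x1 (and -n for x2)
    c = length - rest_length
    alpha_tilde = compliance / (dt * dt)
    d_lam = (-c - alpha_tilde * lam) / (w1 + w2 + alpha_tilde)
    new_x1 = [a + w1 * d_lam * ni for a, ni in zip(x1, n)]
    new_x2 = [b - w2 * d_lam * ni for b, ni in zip(x2, n)]
    return new_x1, new_x2, lam + d_lam


if __name__ == "__main__":
    # Two particles stretched to twice their rest length, rigid constraint.
    p1, p2, lam = xpbd_distance_step([0.0, 0.0, 0.0], [2.0, 0.0, 0.0],
                                     1.0, 1.0, 1.0, 0.0, 0.01)
    print(p1, p2, lam)
```

With zero compliance and equal inverse masses, one projection restores the rest length exactly; a positive compliance yields a softer correction, which is how stiffness enters the simulation.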
Deep neural networks (DNNs) have been widely used in various downstream tasks, especially in safety-critical scenarios such as autonomous driving, but they are often threatened by adversarial samples. Such adversarial perturbations can be invisible to human eyes yet lead to DNN misclassification, and they often exhibit transferability across deep learning and machine learning models as well as real-world achievability. Adversarial attacks can be divided into white-box attacks, in which the attacker knows the parameters and gradients of the model, and black-box attacks, in which the attacker can only obtain the inputs and outputs of the model. In terms of the attacker's purpose, attacks can be divided into targeted and non-targeted attacks: in a targeted attack, which is more practical, the attacker wants the model to misclassify the original sample into a specified class, while a non-targeted attack only needs to make the model misclassify the sample. The black-box setting is the scenario we will encounter in practice.
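The distinction between targeted and non-targeted attacks can be sketched with a white-box, single-step sign-gradient attack (in the style of the fast gradient sign method) on a toy linear softmax classifier. This is only an illustration of the taxonomy and not a method from the text above; the weight matrix, input, and step sizes are hypothetical, and a black-box attacker would not have access to the gradient used here.

```python
import math


def logits(W, x):
    """Class scores of a linear classifier: one dot product per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(W, x):
    z = logits(W, x)
    return max(range(len(z)), key=z.__getitem__)


def ce_grad_wrt_input(W, x, label):
    """Gradient of the cross-entropy loss for `label` with respect to x."""
    p = softmax(logits(W, x))
    grad = [0.0] * len(x)
    for k, row in enumerate(W):
        coeff = p[k] - (1.0 if k == label else 0.0)
        for j, w in enumerate(row):
            grad[j] += coeff * w
    return grad


def sign(v):
    return (v > 0) - (v < 0)


def sign_gradient_attack(W, x, label, eps, targeted=False):
    """Non-targeted: increase the loss of the true `label`.
    Targeted: decrease the loss of the desired target `label`."""
    g = ce_grad_wrt_input(W, x, label)
    direction = -1.0 if targeted else 1.0
    return [xi + direction * eps * sign(gi) for xi, gi in zip(x, g)]


if __name__ == "__main__":
    W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical weights
    x = [1.0, 0.8, 0.2]                                       # classified as class 0
    x_untargeted = sign_gradient_attack(W, x, label=0, eps=0.3)
    x_targeted = sign_gradient_attack(W, x, label=2, eps=0.5, targeted=True)
    print(predict(W, x), predict(W, x_untargeted), predict(W, x_targeted))
```

In this toy setup the non-targeted step only has to push the input away from class 0, whereas the targeted step must reach the specific class 2 and needs a larger perturbation budget to do so.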
The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. LLMs are now more abundant than ever, and many of them are bilingual, proficient in both English and Chinese. However, a comprehensive evaluation of these models remains to be conducted. This lack of assessment is especially apparent within the context of radiology NLP. This study seeks to bridge this gap by critically evaluating thirty-two LLMs in interpreting radiology reports, a crucial component of radiology NLP. Specifically, the ability to derive impressions from radiologic findings is assessed. The outcomes of this evaluation provide key insights into the performance, strengths, and weaknesses of these LLMs, informing their practical applications within the medical domain.
Ultra-massive multiple-input multiple-output (MIMO) is one of the key enablers in the forthcoming 6G networks to provide high-speed data services by exploiting spatial diversity. In this article, we consider a new paradigm termed holographic radio for ultra-massive MIMO, where numerous tiny and inexpensive antenna elements are integrated to realize high directive gain with low hardware cost. We propose a practical way to enable holographic radio by a novel metasurface-based antenna, i.e., reconfigurable holographic surface (RHS). Specifically, RHSs incorporating densely packed tunable metamaterial elements are capable of holographic beamforming. Based on the working principle and hardware design of RHSs, we conduct full-wave analyses of RHSs and build an RHS-aided point-to-point communication platform supporting real-time data transmission. Both simulated and experimental results show that the RHS has great potential to achieve high directive gain with a limited size, thereby substantiating the feasibility of RHS-enabled holographic radio. Moreover, future research directions for RHS-enabled holographic radio are also discussed.