Abstract:Large language models perform well on static medical examinations, yet clinical diagnosis often requires iterative evidence gathering under uncertainty. Building on prior interactive evaluation efforts, we introduce an OSCE-inspired standardized patient simulator and a controlled, reproducible benchmark for active diagnostic inquiry. Across 468 cases and 15 models in our protocol, we observe that multi-turn evidence seeking reduces diagnostic accuracy by 12.75% and lowers supporting-evidence quality by 24.36% relative to full-context evaluation; error analyses associate these drops with premature diagnostic closure and inefficient questioning. Together, these results suggest that static full-context benchmarks may overestimate performance in interactive evidence-seeking settings, motivating complementary interactive assessment for safer clinical decision support.
Abstract:SLAM is one of the biggest bottlenecks of XR devices, which have strict requirements for latency, power consumption, and user satisfaction. One solution that has been proposed and studied to meet these requirements is to offload SLAM to a remote server, which leverages the server's computational hardware but may suffer from the incurred delays and transmission power. In this work, we propose offloading SLAM using Massive MIMO, which is attractive because it offers lower latency, lower transmission power, and a more reliable link for multiple users. We propose a framework for system-level analysis of latency and localisation error in multi-user offloaded XR with Massive MIMO, and use it to perform a case study with varying system-level parameters. The case study shows that there are important trade-offs between latency, localisation error, and device transmission power. We find that Massive MIMO is a promising technology for XR offloading, but that further evaluations including complete device power consumption are needed to get the full picture.
Abstract:Distributed massive MIMO (D-MIMO) is a promising technology for future generation wireless systems as it takes advantage of both an increased array aperture and a decentralized processing architecture and topology. To truly understand the possibilities and limitations of these approaches in real scenarios, the practical realization of testbeds is an essential step in advancing the technology. This work presents the Lund University Large Intelligent Surface testbed (LuLIS), which can operate up to 256 coherent radio frequency (RF) chains using 16 AMD Zynq UltraScale RFSoC ZCU216 evaluation boards acting as distributed processing nodes. Real-time processing is facilitated by acceleration and distribution of MIMO processing algorithms on the FPGA fabric of the boards. The system is easily scalable: the number of antennas is increased in multiples of 16 by adding more RFSoCs, each of which also adds another processing node. The design allows up-scaling without hardware redesign and without introducing large latencies or data transfer overhead. The testbed is flexible in terms of deployment, with options of fully distributing the nodes (as in D-MIMO) or co-locating them (as in more traditional Massive MIMO). A detailed description of the implementation of the testbed is presented and initial results are shown for an uplink (UL) transmission from four single-antenna user equipments (UEs) to 64, 128 and 256 base-station antennas.
Abstract:This paper considers a networked tracking architecture in 6G integrated sensing and communication (ISAC) systems, where multiple base stations (BSs) cooperatively transmit radio signals and process received echo signals to track multiple moving targets. Compared to the single-BS counterpart, networked tracking allows the moving targets to be associated with different BSs over time such that the wireless resources can be dynamically allocated among BSs based on target locations. However, networked tracking imposes new challenges for algorithm design and resource allocation. In this paper, we first design a networked Kalman filter (NKF) that is suitable for multi-BS based tracking, then characterize the posterior Cramér-Rao bound (PCRB) under this NKF, and finally design the beamforming vectors of all the BSs to minimize the tracking PCRB. Numerical results show that our dynamic beamforming design can properly associate the targets with suitable BSs at various sensing blocks and reduce the tracking mean-squared error (MSE).
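As background for the tracking step mentioned above, the following is a minimal sketch of one predict/update cycle of a textbook single-sensor Kalman filter for a 1-D constant-velocity target. It is not the paper's networked Kalman filter: the function names `kalman_step` and `track`, the diagonal process noise `q`, the measurement noise `r`, and the initial state are all illustrative assumptions.

```python
def kalman_step(x, P, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update step of a 1-D constant-velocity Kalman filter.

    x: [position, velocity] estimate; P: 2x2 covariance (nested lists);
    z: noisy position measurement. q and r are assumed noise variances.
    """
    # Predict with F = [[1, dt], [0, 1]]: x <- F x, P <- F P F^T + q * I
    pos = x[0] + dt * x[1]
    vel = x[1]
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q

    # Update with H = [1, 0] (only position is measured)
    s = p00 + r                  # innovation variance
    k0, k1 = p00 / s, p10 / s    # Kalman gain K = P H^T / s
    innov = z - pos
    x_new = [pos + k0 * innov, vel + k1 * innov]
    # P <- (I - K H) P
    P_new = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return x_new, P_new


def track(measurements):
    """Run the filter over a measurement sequence from a vague initial state."""
    x, P = [0.0, 0.0], [[10.0, 0.0], [0.0, 10.0]]
    for z in measurements:
        x, P = kalman_step(x, P, z)
    return x, P


if __name__ == "__main__":
    # Noise-free positions of a target moving at 2 units per step.
    x, P = track([2.0 * t for t in range(1, 51)])
    print(x, P[0][0])
```

In a multi-BS setting one would expect the update step to fuse measurements from several sensors, but how the abstract's NKF does this is not specified here.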
Abstract:High-resolution three-dimensional (3D) radio maps (RMs) provide rich information about the radio landscape that is essential to a myriad of wireless applications in future wireless networks. Although deep learning (DL) methods have shown their effectiveness in RM construction, existing approaches require massive high-resolution 3D RM samples in the training dataset, the acquisition of which is labor-intensive and time-consuming in practice. In this paper, our goal is to devise a data-friendly high-resolution 3D RM construction solution via training over a hybrid dataset, wherein the RMs associated with a small fraction of environment maps (EMs) are of high resolution, while those corresponding to the majority of EMs are of low resolution. To this end, we propose a Data-Friendly 3D Radio Map Estimator (DF-3DRME), which comprises two processing stages. Specifically, in the first stage, we leverage the abundant low-resolution 3D RM samples to train a neural network, termed the LR-Net, for predicting the low-resolution 3D RM from the input EM, which provides a coarse characterization of the spatial radio propagation. In the second stage, we employ an advanced super-resolution network, termed the SR-Net, to upscale the predicted low-resolution 3D RM to its high-resolution counterpart. Unlike the LR-Net, the SR-Net can be effectively trained with only the limited high-resolution 3D RM samples available in the hybrid dataset. Experimental results demonstrate that the proposed framework achieves compelling reconstruction performance with only 4% of the EMs in the dataset having high-resolution 3D RM labels, which significantly reduces data acquisition overhead and facilitates practical deployment.
Abstract:Imaging is a crucial sensing function that finds wide applications in environmental reconstruction, autonomous driving, etc. However, the signal processing methods for existing radio imaging techniques, such as millimeter wave (mmWave) imaging, require high-resolution range estimation enabled by Gigahertz-level or even Terahertz-level bandwidth, and cannot be applied in 6G integrated sensing and communication (ISAC) networks with Megahertz-level bandwidth. This paper proposes two novel high-resolution radio imaging schemes that can work on 6G signals with limited bandwidth: bandwidth-independent synthetic aperture radar (BI-SAR), where the movable base station (BS) revolves around the static targets by 360 degrees, and bandwidth-independent inverse synthetic aperture radar (BI-ISAR), where the BS is static and the targets revolve around an axis by 360 degrees. Unlike their conventional SAR and ISAR counterparts, which rely on range estimation, our proposed imaging schemes use only Doppler information to perform imaging, without any range information. The main technical challenge of our schemes lies in the anisotropic scattering functions over different directions, which hinder the coherent synthesis of the backscattered signals from all directions. We design an iterative adaptive approach-based Doppler association (IAA-DA) algorithm to tackle this issue. Moreover, we derive the imaging resolution to characterize the reconstruction quality. Real-world experiments are provided to show the feasibility and effectiveness of our proposed 6G imaging schemes.
Abstract:Maintaining robust and stable communication links in high-mobility scenarios is challenging for time-division duplex (TDD) reciprocity-based gigantic MIMO systems due to rapid channel variations, especially in non-line-of-sight (NLOS) conditions. This paper proposes a user equipment (UE) beamforming strategy that enables reliable links in high mobility without additional pilot overhead. The proposed strategy aligns the UE beamforming direction with the travel axis. Our analysis shows that this choice minimizes the Doppler spread of the channel, resulting in improved temporal stability. We evaluate this approach through simulations in scattering-rich environments representative of gigantic MIMO deployments. Numerical results confirm that movement-aligned UE beamforming enhances link robustness, increases achievable data rates, and reduces pilot signaling requirements, thereby lowering UE power consumption. These findings indicate that travel-axis-aligned UE beamforming is a promising method for improving reliability in future high-mobility wireless systems.
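The claim that a travel-axis-aligned beam minimizes Doppler spread can be illustrated with the standard relation f_d = (v / lambda) cos(theta), where theta is the angle between a propagation path and the direction of travel. The sketch below is only a geometric illustration under strong assumptions that are not taken from the paper: an idealized beam that admits paths uniformly within an angular sector and nothing outside it, a hypothetical function name `doppler_spread`, and example values for speed and carrier frequency.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_spread(speed_mps, carrier_hz, beam_center_deg, beam_width_deg, n=1000):
    """Doppler spread (Hz) across an idealized angular sector.

    Angles are measured from the direction of travel. The spread is the
    difference between the largest and smallest Doppler shift of paths
    arriving inside the sector.
    """
    f_max = speed_mps * carrier_hz / SPEED_OF_LIGHT  # maximum Doppler shift
    lo = math.radians(beam_center_deg - beam_width_deg / 2)
    hi = math.radians(beam_center_deg + beam_width_deg / 2)
    # Sample the sector so that sectors straddling 0 or 180 degrees,
    # where cos() peaks inside the interval, are handled correctly.
    shifts = [f_max * math.cos(lo + (hi - lo) * i / n) for i in range(n + 1)]
    return max(shifts) - min(shifts)


if __name__ == "__main__":
    # Assumed example: 30 m/s UE, 3.5 GHz carrier, 30-degree-wide beam.
    aligned = doppler_spread(30.0, 3.5e9, 0.0, 30.0)     # beam along travel axis
    broadside = doppler_spread(30.0, 3.5e9, 90.0, 30.0)  # beam perpendicular to it
    print(f"aligned: {aligned:.1f} Hz, broadside: {broadside:.1f} Hz")
```

Because cos(theta) is flattest around 0 and 180 degrees, a sector of fixed width centered on the travel axis spans a much narrower range of Doppler shifts than the same sector pointed sideways, which matches the intuition behind the abstract's result.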
Abstract:Beyond diagonal reconfigurable intelligent surface (BD-RIS) architectures offer superior beamforming gain over conventional diagonal RISs. However, the channel estimation overhead is the main hurdle to reaping this gain in practice. This letter addresses this issue for group-connected BD-RIS-aided uplink communication from multiple multi-antenna users to one multi-antenna base station (BS). We first reveal that within each BD-RIS group, the cascaded channel associated with one user antenna and one BD-RIS element is a scaled version of that associated with any other user antenna and BD-RIS element due to the common RIS-BS channel. This insight drastically reduces the dimensionality of the channel estimation problem. Building on this property, we propose an efficient two-phase channel estimation protocol. In the first phase, the reference cascaded channels for all groups are estimated in parallel based on common received signals while determining the scaling coefficients for a single reference antenna. In the second phase, the scaling coefficients for all remaining user antennas are estimated. Numerical results demonstrate that our proposed framework achieves substantially lower estimation error with fewer pilot signals compared to state-of-the-art benchmark schemes.
Abstract:We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence. K2.5 emphasizes the joint optimization of text and vision so that the two modalities enhance each other. This includes a series of techniques such as joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. Building on this multimodal foundation, K2.5 introduces Agent Swarm, a self-directed parallel agent orchestration framework that dynamically decomposes complex tasks into heterogeneous sub-problems and executes them concurrently. Extensive evaluations show that Kimi K2.5 achieves state-of-the-art results across various domains including coding, vision, reasoning, and agentic tasks. Agent Swarm also reduces latency by up to $4.5\times$ over single-agent baselines. We release the post-trained Kimi K2.5 model checkpoint to facilitate future research and real-world applications of agentic intelligence.
Abstract:Magnetic Resonance Imaging (MRI) provides detailed tissue information, but its clinical application is limited by long acquisition time, high cost, and restricted resolution. Image translation has recently gained attention as a strategy to address these limitations. Although Pix2Pix has been widely applied in medical image translation, its potential has not been fully explored. In this study, we propose an enhanced Pix2Pix framework that integrates Squeeze-and-Excitation Residual Networks (SEResNet) and U-Net++ to improve image generation quality and structural fidelity. SEResNet strengthens critical feature representation through channel attention, while U-Net++ enhances multi-scale feature fusion. A simplified PatchGAN discriminator further stabilizes training and refines local anatomical realism. Experimental results demonstrate that under few-shot conditions with fewer than 500 images, the proposed method achieves consistent structural fidelity and superior image quality across multiple intra-modality MRI translation tasks, showing strong generalization ability. These results suggest an effective extension of Pix2Pix for medical image translation.