Abstract:Ultra-reliable low-latency communications (URLLC) operate with short packets, where finite-blocklength effects make near-maximum-likelihood (near-ML) decoding desirable but often too costly. This paper proposes a two-stage near-ML decoding framework that applies to any linear block code. In the first stage, we run a low-complexity decoder to produce a candidate codeword and a cyclic redundancy check. When this stage succeeds, we terminate immediately. When it fails, we invoke a second-stage decoder, termed multipoint code-weight sphere decoding (MP-WSD). The central idea behind {MP-WSD} is to concentrate the ML search where it matters. We pre-compute a set of low-weight codewords and use them to generate structured local perturbations of the current estimate. Starting from the first-stage output, MP-WSD iteratively explores a small Euclidean sphere of candidate codewords formed by adding selected low-weight codewords, tightening the search region as better candidates are found. This design keeps the average complexity low: at high signal-to-noise ratio, the first stage succeeds with high probability and the second stage is rarely activated; when it is activated, the search remains localized. Simulation results show that the proposed decoder attains near-ML performance for short-blocklength, low-rate codes while maintaining low decoding latency.
Abstract:Subcode-ensemble decoders improve iterative decoding by running multiple decoders in parallel over carefully chosen subcodes, increasing the likelihood that at least one decoder avoids the dominant trapping structures. Achieving strong diversity gains, however, requires constructing many subcodes that satisfy a linear covering property-yet existing approaches lack a systematic way to scale the ensemble size while preserving this property. This paper introduces hierarchical subcode ensemble decoding (HSCED), a new ensemble decoding framework that expands the number of constituent decoders while still guaranteeing linear covering. The key idea is to recursively generate subcode parity constraints in a hierarchical structure so that coverage is maintained at every level, enabling large ensembles with controlled complexity. To demonstrate its effectiveness, we apply HSCED to belief propagation (BP) decoding of polar codes, where dense parity-check matrices induce severe stopping-set effects that limit conventional BP. Simulations confirm that HSCED delivers significant block-error-rate improvements over standard BP and conventional subcode-ensemble decoding under the same decoding-latency constraint.
Abstract:The applicability of current lesion segmentation models for chest X-rays (CXRs) has been limited both by a small number of target labels and the reliance on long, detailed expert-level text inputs, creating a barrier to practical use. To address these limitations, we introduce a new paradigm: instruction-guided lesion segmentation (ILS), which is designed to segment diverse lesion types based on simple, user-friendly instructions. Under this paradigm, we construct MIMIC-ILS, the first large-scale instruction-answer dataset for CXR lesion segmentation, using our fully automated multimodal pipeline that generates annotations from chest X-ray images and their corresponding reports. MIMIC-ILS contains 1.1M instruction-answer pairs derived from 192K images and 91K unique segmentation masks, covering seven major lesion types. To empirically demonstrate its utility, we introduce ROSALIA, a vision-language model fine-tuned on MIMIC-ILS. ROSALIA can segment diverse lesions and provide textual explanations in response to user instructions. The model achieves high segmentation and textual accuracy in our newly proposed task, highlighting the effectiveness of our pipeline and the value of MIMIC-ILS as a foundational resource for pixel-level CXR lesion grounding.
Abstract:Ultra-reliable low-latency communications (URLLC) demand high-performance error-correcting codes and decoders in the finite blocklength regime. This letter introduces a novel two-stage near-maximum likelihood (near-ML) decoding framework applicable to any linear block code. Our approach first employs a low-complexity initial decoder. If this initial stage fails a cyclic redundancy check, it triggers a second stage: the proposed code-weight sphere decoding (WSD). WSD iteratively refines the codeword estimate by exploring a localized sphere of candidates constructed from pre-computed low-weight codewords. This strategy adaptively minimizes computational overhead at high signal-to-noise ratios while achieving near-ML performance, especially for low-rate codes. Extensive simulations demonstrate that our two-stage decoder provides an excellent trade-off between decoding reliability and complexity, establishing it as a promising solution for next-generation URLLC systems.
Abstract:Radiology reports convey detailed clinical observations and capture diagnostic reasoning that evolves over time. However, existing evaluation methods are limited to single-report settings and rely on coarse metrics that fail to capture fine-grained clinical semantics and temporal dependencies. We introduce LUNGUAGE,a benchmark dataset for structured radiology report generation that supports both single-report evaluation and longitudinal patient-level assessment across multiple studies. It contains 1,473 annotated chest X-ray reports, each reviewed by experts, and 80 of them contain longitudinal annotations to capture disease progression and inter-study intervals, also reviewed by experts. Using this benchmark, we develop a two-stage framework that transforms generated reports into fine-grained, schema-aligned structured representations, enabling longitudinal interpretation. We also propose LUNGUAGESCORE, an interpretable metric that compares structured outputs at the entity, relation, and attribute level while modeling temporal consistency across patient timelines. These contributions establish the first benchmark dataset, structuring framework, and evaluation metric for sequential radiology reporting, with empirical results demonstrating that LUNGUAGESCORE effectively supports structured report evaluation. The code is available at: https://github.com/SuperSupermoon/Lunguage
Abstract:Recent progress in Large Vision-Language Models (LVLMs) has enabled promising applications in medical tasks, such as report generation and visual question answering. However, existing benchmarks focus mainly on the final diagnostic answer, offering limited insight into whether models engage in clinically meaningful reasoning. To address this, we present CheXStruct and CXReasonBench, a structured pipeline and benchmark built on the publicly available MIMIC-CXR-JPG dataset. CheXStruct automatically derives a sequence of intermediate reasoning steps directly from chest X-rays, such as segmenting anatomical regions, deriving anatomical landmarks and diagnostic measurements, computing diagnostic indices, and applying clinical thresholds. CXReasonBench leverages this pipeline to evaluate whether models can perform clinically valid reasoning steps and to what extent they can learn from structured guidance, enabling fine-grained and transparent assessment of diagnostic reasoning. The benchmark comprises 18,988 QA pairs across 12 diagnostic tasks and 1,200 cases, each paired with up to 4 visual inputs, and supports multi-path, multi-stage evaluation including visual grounding via anatomical region selection and diagnostic measurements. Even the strongest of 10 evaluated LVLMs struggle with structured reasoning and generalization, often failing to link abstract knowledge with anatomically grounded visual interpretation. The code is available at https://github.com/ttumyche/CXReasonBench
Abstract:Deep polar codes are pre-transformed polar codes that employ a multi-layered polar kernel transformation strategy to enhance code performance in short blocklength regimes. However, like conventional polar codes, their block length is constrained to powers of two, as the final transformation layer uses a conventional polar kernel matrix. This paper introduces a novel rate-matching technique for deep polar codes using code extension, particularly effective when the desired code length slightly exceeds a power of two. The key idea is to exploit the layered structure of deep polar codes by concatenating polar codewords generated at each transformation layer. Based on this structure, we also develop an efficient decoding algorithm leveraging soft-output successive cancellation list decoding and provide comprehensive error probability analysis supporting our code design algorithms. Additionally, we propose a computationally efficient greedy algorithm for multi-layer configurations. Extensive simulations confirm that our approach delivers substantial coding gains over conventional rate-matching methods, especially in medium to high code-rate regimes.
Abstract:Deep polar codes, employing multi-layered polar kernel pre-transforms in series, are recently introduced variants of pre-transformed polar codes. These codes have demonstrated the ability to reduce the number of minimum weight codewords, thereby closely achieving finite-block length capacity with successive cancellation list (SCL) decoders in certain scenarios. However, when the list size of the SCL decoder is small, which is crucial for low-latency communication applications, the reduction in the number of minimum weight codewords does not necessarily improve decoding performance. To address this limitation, we propose an alternative pre-transform technique to enhance the suitability of polar codes for SCL decoders with practical list sizes. Leveraging the fact that the SCL decoding error event set can be decomposed into two exclusive error event sets, our approach applies two different types of pre-transformations, each targeting the reduction of one of the two error event sets. Extensive simulation results under various block lengths and code rates have demonstrated that our codes consistently outperform all existing state-of-the-art pre-transformed polar codes, including CRC-aided polar codes and polarization-adjusted convolutional codes, when decoded using SCL decoders with small list sizes.
Abstract:Nonlinear self-interference cancellation (SIC) is essential for full-duplex communication systems, which can offer twice the spectral efficiency of traditional half-duplex systems. The challenge of nonlinear SIC is similar to the classic problem of system identification in adaptive filter theory, whose crux lies in identifying the optimal nonlinear basis functions for a nonlinear system. This becomes especially difficult when the system input has a non-stationary distribution. In this paper, we propose a novel algorithm for nonlinear digital SIC that adaptively constructs orthonormal polynomial basis functions according to the non-stationary moments of the transmit signal. By combining these basis functions with the least mean squares (LMS) algorithm, we introduce a new SIC technique, called as the adaptive orthonormal polynomial LMS (AOP-LMS) algorithm. To reduce computational complexity for practical systems, we augment our approach with a precomputed look-up table, which maps a given modulation and coding scheme to its corresponding basis functions. Numerical simulation indicates that our proposed method surpasses existing state-of-the-art SIC algorithms in terms of convergence speed and mean squared error when the transmit signal is non-stationary, such as with adaptive modulation and coding. Experimental evaluation with a wireless testbed confirms that our proposed approach outperforms existing digital SIC algorithms.




Abstract:In this paper, we introduce a novel class of pre-transformed polar codes, termed as deep polar codes. We first present a deep polar encoder that harnesses a series of multi-layered polar transformations with varying sizes. Our approach to encoding enables a low-complexity implementation while significantly enhancing the weight distribution of the code. Moreover, our encoding method offers flexibility in rate-profiling, embracing a wide range of code rates and blocklengths. Next, we put forth a low-complexity decoding algorithm called successive cancellation list with backpropagation parity checks (SCL-BPC). This decoding algorithm leverages the parity check equations in the reverse process of the multi-layered pre-transformed encoding for SCL decoding. Additionally, we present a low-latency decoding algorithm that employs parallel-SCL decoding by treating partially pre-transformed bit patterns as additional frozen bits. Through simulations, we demonstrate that deep polar codes outperform existing pre-transformed polar codes in terms of block error rates across various code rates under short block lengths, while maintaining low encoding and decoding complexity. Furthermore, we show that concatenating deep polar codes with cyclic-redundancy-check codes can achieve the meta-converse bound of the finite block length capacity within 0.4 dB in some instances.