Abstract:Computed Tomography (CT) is one of the most widely used and diagnostically information-dense imaging modalities, covering critical organs such as the heart, lungs, liver, and colon. Clinical interpretation relies on both slice-driven local features (e.g., sub-centimeter nodules, lesion boundaries) and volume-driven spatial representations (e.g., tumor infiltration, inter-organ anatomical relations). However, existing Large Vision-Language Models (LVLMs) remain fragmented in CT slice versus volumetric understanding: slice-driven LVLMs show strong generalization but lack cross-slice spatial consistency, while volume-driven LVLMs explicitly capture volumetric semantics but suffer from coarse granularity and poor compatibility with slice inputs. The absence of a unified modeling paradigm constitutes a major bottleneck for the clinical translation of medical LVLMs. We present OmniCT, a powerful unified slice-volume LVLM for CT scenarios, which makes three contributions: (i) Spatial Consistency Enhancement (SCE): volumetric slice composition combined with tri-axial positional embedding that introduces volumetric consistency, and an MoE hybrid projection enables efficient slice-volume adaptation; (ii) Organ-level Semantic Enhancement (OSE): segmentation and ROI localization explicitly align anatomical regions, emphasizing lesion- and organ-level semantics; (iii) MedEval-CT: the largest slice-volume CT dataset and hybrid benchmark integrates comprehensive metrics for unified evaluation. OmniCT consistently outperforms existing methods with a substantial margin across diverse clinical tasks and satisfies both micro-level detail sensitivity and macro-level spatial reasoning. More importantly, it establishes a new paradigm for cross-modal medical imaging understanding.
Abstract:Autonomous driving systems require robust lane perception capabilities, yet existing vision-based detection methods suffer significant performance degradation when visual sensors provide insufficient cues, such as in occluded or lane-missing scenarios. While some approaches incorporate high-definition maps as supplementary information, these solutions face challenges of high subscription costs and limited real-time performance. To address these limitations, we explore an innovative information source: traffic flow, which offers real-time capabilities without additional costs. This paper proposes a TrafficFlow-aware Lane perception Module (TFM) that effectively extracts real-time traffic flow features and seamlessly integrates them with existing lane perception algorithms. This solution originated from real-world autonomous driving conditions and was subsequently validated on open-source algorithms and datasets. Extensive experiments on four mainstream models and two public datasets (Nuscenes and OpenLaneV2) using standard evaluation metrics show that TFM consistently improves performance, achieving up to +4.1% mAP gain on the Nuscenes dataset.
Abstract:We present Heartcare Suite, a multimodal comprehensive framework for finegrained electrocardiogram (ECG) understanding. It comprises three key components: (i) Heartcare-220K, a high-quality, structured, and comprehensive multimodal ECG dataset covering essential tasks such as disease diagnosis, waveform morphology analysis, and rhythm interpretation. (ii) Heartcare-Bench, a systematic and multi-dimensional benchmark designed to evaluate diagnostic intelligence and guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios. and (iii) HeartcareGPT with a tailored tokenizer Bidirectional ECG Abstract Tokenization (Beat), which compresses raw multi-lead signals into semantically rich discrete tokens via duallevel vector quantization and query-guided bidirectional diffusion mechanism. Built upon Heartcare-220K, HeartcareGPT achieves strong generalization and SoTA performance across multiple clinically meaningful tasks. Extensive experiments demonstrate that Heartcare Suite is highly effective in advancing ECGspecific multimodal understanding and evaluation. Our project is available at https://github.com/Wznnnnn/Heartcare-Suite .




Abstract:Calibration is crucial for ensuring the performance of phased array since amplitude-phase imbalance between elements results in significant performance degradation. While amplitude-only calibration methods offer advantages when phase measurements are impractical, conventional approaches face two key challenges: they typically require high-resolution phase shifters and remain susceptible to phase errors inherent in these components. To overcome these limitations, we propose a Rotating element Harmonic Electric-field Vector (RHEV) strategy, which enables precise calibration through time modulation principles. The proposed technique functions as follows. Two 1-bit phase shifters are periodically phase-switched at the same frequency, each generating corresponding harmonics. By adjusting the relative delay between their modulation timings, the phase difference between the $+1$st harmonics produced by the two elements can be precisely controlled, utilizing the time-shift property of the Fourier transform. Furthermore, the +1st harmonic generated by sequential modulation of individual elements exhibits a linear relationship with the amplitude of the modulated element, enabling amplitude ambiguity resolution. The proposed RHEV-based calibration method generates phase shifts through relative timing delays rather than physical phase shifter adjustments, rendering it less susceptible to phase shift errors. Additionally, since the calibration process exclusively utilizes the $+1$st harmonic, which is produced solely by the modulated unit, the method demonstrates consistent performance regardless of array size. Extensive numerical simulations, practical in-channel and over-the-air (OTA) calibration experiments demonstrate the effectiveness and distinct advantages of the proposed method.