Today's AI systems for medical decision support often succeed on benchmark datasets in research papers but fail in real-world deployment. This work focuses on the decision making of sepsis, an acute life-threatening systematic infection that requires an early diagnosis with high uncertainty from the clinician. Our aim is to explore the design requirements for AI systems that can support clinical experts in making better decisions for the early diagnosis of sepsis. The study begins with a formative study investigating why clinical experts abandon an existing AI-powered Sepsis predictive module in their electrical health record (EHR) system. We argue that a human-centered AI system needs to support human experts in the intermediate stages of a medical decision-making process (e.g., generating hypotheses or gathering data), instead of focusing only on the final decision. Therefore, we build SepsisLab based on a state-of-the-art AI algorithm and extend it to predict the future projection of sepsis development, visualize the prediction uncertainty, and propose actionable suggestions (i.e., which additional laboratory tests can be collected) to reduce such uncertainty. Through heuristic evaluation with six clinicians using our prototype system, we demonstrate that SepsisLab enables a promising human-AI collaboration paradigm for the future of AI-assisted sepsis diagnosis and other high-stakes medical decision making.
Semantic communication is poised to play a pivotal role in shaping the landscape of future AI-driven communication systems. Its challenge of extracting semantic information from the original complex content and regenerating semantically consistent data at the receiver, possibly being robust to channel corruptions, can be addressed with deep generative models. This ICASSP special session overview paper discloses the semantic communication challenges from the machine learning perspective and unveils how deep generative models will significantly enhance semantic communication frameworks in dealing with real-world complex data, extracting and exploiting semantic information, and being robust to channel corruptions. Alongside establishing this emerging field, this paper charts novel research pathways for the next generative semantic communication frameworks.
In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research. In this paper, we present the method of generating Sensiverse, including the acquisition and formatting of the 3D scene models, the generation of the channel data and associations with Tx/Rx deployment. The file structure and usage of the dataset are also described, and finally the use of the dataset is illustrated with examples through the evaluation of use cases such as 3D environment reconstruction and moving targets.
As one of the key techniques to realize semantic communications, end-to-end optimized neural joint source-channel coding (JSCC) has made great progress over the past few years. A general trend in many recent works pushing the model adaptability or the application diversity of neural JSCC is based on the convolutional neural network (CNN) backbone, whose model capacity is yet limited, inherently leading to inferior system coding gain against traditional coded transmission systems. In this paper, we establish a new neural JSCC backbone that can also adapt flexibly to diverse channel conditions and transmission rates within a single model, our open-source project aims to promote the research in this field. Specifically, we show that with elaborate design, neural JSCC codec built on the emerging Swin Transformer backbone achieves superior performance than conventional neural JSCC codecs built upon CNN, while also requiring lower end-to-end processing latency. Paired with two spatial modulation modules that scale latent representations based on the channel state information and target transmission rate, our baseline SwinJSCC can further upgrade to a versatile version, which increases its capability to adapt to diverse channel conditions and rate configurations. Extensive experimental results show that our SwinJSCC achieves better or comparable performance versus the state-of-the-art engineered BPG + 5G LDPC coded transmission system with much faster end-to-end coding speed, especially for high-resolution images, in which case traditional CNN-based JSCC yet falls behind due to its limited model capacity. \emph{Our open-source code and model are available at \href{https://github.com/semcomm/SwinJSCC}{https://github.com/semcomm/SwinJSCC}.}
Semantic communications, aiming at ensuring the successful delivery of the meaning of information, are expected to be one of the potential techniques for the next generation communications. However, the knowledge forming and synchronizing mechanism that enables semantic communication systems to extract and interpret the semantics of information according to the communication intents is still immature. In this paper, we propose a semantic image transmission framework with explicit semantic base (Seb), where Sebs are generated and employed as the knowledge shared between the transmitter and the receiver with flexible granularity. To represent images with Sebs, a novel Seb-based reference image generator is proposed to generate Sebs and then decompose the transmitted images. To further encode/decode the residual information for precise image reconstruction, a Seb-based image encoder/decoder is proposed. The key components of the proposed framework are optimized jointly by end-to-end (E2E) training, where the loss function is dedicated designed to tackle the problem of nondifferentiable operation in Seb-based reference image generator by introducing a gradient approximation mechanism. Extensive experiments show that the proposed framework outperforms state-of-art works by 0.5 - 1.5 dB in peak signal-to-noise ratio (PSNR) w.r.t. different signal-to-noise ratio (SNR).
Physical-Layer Authentication (PLA) has been recently believed as an endogenous-secure and energy-efficient technique to recognize IoT terminals. However, the major challenge of applying the state-of-the-art PLA schemes directly to 6G-enabled IoT is the inaccurate channel fingerprint estimation in low Signal-Noise Ratio (SNR) environments, which will greatly influence the reliability and robustness of PLA. To tackle this issue, we propose a configurable-fingerprint-based PLA architecture through Intelligent Reflecting Surface (IRS) that helps create an alternative wireless transmission path to provide more accurate fingerprints. According to Baye's theorem, we propose a Gaussian Process Classification (GPC)-based PLA scheme, which utilizes the Expectation Propagation (EP) method to obtain the identities of unknown fingerprints. Considering that obtaining sufficient labeled fingerprint samples to train the GPC-based authentication model is challenging for future 6G systems, we further extend the GPC-based PLA to the Efficient-GPC (EGPC)-based PLA through active learning, which requires fewer labeled fingerprints and is more feasible. We also propose three fingerprint selecting algorithms to choose fingerprints, whose identities are queried to the upper-layers authentication mechanisms. For this reason, the proposed EGPC-based scheme is also a lightweight cross-layer authentication method to offer a superior security level. The simulations conducted on synthetic datasets demonstrate that the IRS-assisted scheme reduces the authentication error rate by 98.69% compared to the non-IRS-based scheme. Additionally, the proposed fingerprint selection algorithms reduce the authentication error rate by 65.96% to 86.93% and 45.45% to 70.00% under perfect and imperfect channel estimation conditions, respectively, when compared with baseline algorithms.
Recent advances in wireless communication with the enormous demands of sensing ability have given rise to the integrated sensing and communication (ISAC) technology, among which passive sensing plays an important role. The main challenge of passive sensing is how to achieve high sensing performance in the condition of communication demodulation errors. In this paper, we propose an ISAC network (ISAC-NET) that combines passive sensing with communication signal detection by using model-driven deep learning (DL). Dissimilar to existing passive sensing algorithms that first demodulate the transmitted symbols and then obtain passive sensing results from the demodulated symbols, ISAC-NET obtains passive sensing results and communication demodulated symbols simultaneously. Different from the data-driven DL method, we adopt the block-by-block signal processing method that divides the ISAC-NET into the passive sensing module, signal detection module and channel reconstruction module. From the simulation results, ISAC-NET obtains better communication performance than the traditional signal demodulation algorithm, which is close to OAMP-Net2. Compared to the 2D-DFT algorithm, ISAC-NET demonstrates significantly enhanced sensing performance. In summary, ISAC-NET is a promising tool for passive sensing and communication in wireless communications.
The effectiveness of ad creatives is greatly influenced by their visual appearance. Advertising platforms can generate ad creatives with different appearances by combining creative elements provided by advertisers. However, with the increasing number of ad creative elements, it becomes challenging to select a suitable combination from the countless possibilities. The industry's mainstream approach is to select individual creative elements independently, which often overlooks the importance of interaction between creative elements during the modeling process. In response, this paper proposes a Cross-Element Combinatorial Selection framework for multiple creative elements, termed CECS. In the encoder process, a cross-element interaction is adopted to dynamically adjust the expression of a single creative element based on the current candidate creatives. In the decoder process, the creative combination problem is transformed into a cascade selection problem of multiple creative elements. A pointer mechanism with a cascade design is used to model the associations among candidates. Comprehensive experiments on real-world datasets show that CECS achieved the SOTA score on offline metrics. Moreover, the CECS algorithm has been deployed in our industrial application, resulting in a significant 6.02% CTR and 10.37% GMV lift, which is beneficial to the business.
Cortical surface registration plays a crucial role in aligning cortical functional and anatomical features across individuals. However, conventional registration algorithms are computationally inefficient. Recently, learning-based registration algorithms have emerged as a promising solution, significantly improving processing efficiency. Nonetheless, there remains a gap in the development of a learning-based method that exceeds the state-of-the-art conventional methods simultaneously in computational efficiency, registration accuracy, and distortion control, despite the theoretically greater representational capabilities of deep learning approaches. To address the challenge, we present SUGAR, a unified unsupervised deep-learning framework for both rigid and non-rigid registration. SUGAR incorporates a U-Net-based spherical graph attention network and leverages the Euler angle representation for deformation. In addition to the similarity loss, we introduce fold and multiple distortion losses, to preserve topology and minimize various types of distortions. Furthermore, we propose a data augmentation strategy specifically tailored for spherical surface registration, enhancing the registration performance. Through extensive evaluation involving over 10,000 scans from 7 diverse datasets, we showed that our framework exhibits comparable or superior registration performance in accuracy, distortion, and test-retest reliability compared to conventional and learning-based methods. Additionally, SUGAR achieves remarkable sub-second processing times, offering a notable speed-up of approximately 12,000 times in registering 9,000 subjects from the UK Biobank dataset in just 32 minutes. This combination of high registration performance and accelerated processing time may greatly benefit large-scale neuroimaging studies.
Driven by the vision of "intelligent connection of everything" toward 6G, the collective intelligence of networked machines can be fully exploited to improve system efficiency by shifting the paradigm of wireless communication design from naive maximalist approaches to intelligent value-based approaches. In this article, we propose an on-purpose machine communication framework enabled by joint communication, sensing, and computation (JCSC) technology, which employs machine semantics as the interactive information flow. Naturally, there are potential technical barriers to be solved before the widespread adoption of on-purpose communications, including the conception of machine purpose, fast and concise networking strategy, and semantics-aware information exchange mechanism during the process of task-oriented cooperation. Hence, we discuss enabling technologies complemented by a range of open challenges. The simulation result shows that the proposed framework can significantly reduce networking overhead and improve communication efficiency.