Computer-assisted automatic analysis of diabetic retinopathy (DR) is of great importance in reducing the risk of vision loss and even blindness. Ultra-wide optical coherence tomography angiography (UW-OCTA) is a non-invasive and safe imaging modality for DR diagnosis, but publicly available benchmarks for model development and evaluation are lacking. To promote further research and scientific benchmarking for diabetic retinopathy analysis using UW-OCTA images, we organized a challenge named "DRAC - Diabetic Retinopathy Analysis Challenge" in conjunction with the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). The challenge consists of three tasks: segmentation of DR lesions, image quality assessment, and DR grading. The scientific community responded positively to the challenge, with 11, 12, and 13 teams from geographically diverse institutes submitting different solutions for these three tasks, respectively. This paper presents a summary and analysis of the top-performing solutions and results for each task of the challenge. The results obtained by the top algorithms indicate the importance of data augmentation, model architecture, and network ensembling in improving the performance of deep learning models. These findings have the potential to enable new developments in diabetic retinopathy analysis. The challenge remains open for post-challenge registrations and submissions for benchmarking future methodology developments.
The joint communication and sensing (JCS) system can provide higher spectrum efficiency and load saving for 6G machine-type communication (MTC) applications by merging the necessary communication and sensing abilities with a unified spectrum and unified transceivers. To suppress the mutual interference between the communication and radar sensing signals, and thereby improve communication reliability and radar sensing accuracy, we propose a novel code-division orthogonal frequency division multiplex (CD-OFDM) JCS MTC system, in which MTC users can simultaneously and continuously conduct communication and sensing with each other. We propose a novel CD-OFDM JCS signal and a corresponding successive-interference-cancellation (SIC) based signal processing technique that obtains code-division multiplex (CDM) gain and is compatible with the prevalent orthogonal frequency division multiplex (OFDM) communication system. To model the unified JCS signal transmission and reception process, we propose a novel unified JCS channel model. Finally, simulation and numerical results are presented to verify the feasibility of the CD-OFDM JCS MTC system and to evaluate its error propagation performance. We show that, compared with the prior OFDM JCS system, the CD-OFDM JCS MTC system achieves more reliable communication while retaining comparably robust radar sensing, especially in the low signal-to-interference-and-noise ratio (SINR) regime.
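As a rough illustration of the code-division idea in the preceding abstract, the sketch below spreads two symbol streams with orthogonal Walsh codes, adds them on the same resources, and separates them again by correlation. It is a generic orthogonal-code example, not the authors' CD-OFDM signal or their SIC receiver; the symbol values and the names `walsh_codes`, `spread`, and `despread` are assumptions made for illustration.

```python
def walsh_codes(order):
    """Return the 2**order Walsh-Hadamard codes as lists of +1/-1 chips."""
    codes = [[1]]
    for _ in range(order):
        # Sylvester construction: [c, c] and [c, -c] for every existing code c.
        codes = [c + c for c in codes] + [c + [-x for x in c] for c in codes]
    return codes


def spread(symbols, code):
    """Multiply each symbol by every chip of its spreading code."""
    return [s * chip for s in symbols for chip in code]


def despread(chips, code):
    """Correlate each code-length block with the code and normalise."""
    n = len(code)
    return [
        sum(chips[i + j] * code[j] for j in range(n)) / n
        for i in range(0, len(chips), n)
    ]


codes = walsh_codes(2)                      # four mutually orthogonal length-4 codes
comm_code, sense_code = codes[1], codes[2]  # hypothetical code assignment

comm_symbols = [1, -1, -1, 1]   # hypothetical BPSK data symbols
sense_symbols = [1, 1, 1, 1]    # hypothetical known sensing sequence

# Both signals occupy the same chips, so they interfere before despreading.
received = [
    a + b
    for a, b in zip(spread(comm_symbols, comm_code),
                    spread(sense_symbols, sense_code))
]

recovered_comm = despread(received, comm_code)
recovered_sense = despread(received, sense_code)
print(recovered_comm, recovered_sense)
```

Because the two codes are orthogonal, each correlation removes the other signal entirely in this noise-free setting; with noise or imperfect synchronization the separation would only be approximate.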
Recent advances in deep learning have led to increased interest in solving high-efficiency end-to-end transmission problems using methods that exploit the nonlinear property of neural networks. These methods, which we call semantic coding, extract semantic features of the source signal across space and time and design source-channel coding methods to transmit these features over wireless channels. Rapid progress has produced numerous research papers, but a consolidation of the discovered knowledge has not yet emerged. In this article, we gather ideas to categorize the many aspects of semantic coding into two paradigms, i.e., explicit and implicit semantic coding. We first examine these two paradigms by identifying their common and distinct components in building semantic communication systems. We then turn to the applications of semantic coding to different transmission tasks. Our article highlights the improved quality, flexibility, and capability brought by semantic coded transmission. Finally, we point out future directions.
Recent deep learning methods have led to increased interest in solving high-efficiency end-to-end transmission problems. These methods, which we call nonlinear transform source-channel coding (NTSCC), extract the semantic latent features of the source signal and learn an entropy model that guides variable-rate joint source-channel coding to transmit the latent features over wireless channels. In this paper, we propose a comprehensive framework for improving NTSCC that achieves higher system coding gain, better model versatility, and a more flexible adaptation strategy aligned with semantic guidance. The resulting NTSCC model can support large-size data interaction in emerging XR, which catalyzes the application of semantic communications. Specifically, we propose three improvements. First, we introduce a contextual entropy model to better capture the spatial correlations among the semantic latent features, and on this basis develop more accurate rate allocation and contextual joint source-channel coding to enable higher coding gain. Second, we propose response network architectures to formulate a versatile NTSCC, i.e., a once-trained model that supports various rates and channel states, which benefits practical deployment. Third, we propose an online latent feature editing method to enable more flexible coding rate control aligned with specific semantic guidance. Applying these three improvements together yields a deployment-friendly semantic coded transmission system. Our improved NTSCC system has been experimentally verified to achieve 16.35% channel bandwidth saving versus the state-of-the-art engineered VTM + 5G LDPC coded transmission system, with lower processing latency.
Semantic communication is a novel paradigm that has attracted broad interest from researchers. One critical aspect of it is multi-user semantic communication theory, which can further promote its application in practical network environments. While most existing works focus on the design of end-to-end single-user semantic transmission, this paper proposes a novel non-orthogonal multiple access (NOMA)-based multi-user semantic communication system named NOMASC. The proposed system supports semantic transmission for multiple users with diverse modalities of source information. To avoid high hardware demands, an asymmetric quantizer is employed at the end of the semantic encoder to discretize the continuous full-resolution semantic feature. In addition, a neural network model is proposed for mapping the discrete feature into self-learned symbols and accomplishing intelligent multi-user detection (MUD) at the receiver. Simulation results demonstrate that the proposed system performs well in non-orthogonal transmission of multiple user signals and outperforms the other methods, especially at low-to-medium SNRs. Moreover, it is highly robust under various simulation settings and mismatched test scenarios.
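For context on what non-orthogonal transmission and multi-user detection involve, the sketch below shows classical power-domain NOMA with two BPSK users and a successive-interference-cancellation detector. This is a textbook baseline, not the learned symbol mapping or neural MUD of NOMASC; the power split, the fixed noise samples, and the function names are all assumptions made for illustration.

```python
import math


def superpose(strong, weak, p_strong=0.8, p_weak=0.2):
    """Add two BPSK streams on the same resource with unequal power."""
    return [
        math.sqrt(p_strong) * a + math.sqrt(p_weak) * b
        for a, b in zip(strong, weak)
    ]


def sign(x):
    """Hard BPSK decision."""
    return 1 if x >= 0 else -1


def sic_detect(received, p_strong=0.8):
    """Detect the high-power stream, subtract it, then detect the other stream."""
    strong_hat, weak_hat = [], []
    for y in received:
        a = sign(y)                            # high-power stream dominates the sum
        residual = y - math.sqrt(p_strong) * a  # cancel the detected stream
        strong_hat.append(a)
        weak_hat.append(sign(residual))
    return strong_hat, weak_hat


strong_bits = [1, -1, 1, -1]        # hypothetical symbols of the high-power user
weak_bits = [1, 1, -1, -1]          # hypothetical symbols of the low-power user
noise = [0.1, -0.2, 0.15, -0.05]    # fixed illustrative noise samples

received = [x + n for x, n in zip(superpose(strong_bits, weak_bits), noise)]
detected_strong, detected_weak = sic_detect(received)
print(detected_strong, detected_weak)
```

A hand-designed detector like this depends on a known power split and can propagate errors from the first decision to the second, which is one motivation for replacing it with a learned detector.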
With the rapid development of smart cities, high-level autonomous driving, intelligent manufacturing, and similar applications, the stringent industrial-level requirements for extremely low latency and high reliability in communication, together with new trends toward sub-centimeter sensing, have exceeded the abilities of 5G and call for the development of 6G. Based on an analysis of the functional design of communication, sensing, and emerging intelligent computation systems, we propose the joint communication, sensing and computation (JCSC) framework for the 6G intelligent machine-type communication (IMTC) network to realize low-latency and highly reliable communication, highly accurate sensing, and fast environment adaptation. In the proposed JCSC framework, the communication, sensing, and computation abilities cooperate to benefit each other through unified hardware, resource, and protocol design. Sensing information is exploited as prior information to enhance the reliability and latency performance of wireless communication and to optimize the resource utilization of the communication network, which in turn improves the distributed computation and cooperative sensing abilities. We present promising enabling technologies, such as the joint communication and sensing (JCS) technique, JCSC wireless networking techniques, and intelligent computation techniques. We also summarize the challenges in realizing the JCSC framework. We then introduce intelligent flexible manufacturing as a typical use case of IMTC with the JCSC framework, in which the enabling technologies are deployed. Finally, we present simulation results that demonstrate the feasibility of the JCSC framework by evaluating the JCS waveform and the JCSC-enabled neighbor discovery (ND) and medium access control (MAC).
As machine learning (ML) algorithms are increasingly used in high-stakes applications, concerns have arisen that they may be biased against certain social groups. Although many approaches have been proposed to make ML models fair, they typically rely on the assumption that the data distributions in training and deployment are identical. Unfortunately, this assumption is commonly violated in practice, and a model that is fair during training may lead to unexpected outcomes during deployment. Although the problem of designing robust ML models under dataset shifts has been widely studied, most existing works focus only on the transfer of accuracy. In this paper, we study the transfer of both fairness and accuracy under domain generalization, where the data at test time may be sampled from never-before-seen domains. We first develop theoretical bounds on the unfairness and expected loss at deployment, and then derive sufficient conditions under which fairness and accuracy can be perfectly transferred via invariant representation learning. Guided by this, we design a learning algorithm such that fair ML models learned from training data retain high fairness and accuracy when deployment environments change. Experiments on real-world data validate the proposed algorithm. Model implementation is available at https://github.com/pth1993/FATDM.
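To make the notion of fairness failing to transfer concrete, the sketch below measures accuracy and a demographic parity gap for the same hypothetical classifier on a training domain and on a shifted deployment domain. The abstract does not state which fairness notion the paper bounds, so demographic parity is an assumption here, and all predictions, labels, and group memberships are invented toy data.

```python
def positive_rate(preds, groups, group):
    """Share of positive predictions within one group."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    return sum(selected) / len(selected)


def demographic_parity_gap(preds, groups):
    """Absolute difference in positive prediction rate between groups 0 and 1."""
    return abs(positive_rate(preds, groups, 0) - positive_rate(preds, groups, 1))


def accuracy(preds, labels):
    """Share of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


groups = [0, 0, 0, 0, 1, 1, 1, 1]   # hypothetical sensitive attribute

# Toy training domain: equal positive rates for both groups.
train_preds = [1, 0, 1, 0, 1, 0, 1, 0]
train_labels = [1, 0, 1, 0, 1, 0, 1, 1]

# Toy deployment domain: the same model now favours group 0.
deploy_preds = [1, 1, 1, 0, 1, 0, 0, 0]
deploy_labels = [1, 1, 0, 0, 1, 0, 1, 0]

train_gap = demographic_parity_gap(train_preds, groups)
deploy_gap = demographic_parity_gap(deploy_preds, groups)
print(train_gap, accuracy(train_preds, train_labels))
print(deploy_gap, accuracy(deploy_preds, deploy_labels))
```

In this toy data the gap grows from the training domain to the deployment domain while accuracy changes only slightly, which is why tracking accuracy alone can hide a fairness failure.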
It has become a consensus that autonomous vehicles (AVs) will first be widely deployed on highways. However, the complexity of highway interchanges is a bottleneck for deploying AVs. An AV should be sufficiently tested under different highway interchanges, which remains challenging due to the lack of available datasets containing diverse highway interchanges. In this paper, we propose a model-driven method, FLYOVER, to generate a dataset consisting of diverse interchanges with measurable diversity coverage. First, FLYOVER uses a labeled digraph to model the topology of an interchange. Second, FLYOVER takes real-world interchanges as input to guarantee topology practicality and extracts different topology equivalence classes by classifying the corresponding topology models. Third, for each topology class, FLYOVER identifies the corresponding geometrical features of the ramps and generates concrete interchanges using k-way combinatorial coverage and differential evolution. To illustrate the diversity and applicability of the generated interchange dataset, we test the built-in traffic flow control algorithm in SUMO and the fuel-optimization trajectory tracking algorithm deployed on Alibaba's autonomous trucks on the dataset. The results show that, in addition to their geometrical differences, the interchanges are diverse in throughput and fuel consumption under the traffic flow control and trajectory tracking algorithms, respectively.
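The k-way combinatorial coverage mentioned in the preceding abstract can be measured as the fraction of all k-parameter value combinations that appear in at least one generated sample. The sketch below computes that fraction; the ramp feature names and their discrete values are hypothetical, and this is not FLYOVER's own generator, which also involves differential evolution.

```python
from itertools import combinations, product


def k_way_coverage(parameters, samples, k=2):
    """Fraction of all k-way value combinations present in the samples."""
    names = sorted(parameters)
    total = 0
    covered = 0
    for subset in combinations(names, k):
        # Every value combination this group of k parameters could take.
        required = set(product(*(parameters[n] for n in subset)))
        # The combinations that the samples actually contain.
        seen = {tuple(sample[n] for n in subset) for sample in samples}
        total += len(required)
        covered += len(required & seen)
    return covered / total


# Hypothetical discretised geometrical features of a ramp.
parameters = {
    "radius": ["small", "large"],
    "slope": ["flat", "steep"],
    "lanes": [1, 2],
}

samples = [
    {"radius": "small", "slope": "flat", "lanes": 1},
    {"radius": "small", "slope": "steep", "lanes": 2},
    {"radius": "large", "slope": "flat", "lanes": 2},
    {"radius": "large", "slope": "steep", "lanes": 1},
]

pairwise_full = k_way_coverage(parameters, samples, k=2)
pairwise_partial = k_way_coverage(parameters, samples[:2], k=2)
three_way = k_way_coverage(parameters, samples, k=3)
print(pairwise_full, pairwise_partial, three_way)
```

Four samples are enough to cover every pair of values here even though they cover only half of the eight full combinations, which is the usual reason for targeting k-way coverage with a small k.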
We propose a novel neural waveform compression method to catalyze emerging speech semantic communications. By introducing nonlinear transforms and variational modeling, we effectively capture the dependencies within speech frames and estimate the probability distribution of the speech features more accurately, giving rise to better compression performance. In particular, the speech signals are analyzed and synthesized by a pair of nonlinear transforms, yielding latent features. An entropy model with a hyperprior is built to capture the probability distribution of the latent features, followed by quantization and entropy coding. The proposed waveform codec can be optimized flexibly for arbitrary rates, and it can also be easily optimized for any differentiable loss function, including the perceptual loss used in semantic communications. To further improve fidelity, we incorporate residual coding to mitigate the degradation arising from quantization distortion in the latent space. Results indicate that, at the same performance, the proposed method saves up to 27% of the coding rate compared with the widely used adaptive multi-rate wideband (AMR-WB) codec as well as emerging neural waveform coding methods.
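The role of the entropy model can be sketched with a small rate estimate: each latent is rounded to an integer, and its cost in bits is the negative log-probability that the model assigns to that integer. The sketch below assumes a Gaussian model with a per-latent mean and scale, a common choice in learned compression that the abstract itself does not specify; the latent values and parameters are invented.

```python
import math


def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def estimated_bits(latents, means, scales):
    """Sum of -log2 P(q) over latents rounded to the nearest integer q."""
    bits = 0.0
    for y, mu, sigma in zip(latents, means, scales):
        q = round(y)
        # Probability mass of the unit-width bin centred on q.
        p = gaussian_cdf(q + 0.5, mu, sigma) - gaussian_cdf(q - 0.5, mu, sigma)
        bits += -math.log2(max(p, 1e-12))  # floor avoids log2(0)
    return bits


latents = [0.2, -1.1, 3.4]          # hypothetical latent features

# Parameters that a hyperprior might predict, close to the true latents.
informed = estimated_bits(latents, means=[0.0, -1.0, 3.0], scales=[1.0, 1.0, 1.0])

# A fixed zero-mean model that ignores the signal.
uninformed = estimated_bits(latents, means=[0.0, 0.0, 0.0], scales=[1.0, 1.0, 1.0])

print(informed, uninformed)
```

The better the predicted distribution matches the quantized latents, the fewer bits the entropy coder needs, so a more accurate probability estimate translates directly into a lower rate.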
The Internet of Vehicles (IoV) is expected to become the central infrastructure for providing advanced services to connected vehicles and users, for higher transportation efficiency and security. A variety of emerging applications and services bring explosively growing demands for mobile data traffic between connected vehicles and roadside units (RSUs), imposing a significant challenge of spectrum scarcity on the IoV. In this paper, we propose a cooperative semantic-aware architecture that conveys essential semantics from collaborating users to servers in order to lower the data traffic. In contrast to current solutions, which mainly pile up highly complex signal processing techniques and multiple access capabilities within syntactic communications, this paper puts forth the idea of semantic-aware content delivery in the IoV. Specifically, the successful transmission of the essential semantics of the source data is pursued, rather than the accurate reception of symbols regardless of their meaning, as in conventional syntactic communications. To assess the benefits of the proposed architecture, we provide a case study of the image retrieval task for vehicles in intelligent transportation systems. Simulation results demonstrate that the proposed architecture outperforms the existing solutions while using fewer radio resources, especially in the low signal-to-noise-ratio (SNR) regime, which sheds light on its potential for extending applications to extreme environments.