Nonuniform rotational distortion (NURD) correction is vital for endoscopic optical coherence tomography (OCT) imaging and its functional extensions, such as angiography and elastography. Current NURD correction methods require time-consuming feature tracking or cross-correlation calculations and thus sacrifice temporal resolution. Here we propose a cross-attention learning method for the NURD correction in OCT. Our method is inspired by the recent success of the self-attention mechanism in natural language processing and computer vision. By leveraging its ability to model long-range dependencies, we can directly obtain the correlation between OCT A-lines at any distance, thus accelerating the NURD correction. We develop an end-to-end stacked cross-attention network and design three types of optimization constraints. We compare our method with two traditional feature-based methods and a CNN-based method, on two publicly-available endoscopic OCT datasets and a private dataset collected on our home-built endoscopic OCT system. Our method achieved a $\sim3\times$ speedup to real time ($26\pm 3$ fps), and superior correction performance.
Deep learning has been successfully applied to OCT segmentation. However, for data from different manufacturers and imaging protocols, and for different regions of interest (ROIs), it requires laborious and time-consuming data annotation and training, which is undesirable in many scenarios, such as surgical navigation and multi-center clinical trials. Here we propose an annotation-efficient learning method for OCT segmentation that could significantly reduce annotation costs. Leveraging self-supervised generative learning, we train a Transformer-based model to learn the OCT imagery. Then we connect the trained Transformer-based encoder to a CNN-based decoder, to learn the dense pixel-wise prediction in OCT segmentation. These training phases use open-access data and thus incur no annotation costs, and the pre-trained model can be adapted to different data and ROIs without re-training. Based on the greedy approximation for the k-center problem, we also introduce an algorithm for the selective annotation of the target data. We verified our method on publicly-available and private OCT datasets. Compared to the widely-used U-Net model with 100% training data, our method only requires ~10% of the data for achieving the same segmentation accuracy, and it speeds the training up to ~3.5 times. Furthermore, our proposed method outperforms other potential strategies that could improve annotation efficiency. We think this emphasis on learning efficiency may help improve the intelligence and application penetration of OCT-based technologies. Our code and pre-trained model are publicly available at https://github.com/SJTU-Intelligent-Optics-Lab/Annotation-efficient-learning-for-OCT-segmentation.