Abstract:Optical systems have been pivotal for energy-efficient computing, performing high-speed, parallel operations in low-loss carriers. While these predominantly analog optical accelerators bypass digitization to perform parallel floating-point computations, scaling optical hardware to map large-vector sizes for AI tasks remains challenging. Here, we overcome this limitation by unfolding scalar operations in time and introducing a photonic-heater-in-lightpath (PHIL) unit for all-optical temporal integration. Counterintuitively, we exploit a slow heat dissipation process to integrate optical signals modulated at 50 GHz bridging the speed gap between the widely applied thermo-optic effects and ultrafast photonics. This architecture supports optical end-to-end signal processing, eliminates inefficient electro-optical conversions, and enables both linear and nonlinear operations within a unified framework. Our results demonstrate a scalable path towards high-speed photonic computing through thermally driven integration.
Abstract:Image restoration represents a fundamental challenge in low-level vision, focusing on reconstructing high-quality images from their degraded counterparts. With the rapid advancement of deep learning technologies, transformer-based methods with pyramid structures have advanced the field by capturing long-range cross-scale spatial interaction. Despite its popularity, the degradation of essential features during the upsampling process notably compromised the restoration performance, resulting in suboptimal reconstruction outcomes. We introduce the EchoIR, an UNet-like image restoration network with a bilateral learnable upsampling mechanism to bridge this gap. Specifically, we proposed the Echo-Upsampler that optimizes the upsampling process by learning from the bilateral intermediate features of U-Net, the "Echo", aiming for a more refined restoration by minimizing the degradation during upsampling. In pursuit of modeling a hierarchical model of image restoration and upsampling tasks, we propose the Approximated Sequential Bi-level Optimization (AS-BLO), an advanced bi-level optimization model establishing a relationship between upsampling learning and image restoration tasks. Extensive experiments against the state-of-the-art (SOTA) methods demonstrate the proposed EchoIR surpasses the existing methods, achieving SOTA performance in image restoration tasks.
Abstract:Single image super-resolution (SR) has long posed a challenge in the field of computer vision. While the advent of deep learning has led to the emergence of numerous methods aimed at tackling this persistent issue, the current methodologies still encounter challenges in modeling long sequence information, leading to limitations in effectively capturing the global pixel interactions. To tackle this challenge and achieve superior SR outcomes, we propose the Mamba pixel-wise sequential interaction network (MPSI), aimed at enhancing the establishment of long-range connections of information, particularly focusing on pixel-wise sequential interaction. We propose the Channel-Mamba Block (CMB) to capture comprehensive pixel interaction information by effectively modeling long sequence information. Moreover, in the existing SR methodologies, there persists the issue of the neglect of features extracted by preceding layers, leading to the loss of valuable feature information. While certain existing models strive to preserve these features, they frequently encounter difficulty in establishing connections across all layers. To overcome this limitation, MPSI introduces the Mamba channel recursion module (MCRM), which maximizes the retention of valuable feature information from early layers, thereby facilitating the acquisition of pixel sequence interaction information from multiple-level layers. Through extensive experimentation, we demonstrate that MPSI outperforms existing super-resolution methods in terms of image reconstruction results, attaining state-of-the-art performance.