Abstract:This paper presents the results of the fourth edition of the Monocular Depth Estimation Challenge (MDEC), which focuses on zero-shot generalization to the SYNS-Patches benchmark, a dataset featuring challenging environments in both natural and indoor settings. In this edition, we revised the evaluation protocol to use least-squares alignment with two degrees of freedom to support disparity and affine-invariant predictions. We also revised the baselines and included popular off-the-shelf methods: Depth Anything v2 and Marigold. The challenge received a total of 24 submissions that outperformed the baselines on the test set; 10 of these included a report describing their approach, with most leading methods relying on affine-invariant predictions. The challenge winners improved the 3D F-Score over the previous edition's best result, raising it from 22.58% to 23.05%.
Abstract:More than the adherence to specific traffic regulations, driving culture touches upon a more implicit part - an informal, conventional, collective behavioral pattern followed by drivers - that varies across countries, regions, and even cities. Such cultural divergence has become one of the biggest challenges in deploying autonomous vehicles (AVs) across diverse regions today. The current emergence of data-driven methods has shown a potential solution to enable culture-compatible driving through learning from data, but what if some underdeveloped regions cannot provide sufficient local data to inform driving culture? This issue is particularly significant for a broader global AV market. Here, we propose a cross-cultural deployment scheme for AVs, called data-light inverse reinforcement learning, designed to re-calibrate culture-specific AVs and assimilate them into other cultures. First, we report the divergence in driving cultures through a comprehensive comparative analysis of naturalistic driving datasets on highways from three countries: Germany, China, and the USA. Then, we demonstrate the effectiveness of our scheme by testing the expeditious cross-cultural deployment across these three countries, with cumulative testing mileage of over 56084 km. The performance is particularly advantageous when cross-cultural deployment is carried out without affluent local data. Results show that we can reduce the dependence on local data by a margin of 98.67% at best. This study is expected to bring a broader, fairer AV global market, particularly in those regions that lack enough local data to develop culture-compatible AVs.
Abstract:Accurate spatiotemporal calibration is a prerequisite for multisensor fusion. However, sensors are typically asynchronous, and there is no overlap between the fields of view of cameras and LiDARs, posing challenges for intrinsic and extrinsic parameter calibration. To address this, we propose a calibration pipeline based on continuous-time and bundle adjustment (BA) capable of simultaneous intrinsic and extrinsic calibration (6 DOF transformation and time offset). We do not require overlapping fields of view or any calibration board. Firstly, we establish data associations between cameras using Structure from Motion (SFM) and perform self-calibration of camera intrinsics. Then, we establish data associations between LiDARs through adaptive voxel map construction, optimizing for extrinsic calibration within the map. Finally, by matching features between the intensity projection of LiDAR maps and camera images, we conduct joint optimization for intrinsic and extrinsic parameters. This pipeline functions in texture-rich structured environments, allowing simultaneous calibration of any number of cameras and LiDARs without the need for intricate sensor synchronization triggers. Experimental results demonstrate our method's ability to fulfill co-visibility and motion constraints between sensors without accumulating errors.
Abstract:Parallel batch processing machines have extensive applications in the semiconductor manufacturing process. However, the problem models in previous studies regard parallel batch processing as a fixed processing stage in the machining process. This study generalizes the problem model, in which users can arbitrarily set certain stages as parallel batch processing stages according to their needs. A Hybrid Flow Shop Scheduling Problem with Parallel Batch Processing Machines (PBHFSP) is solved in this paper. Furthermore, an Adaptive Knowledge-based Multi-Objective Evolutionary Algorithm (AMOEA/D) is designed to simultaneously optimize both makespan and Total Energy Consumption (TEC). Firstly, a hybrid initialization strategy with heuristic rules based on knowledge of PBHFSP is proposed to generate promising solutions. Secondly, the disjunctive graph model has been established based on the knowledge to find the critical-path of PBHFS. Then, a critical-path based neighborhood search is proposed to enhance the exploitation ability of AMOEA/D. Moreover, the search time is adaptively adjusted based on learning experience from Q-learning and Decay Law. Afterward, to enhance the exploration capability of the algorithm, AMOEA/D designs an improved population updating strategy with a weight vector updating strategy. These strategies rematch individuals with weight vectors, thereby maintaining the diversity of the population. Finally, the proposed algorithm is compared with state-of-the-art algorithms. The experimental results show that the AMOEA/D is superior to the comparison algorithms in solving the PBHFSP.
Abstract:In a flexible job shop environment, using Automated Guided Vehicles (AGVs) to transport jobs and process materials is an important way to promote the intelligence of the workshop. Compared with single-load AGVs, multi-load AGVs can improve AGV utilization, reduce path conflicts, etc. Therefore, this study proposes a history-guided regional partitioning algorithm (HRPEO) for the flexible job shop scheduling problem with limited multi-load AGVs (FJSPMA). First, the encoding and decoding rules are designed according to the characteristics of multi-load AGVs, and then the initialization rule based on the branch and bound method is used to generate the initial population. Second, to prevent the algorithm from falling into a local optimum, the algorithm adopts a regional partitioning strategy. This strategy divides the solution space into multiple regions and measures the potential of the regions. After that, cluster the regions into multiple clusters in each iteration, and selects individuals for evolutionary search based on the set of clusters. Third, a local search strategy is designed to improve the exploitation ability of the algorithm, which uses a greedy approach to optimize machines selection and transportation sequence according to the characteristics of FJSPMA. Finally, a large number of experiments are carried out on the benchmarks to test the performance of the algorithm. Compared with multiple advanced algorithms, the results show that the HRPEO has a better advantage in solving FJSPMA.
Abstract:Coherent technology inherent with more availabledegrees of freedom is deemed a competitive solution for nextgeneration ultra-high-speed short-reach optical interconnects.However, the fatal barriers to implementing the conventiona.coherent system in short-reach optical interconnect are the costfootprint, and power consumption. Self-homodyne coherentsystem exhibits its potential to reduce the power consumption ofthe receiver-side digital signal processing (Rx-DSP) by deliveringthe local oscillator (LO) from the transmitter. However, anautomatic polarization controller (APC) is inevitable in the remoteLO link to avoid polarization fading, resulting in additional costsTo address the polarization fading issue, a simplified self.homodyne coherent system is proposed enabled by Alamouticoding in this paper. Benefiting from the Alamouti coding betweentwo polarizations, a polarization-insensitive receiver onlyincluding a 3dB coupler, a 90o Hybrid, and two balancedphotodiodes (BPDs)is sufficient for reception. Meanwhile, theAPC in the LO link is needless, simplifying the receiver structuresignificantly. Besides, the digital subcarrier multiplexing (DSCM)technique is also adopted to relax the computational complexity ofthe chromatic dispersion compensation (CDC), which is one of thedominant power consumption modules in Rx-DSP. Thetransmission performance of 50Gbaud 4-subcarrier 16/32OAM(4SC-16/320AM) DSCM signal based on the proposed simplifiedself-homodyne coherent system is investigated experimentallyThe results show that the bit-error-ratio(BER) performancedegradation caused by CD can be solved by increasing 4 taps inthe equalizer for 80km single mode fiber(SMF)transmissionwithout individual CDC, which operates in a low-complexitymanner.
Abstract:Distributed acoustic sensing (DAS) is a novel enabling technology that can turn existing fibre optic networks to distributed acoustic sensors. However, it faces the challenges of transmitting, storing, and processing massive streams of data which are orders of magnitude larger than that collected from point sensors. The gap between intensive data generated by DAS and modern computing system with limited reading/writing speed and storage capacity imposes restrictions on many applications. Compressive sensing (CS) is a revolutionary signal acquisition method that allows a signal to be acquired and reconstructed with significantly fewer samples than that required by Nyquist-Shannon theorem. Though the data size is greatly reduced in the sampling stage, the reconstruction of the compressed data is however time and computation consuming. To address this challenge, we propose to map the feature extractor from Nyquist-domain to compressed-domain and therefore vibration detection and classification can be directly implemented in compressed-domain. The measured results show that our framework can be used to reduce the transmitted data size by 70% while achieves 99.4% true positive rate (TPR) and 0.04% false positive rate (TPR) along 5 km sensing fibre and 95.05% classification accuracy on a 5-class classification task.
Abstract:Developing autonomous vehicles (AVs) helps improve the road safety and traffic efficiency of intelligent transportation systems (ITS). Accurately predicting the trajectories of traffic participants is essential to the decision-making and motion planning of AVs in interactive scenarios. Recently, learning-based trajectory predictors have shown state-of-the-art performance in highway or urban areas. However, most existing learning-based models trained with fixed datasets may perform poorly in continuously changing scenarios. Specifically, they may not perform well in learned scenarios after learning the new one. This phenomenon is called "catastrophic forgetting". Few studies investigate trajectory predictions in continuous scenarios, where catastrophic forgetting may happen. To handle this problem, first, a novel continual learning (CL) approach for vehicle trajectory prediction is proposed in this paper. Then, inspired by brain science, a dynamic memory mechanism is developed by utilizing the measurement of traffic divergence between scenarios, which balances the performance and training efficiency of the proposed CL approach. Finally, datasets collected from different locations are used to design continual training and testing methods in experiments. Experimental results show that the proposed approach achieves consistently high prediction accuracy in continuous scenarios without re-training, which mitigates catastrophic forgetting compared to non-CL approaches. The implementation of the proposed approach is publicly available at https://github.com/BIT-Jack/D-GSM
Abstract:Trajectory prediction is a fundamental problem and challenge for autonomous vehicles. Early works mainly focused on designing complicated architectures for deep-learning-based prediction models in normal-illumination environments, which fail in dealing with low-light conditions. This paper proposes a novel approach for trajectory prediction in low-illumination scenarios by leveraging multi-stream information fusion, which flexibly integrates image, optical flow, and object trajectory information. The image channel employs Convolutional Neural Network (CNN) and Long Short-term Memory (LSTM) networks to extract temporal information from the camera. The optical flow channel is applied to capture the pattern of relative motion between adjacent camera frames and modelled by Spatial-Temporal Graph Convolutional Network (ST-GCN). The trajectory channel is used to recognize high-level interactions between vehicles. Finally, information from all the three channels is effectively fused in the prediction module to generate future trajectories of surrounding vehicles in low-illumination conditions. The proposed multi-channel graph convolutional approach is validated on HEV-I and newly generated Dark-HEV-I, egocentric vision datasets that primarily focus on urban intersection scenarios. The results demonstrate that our method outperforms the baselines, in standard and low-illumination scenarios. Additionally, our approach is generic and applicable to scenarios with different types of perception data. The source code of the proposed approach is available at https://github.com/TommyGong08/MSIF}{https://github.com/TommyGong08/MSIF.
Abstract:In urban environments, the complex and uncertain intersection scenarios are challenging for autonomous driving. To ensure safety, it is crucial to develop an adaptive decision making system that can handle the interaction with other vehicles. Manually designed model-based methods are reliable in common scenarios. But in uncertain environments, they are not reliable, so learning-based methods are proposed, especially reinforcement learning (RL) methods. However, current RL methods need retraining when the scenarios change. In other words, current RL methods cannot reuse accumulated knowledge. They forget learned knowledge when new scenarios are given. To solve this problem, we propose a hierarchical framework that can autonomously accumulate and reuse knowledge. The proposed method combines the idea of motion primitives (MPs) with hierarchical reinforcement learning (HRL). It decomposes complex problems into multiple basic subtasks to reduce the difficulty. The proposed method and other baseline methods are tested in a challenging intersection scenario based on the CARLA simulator. The intersection scenario contains three different subtasks that can reflect the complexity and uncertainty of real traffic flow. After offline learning and testing, the proposed method is proved to have the best performance among all methods.