Abstract:We present EBench, a simulation benchmark that diagnoses generalist mobile manipulation policies beyond a single success-rate scalar. EBench comprises 26 diverse and challenging manipulation tasks annotated along 5 capability dimensions and 4 generalization dimensions. We evaluate state-of-the-art generalist manipulation models including $π_0$, $π_{0.5}$, XVLA, and InternVLA-A1, and reveal that models with near success rates exhibit strikingly different capability profiles: $π_{0.5}$ achieves the highest test success rate and the best train--test retention, whereas InternVLA-A1 dominates mobile manipulation but collapses on dexterous tasks, and XVLA exhibits strengths on a disjoint set of atomic skills compared to other policies. Beyond capability profiling, EBench analyzes the generalization ability from 4 representative perspectives, identifying the impact of different distribution shift factors. The results reveal strengths and weaknesses of models behind an overall score. We hope this benchmark offers a broad set of diagnostic signals to guide iteration on generalist manipulation models.
Abstract:The detection of underwater targets is severely affected by the non-uniform spatial characteristics of marine environmental noise. Additionally, the presence of both natural and anthropogenic acoustic sources, including shipping traffic, marine life, and geological activity, further complicates the underwater acoustic landscape. Addressing these challenges requires advanced underwater sensors and robust signal processing techniques. In this paper, we present a novel approach that leverages an optical fiber distributed acoustic sensing (DAS) system combined with a broadband generalized sparse covariance-fitting framework for underwater target direction sensing, particularly focusing on robustness against non-uniform noise. The DAS system incorporates a newly developed spiral-sensitized optical cable, which significantly improves sensitivity compared to conventional submarine cables. This innovative design enables the system to capture acoustic signals with greater precision. Notably, the sensitivity of the spiral-wound sensitized cable is around -145.69 dB re: 1 rad / (uPa*m), as measured inside the standing-wave tube. Employing simulations, we assess the performance of the algorithm across diverse noise levels and target configurations, consistently revealing higher accuracy and reduced background noise compared to conventional beamforming techniques and other sparse techniques. In a controlled pool experiment, the correlation coefficient between waveforms acquired by the DAS system and a standard hydrophone reached 0.973, indicating high fidelity in signal capture.




Abstract:There has been growing interest in facial video-based remote photoplethysmography (rPPG) measurement recently, with a focus on assessing various vital signs such as heart rate and heart rate variability. Despite previous efforts on static datasets, their approaches have been hindered by inaccurate region of interest (ROI) localization and motion issues, and have shown limited generalization in real-world scenarios. To address these challenges, we propose a novel masked attention regularization (MAR-rPPG) framework that mitigates the impact of ROI localization and complex motion artifacts. Specifically, our approach first integrates a masked attention regularization mechanism into the rPPG field to capture the visual semantic consistency of facial clips, while it also employs a masking technique to prevent the model from overfitting on inaccurate ROIs and subsequently degrading its performance. Furthermore, we propose an enhanced rPPG expert aggregation (EREA) network as the backbone to obtain rPPG signals and attention maps simultaneously. Our EREA network is capable of discriminating divergent attentions from different facial areas and retaining the consistency of spatiotemporal attention maps. For motion robustness, a simple open source detector MediaPipe for data preprocessing is sufficient for our framework due to its superior capability of rPPG signal extraction and attention regularization. Exhaustive experiments on three benchmark datasets (UBFC-rPPG, PURE, and MMPD) substantiate the superiority of our proposed method, outperforming recent state-of-the-art works by a considerable margin.