In target speaker extraction, many studies rely on the speaker embedding which is obtained from an enrollment of the target speaker and employed as the guidance. However, solely using speaker embedding may not fully utilize the contextual information contained in the enrollment. In this paper, we directly exploit this contextual information in the time-frequency (T-F) domain. Specifically, the T-F representations of the enrollment and the mixed signal are interacted to compute the weighting matrices through an attention mechanism. These weighting matrices reflect the similarity among different frames of the T-F representations and are further employed to obtain the consistent T-F representations of the enrollment. These consistent representations are served as the guidance, allowing for better exploitation of the contextual information. Furthermore, the proposed method achieves the state-of-the-art performance on the benchmark dataset and shows its effectiveness in the complex scenarios.
Parallel Quantum Annealing is a technique to solve multiple optimization problems simultaneously. Parallel quantum annealing aims to optimize the utilization of available qubits on a quantum topology by addressing multiple independent problems in a single annealing cycle. This study provides insights into the potential and the limitations of this parallelization method. The experiments consisting of two different problems are integrated, and various problem dimensions are explored including normalization techniques using specific methods such as DWaveSampler with Default Embedding, DWaveSampler with Custom Embedding and LeapHybridSampler. This method minimizes idle qubits and holds promise for substantial speed-up, as indicated by the Time-to-Solution (TTS) metric, compared to traditional quantum annealing, which solves problems sequentially and may leave qubits unutilized.
Background: Cardiovascular magnetic resonance imaging (CMR) is a well-established imaging tool for diagnosing and managing cardiac conditions. The integration of exercise stress with CMR (ExCMR) can enhance its diagnostic capacity. Despite recent advances in CMR technology, ExCMR remains technically challenging due to motion artifacts and limited spatial and temporal resolution. Methods: This study investigates the feasibility of biventricular functional and hemodynamic assessment using real-time (RT) ExCMR during a staged exercise protocol in 26 healthy volunteers. We introduce a coil reweighting technique to minimize motion artifacts. In addition, we identify and analyze heartbeats from the end-expiratory phase to enhance the repeatability of cardiac function quantification. To demonstrate clinical feasibility, qualitative results from five patients are also presented. Results: Our findings indicate a consistent decrease in end-systolic volume (ESV) and stable end-diastolic volume (EDV) across exercise intensities, leading to increased stroke volume (SV) and ejection fraction (EF). Coil reweighting effectively reduces motion artifacts, improving image quality in both healthy volunteers and patients. The repeatability of cardiac function parameters, demonstrated by scan-rescan tests in nine volunteers, improves with the selection of end-expiratory beats. Conclusions: The study demonstrates that RT ExCMR with in-magnet exercise is a feasible and effective method for dynamic cardiac function monitoring during exercise. The proposed coil reweighting technique and selection of end-expiratory beats significantly enhance image quality and repeatability.
Deep Learning is advancing medical imaging Research and Development (R&D), leading to the frequent clinical use of Artificial Intelligence/Machine Learning (AI/ML)-based medical devices. However, to advance AI R&D, two challenges arise: 1) significant data imbalance, with most data from Europe/America and under 10% from Asia, despite its 60% global population share; and 2) hefty time and investment needed to curate proprietary datasets for commercial use. In response, we established the first commercial medical imaging platform, encompassing steps like: 1) data collection, 2) data selection, 3) annotation, and 4) pre-processing. Moreover, we focus on harnessing under-represented data from Japan and broader Asia, including Computed Tomography, Magnetic Resonance Imaging, and Whole Slide Imaging scans. Using the collected data, we are preparing/providing ready-to-use datasets for medical AI R&D by 1) offering these datasets to AI firms, biopharma, and medical device makers and 2) using them as training/test data to develop tailored AI solutions for such entities. We also aim to merge Blockchain for data security and plan to synthesize rare disease data via generative AI. DataHub Website: https://medical-datahub.ai/
Textureless object recognition has become a significant task in Computer Vision with the advent of Robotics and its applications in manufacturing sector. It has been challenging to obtain good accuracy in real time because of its lack of discriminative features and reflectance properties which makes the techniques for textured object recognition insufficient for textureless objects. A lot of work has been done in the last 20 years, especially in the recent 5 years after the TLess and other textureless dataset were introduced. In this project, by applying image processing techniques we created a robust augmented dataset from initial imbalanced smaller dataset. We extracted edge features, feature combinations and RGB images enhanced with feature/feature combinations to create 15 datasets, each with a size of ~340,000. We then trained four classifiers on these 15 datasets to arrive at a conclusion as to which dataset performs the best overall and whether edge features are important for textureless objects. Based on our experiments and analysis, RGB images enhanced with combination of 3 edge features performed the best compared to all others. Model performance on dataset with HED edges performed comparatively better than other edge detectors like Canny or Prewitt.
This paper aims to get a comprehensive review of current-day robotic computation technologies at VLSI architecture level. We studied several repots in the domain of robotic processor architecture. In this work, we focused on the forward kinematics architectures which consider CORDIC algorithms, VLSI circuits of WE DSP16 chip, parallel processing and pipelined architecture, and lookup table formula and FPGA processor. This study gives us an understanding of different implementation methods for forward kinematics. Our goal is to develop a forward kinematics processor with FPGA for real-time applications, requires a fast response time and low latency of these devices, useful for industrial automation where the processing speed plays a great role.
Abrupt maneuvers by surrounding vehicles (SVs) can typically lead to safety concerns and affect the task efficiency of the ego vehicle (EV), especially with model uncertainties stemming from environmental disturbances. This paper presents a real-time fail-operational controller that ensures the asymptotic convergence of an uncertain EV to a safe state, while preserving task efficiency in dynamic environments. An incremental Bayesian learning approach is developed to facilitate online learning and inference of changing environmental disturbances. Leveraging disturbance quantification and constraint transformation, we develop a stochastic fail-operational barrier based on the control barrier function (CBF). With this development, the uncertain EV is able to converge asymptotically from an unsafe state to a defined safe state with probabilistic stability. Subsequently, the stochastic fail-operational barrier is integrated into an efficient fail-operational controller based on quadratic programming (QP). This controller is tailored for the EV operating under control constraints in the presence of environmental disturbances, with both safety and efficiency objectives taken into consideration. We validate the proposed framework in connected cruise control (CCC) tasks, where SVs perform aggressive driving maneuvers. The simulation results demonstrate that our method empowers the EV to swiftly return to a safe state while upholding task efficiency in real time, even under time-varying environmental disturbances.
The development of acoustic simulation workflows in the time-domain description is essential for predicting the sound of aeroacoustic or other transient acoustic effects. A common practice for noise mitigation is using absorbers. The modeling of these acoustic absorbers is typically provided in the frequency domain. Several, methods established bridging this gap, investigating methods to model absorber in the time domain. Therefore, this short article, describes the analytic solution in time-domain for benchmarking absorber simulations with infinite 1D, 2D, and 3D domains. Connected to the analytic solution, a Matlab script is provided to easily obtain the reference solution. The reference codes are provided as benchmark solution in the EAA TCCA Benchmarking database as METAMAT 01.
Deep reinforcement learning excels in numerous large-scale practical applications. However, existing performance analyses ignores the unique characteristics of continuous-time control problems, is unable to directly estimate the generalization error of the Bellman optimal loss and require a boundedness assumption. Our work focuses on continuous-time control problems and proposes a method that is applicable to all such problems where the transition function satisfies semi-group and Lipschitz properties. Under this method, we can directly analyze the \emph{a priori} generalization error of the Bellman optimal loss. The core of this method lies in two transformations of the loss function. To complete the transformation, we propose a decomposition method for the maximum operator. Additionally, this analysis method does not require a boundedness assumption. Finally, we obtain an \emph{a priori} generalization error without the curse of dimensionality.
Generative probabilistic forecasting produces future time series samples according to the conditional probability distribution given past time series observations. Such techniques are essential in risk-based decision-making and planning under uncertainty with broad applications in grid operations, including electricity price forecasting, risk-based economic dispatch, and stochastic optimizations. Inspired by Wiener and Kallianpur's innovation representation, we propose a weak innovation autoencoder architecture and a learning algorithm to extract independent and identically distributed innovation sequences from nonparametric stationary time series. We show that the weak innovation sequence is Bayesian sufficient, which makes the proposed weak innovation autoencoder a canonical architecture for generative probabilistic forecasting. The proposed technique is applied to forecasting highly volatile real-time electricity prices, demonstrating superior performance across multiple forecasting measures over leading probabilistic and point forecasting techniques.