"Time": models, code, and papers

Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3

Jun 26, 2023
Minseok Kim, Jun Hyung Lee, Soonyoung Jung

In this report, we present our award-winning solutions for the Music Demixing Track of Sound Demixing Challenge 2023. First, we propose TFC-TDF-UNet v3, a time-efficient music source separation model that achieves state-of-the-art results on the MUSDB benchmark. We then give full details regarding our solutions for each Leaderboard, including a loss masking approach for noise-robust training. Code for reproducing model training and final submissions is available at github.com/kuielab/sdx23.
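The abstract does not spell out how the loss masking works. As a minimal sketch of one common form of the idea, assuming that the highest-loss samples in a batch are the ones most likely to carry corrupted labels, the following averages only the smallest per-sample losses. The function name and the `keep_fraction` parameter are hypothetical and are not taken from the authors' code.

```python
def masked_mean_loss(per_sample_losses, keep_fraction=0.8):
    """Average only the smallest `keep_fraction` of the per-sample losses.

    Assumption for illustration: samples with the largest loss are treated
    as noisy and are masked out of the batch average.
    """
    if not per_sample_losses:
        raise ValueError("need at least one loss value")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Always keep at least one sample so the mean is defined.
    n_keep = max(1, int(len(per_sample_losses) * keep_fraction))
    kept = sorted(per_sample_losses)[:n_keep]
    return sum(kept) / n_keep


if __name__ == "__main__":
    batch_losses = [0.2, 0.1, 5.0, 0.3, 0.4]  # 5.0 plays the role of a noisy sample
    print(masked_mean_loss(batch_losses, keep_fraction=0.8))
```

With `keep_fraction=1.0` the function reduces to the ordinary batch mean, which makes the effect of the mask easy to compare.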

* 5 pages, 4 tables

QCNeXt: A Next-Generation Framework For Joint Multi-Agent Trajectory Prediction

Jun 18, 2023
Zikang Zhou, Zihao Wen, Jianping Wang, Yung-Hui Li, Yu-Kai Huang

Estimating the joint distribution of on-road agents' future trajectories is essential for autonomous driving. In this technical report, we propose a next-generation framework for joint multi-agent trajectory prediction called QCNeXt. First, we adopt the query-centric encoding paradigm for the task of joint multi-agent trajectory prediction. Powered by this encoding scheme, our scene encoder is equipped with permutation equivariance on the set elements, roto-translation invariance in the space dimension, and translation invariance in the time dimension. These invariance properties not only enable accurate multi-agent forecasting fundamentally but also empower the encoder with the capability of streaming processing. Second, we propose a multi-agent DETR-like decoder, which facilitates joint multi-agent trajectory prediction by modeling agents' interactions at future time steps. For the first time, we show that a joint prediction model can outperform marginal prediction models even on the marginal metrics, which opens up new research opportunities in trajectory prediction. Our approach ranks 1st on the Argoverse 2 multi-agent motion forecasting benchmark, winning the championship of the Argoverse Challenge at the CVPR 2023 Workshop on Autonomous Driving.

* Technical report for the 1st place solution of the Argoverse 2 Multi-Agent Motion Forecasting Competition at the CVPR 2023 Workshop on Autonomous Driving

CT-based Subchondral Bone Microstructural Analysis in Knee Osteoarthritis via MR-Guided Distillation Learning

Jul 11, 2023
Yuqi Hu, Xiangyu Zhao, Gaowei Qing, Kai Xie, Chenglei Liu, Lichi Zhang

Background: MR-based subchondral bone analysis effectively predicts knee osteoarthritis. However, its clinical application is limited by the cost and time of MR. Purpose: We aim to develop a novel distillation-learning-based method named SRRD for subchondral bone microstructural analysis using easily acquired CT images, which leverages paired MR images to enhance the CT-based analysis model during training. Materials and Methods: Knee joint images of both CT and MR modalities were collected from October 2020 to May 2021. First, we developed a GAN-based generative model to transform MR images into CT images, which was used to establish the anatomical correspondence between the two modalities. Next, we obtained numerous patches of subchondral bone regions of MR images, together with their trabecular parameters (BV/TV, Tb.Th, Tb.Sp, Tb.N) from the corresponding CT image patches via regression. The distillation-learning technique was used to train the regression model and transfer MR structural information to the CT-based model. The regressed trabecular parameters were further used for knee osteoarthritis classification. Results: A total of 80 participants were evaluated. CT-based regression results of trabecular parameters achieved intra-class correlation coefficients (ICCs) of 0.804, 0.773, 0.711, and 0.622 for BV/TV, Tb.Th, Tb.Sp, and Tb.N, respectively. The use of distillation learning significantly improved the performance of the CT-based knee osteoarthritis classification method using the CNN approach, yielding an AUC score of 0.767 (95% CI, 0.681-0.853) instead of 0.658 (95% CI, 0.574-0.742) (p<.001). Conclusions: The proposed SRRD method showed high reliability and validity in MR-CT registration, regression, and knee osteoarthritis classification, indicating the feasibility of subchondral bone microstructural analysis based on CT images.

* 5 figures, 4 tables

Laxity-Aware Scalable Reinforcement Learning for HVAC Control

Jun 29, 2023
Ruohong Liu, Yuxin Pan, Yize Chen

Demand flexibility plays a vital role in maintaining grid balance, reducing peak demand, and lowering customers' energy bills. Given their highly shiftable load and significant contribution to a building's energy consumption, Heating, Ventilation, and Air Conditioning (HVAC) systems can provide valuable demand flexibility to power systems by adjusting their energy consumption in response to electricity prices and power system needs. To exploit this flexibility in both operation time and power, it is imperative to accurately model and aggregate the load flexibility of a large population of HVAC systems and to design effective control algorithms. In this paper, we tackle the curse of dimensionality in modeling and control by using the concept of laxity to quantify the urgency of each HVAC operation request. We further propose a two-level approach to address energy optimization for a large population of HVAC systems. The lower level involves an aggregator that aggregates HVAC load laxity information and uses the least-laxity-first (LLF) rule to allocate real-time power to individual HVAC systems based on the controller's total power. Due to the complex and uncertain nature of HVAC systems, we leverage a reinforcement learning (RL)-based controller to schedule the total power based on the aggregated laxity information and electricity price. We evaluate the temperature control and energy cost saving performance of a large-scale group of HVAC systems in both single-zone and multi-zone scenarios, under varying climate and electricity market conditions. The experimental results indicate that the proposed approach outperforms the centralized methods in the majority of test scenarios and performs comparably to the model-based method in some scenarios.
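The lower-level allocation step can be illustrated with a generic least-laxity-first rule. This is a minimal sketch, not the authors' implementation. It assumes each request already carries a laxity value (smaller means more urgent) and a maximum power draw. The tuple layout and the name `llf_allocate` are hypothetical.

```python
def llf_allocate(requests, total_power):
    """Split `total_power` among requests, serving the least laxity first.

    requests: list of (name, laxity, max_power) tuples. How laxity is
    computed for an HVAC unit is outside this sketch and is an assumption.
    Returns a dict mapping each name to the power it is granted.
    """
    remaining = total_power
    allocation = {}
    # Smaller laxity means less slack before the request becomes critical.
    for name, _laxity, max_power in sorted(requests, key=lambda r: r[1]):
        granted = min(max_power, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


if __name__ == "__main__":
    requests = [
        ("zone_a", 3.0, 2.0),
        ("zone_b", 0.5, 3.0),  # most urgent, served first
        ("zone_c", 1.5, 2.0),
    ]
    print(llf_allocate(requests, total_power=4.0))
```

In the two-level scheme described above, `total_power` would be the quantity chosen by the upper-level RL controller, while this rule only decides how to divide it.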

* In Submission

Assessing the Performance of 1D-Convolution Neural Networks to Predict Concentration of Mixture Components from Raman Spectra

Jun 29, 2023
Dexter Antonio, Hannah O'Toole, Randy Carney, Ambarish Kulkarni, Ahmet Palazoglu

An emerging application of Raman spectroscopy is monitoring the state of chemical reactors during biologic drug production. Raman shift intensities scale linearly with the concentrations of chemical species and thus can be used to analytically determine real-time concentrations using non-destructive light irradiation in a label-free manner. Chemometric algorithms are used to interpret Raman spectra produced from complex mixtures of bioreactor contents as a reaction evolves. Finding the optimal algorithm for a specific bioreactor environment is challenging due to the lack of freely available Raman mixture datasets. The RaMix Python package addresses this challenge by enabling the generation of synthetic Raman mixture datasets with controllable noise levels to assess the utility of different chemometric algorithm types for real-time monitoring applications. To demonstrate the capabilities of this package and compare the performance of different chemometric algorithms, 48 datasets of simulated spectra were generated using the RaMix Python package. The four tested algorithms are partial least squares regression (PLS), a simple neural network, a simple convolutional neural network (simple CNN), and a 1D convolutional neural network with a ResNet architecture (ResNet). The performance of the PLS and simple CNN models was found to be comparable, with the PLS algorithm slightly outperforming the other models on 83% of the datasets. The simple CNN model outperforms the other models on large, high-noise datasets, demonstrating the superior capability of convolutional neural networks compared to PLS in analyzing noisy spectra. These results demonstrate the promise of CNNs to automatically extract concentration information from unprocessed, noisy spectra, allowing for better process control of industrial drug production. Code for this project is available at github.com/DexterAntonio/RaMix.
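The linear relation between intensity and concentration can be made concrete with a toy example. The sketch below does not use the RaMix API. It builds a noise-free two-component mixture from made-up pure spectra and recovers the concentrations by ordinary least squares, solved through the 2x2 normal equations. All names and numbers are illustrative assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def mix(spectra, concentrations):
    """Linear mixing model: the mixture is the sum of c_k times spectrum_k."""
    n_points = len(spectra[0])
    return [
        sum(c * s[i] for c, s in zip(concentrations, spectra))
        for i in range(n_points)
    ]


def unmix_two(s1, s2, mixture):
    """Least-squares concentrations of two known components (Cramer's rule)."""
    a11, a12, a22 = dot(s1, s1), dot(s1, s2), dot(s2, s2)
    b1, b2 = dot(s1, mixture), dot(s2, mixture)
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("pure spectra are linearly dependent")
    c1 = (b1 * a22 - a12 * b2) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return c1, c2


if __name__ == "__main__":
    pure_a = [1.0, 0.0, 2.0, 0.0]  # made-up intensities, not real Raman data
    pure_b = [0.0, 3.0, 1.0, 0.0]
    observed = mix([pure_a, pure_b], [0.3, 0.7])
    print(unmix_two(pure_a, pure_b, observed))
```

With noise added to `observed`, the recovered values become estimates rather than exact concentrations, which is the regime the abstract's algorithm comparison addresses.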

* 7 pages, 7 figures

Centralized control for multi-agent RL in a complex Real-Time-Strategy game

Apr 25, 2023
Roger Creus Castanyer

Multi-agent reinforcement learning (MARL) studies the behaviour of multiple learning agents that coexist in a shared environment. MARL is more challenging than single-agent RL because it involves more complex learning dynamics: the observations and rewards of each agent are functions of all other agents. In the context of MARL, Real-Time Strategy (RTS) games represent very challenging environments where multiple players interact simultaneously and control many units of different natures all at once. In fact, RTS games are so challenging for current RL methods that just being able to tackle them with RL is interesting. This project provides the end-to-end experience of applying RL in the Lux AI v2 Kaggle competition, where competitors design agents to control variable-sized fleets of units and tackle a multi-variable optimization, resource gathering, and allocation problem in a 1v1 scenario against other competitors. We use a centralized approach for training the RL agents and report multiple design decisions made along the way. We provide the source code of the project: https://github.com/roger-creus/centralized-control-lux.

Monitoring of Optical Networks Using Correlation-Aided Time-Domain Reflectometry with Direct and Coherent Detection

Jun 06, 2023
Michael H. Eiselt, Florian Azendorf, Andre Sandmann, Florian Spinty, Mirko Lawin

We report on methods to monitor the transmission path in optical networks using a correlation-based OTDR technique with direct and coherent detection. A high probing symbol rate can provide picosecond accuracy in measuring the fiber propagation delay, while sensitive phase detection with a high repetition rate allows the monitoring of dynamic effects in the vicinity of the fiber. We discuss various approaches to evaluate the measured traces and show the results of a few monitoring applications.
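The basic principle behind correlation-based delay measurement can be sketched as follows: a known probe sequence is correlated against the received trace, and the lag with the largest correlation marks the round-trip delay. This is only an illustration at whole-sample resolution. The Barker-7 probe, the attenuation factor, and the 10 GBd symbol rate are assumptions chosen for the example, and the sketch does not reproduce the evaluation methods of the paper.

```python
def estimate_delay_samples(probe, received):
    """Return the lag (in samples) at which `probe` best matches `received`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(received) - len(probe) + 1):
        # Sliding inner product, i.e. the cross-correlation at this lag.
        score = sum(p * received[lag + i] for i, p in enumerate(probe))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    probe = [1, 1, 1, -1, -1, 1, -1]  # Barker-7 code, low autocorrelation sidelobes
    true_lag = 5
    # A weak reflection: the probe delayed by `true_lag` samples and attenuated.
    received = [0.0] * true_lag + [0.2 * p for p in probe] + [0.0] * 4

    lag = estimate_delay_samples(probe, received)
    symbol_rate = 10e9  # assumed 10 GBd, one sample per symbol
    print(lag, lag / symbol_rate)  # delay in samples and in seconds
```

At one sample per symbol the resolution of this sketch is one symbol period, so a higher symbol rate directly gives a finer delay grid.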

* Invited paper to OECC 2023, Shanghai, July 2-6, 2023

OO-dMVMT: A Deep Multi-view Multi-task Classification Framework for Real-time 3D Hand Gesture Classification and Segmentation

Apr 12, 2023
Federico Cunico, Federico Girella, Andrea Avogaro, Marco Emporio, Andrea Giachetti, Marco Cristani

Continuous mid-air hand gesture recognition based on captured hand pose streams is fundamental for human-computer interaction, particularly in AR/VR. However, many of the methods proposed to recognize heterogeneous hand gestures are tested only on the classification task, and real-time, low-latency gesture segmentation in a continuous stream is not well addressed in the literature. For this task, we propose the On-Off deep Multi-View Multi-Task paradigm (OO-dMVMT). The idea is to exploit multiple time-local views related to hand pose and movement to generate rich gesture descriptions, along with using heterogeneous tasks to achieve high accuracy. OO-dMVMT extends the classical MVMT paradigm, where all of the multiple tasks have to be active at each time, by allowing specific tasks to switch on/off depending on whether they can apply to the input. We show that OO-dMVMT sets a new state of the art in continuous/online 3D skeleton-based gesture recognition in terms of gesture classification accuracy, segmentation accuracy, false positives, and decision latency while maintaining real-time operation.

* Accepted to the Computer Vision for Mixed Reality workshop at CVPR 2023

iPDP: On Partial Dependence Plots in Dynamic Modeling Scenarios

Jun 13, 2023
Maximilian Muschalik, Fabian Fumagalli, Rohit Jagtani, Barbara Hammer, Eyke Hüllermeier

Post-hoc explanation techniques such as the well-established partial dependence plot (PDP), which investigates feature dependencies, are used in explainable artificial intelligence (XAI) to understand black-box machine learning models. While many real-world applications require dynamic models that constantly adapt over time and react to changes in the underlying distribution, XAI has so far primarily considered static learning environments, where models are trained in batch mode and remain unchanged. We thus propose a novel model-agnostic XAI framework called incremental PDP (iPDP) that extends the PDP to extract time-dependent feature effects in non-stationary learning environments. We formally analyze iPDP and show that it approximates a time-dependent variant of the PDP that properly reacts to real and virtual concept drift. The time-sensitivity of iPDP is controlled by a single smoothing parameter, which directly corresponds to the variance and the approximation error of iPDP in a static learning environment. We illustrate the efficacy of iPDP by showcasing an example application for drift detection and conducting multiple experiments on real-world and synthetic data sets and streams.
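To illustrate how a single smoothing parameter can make a partial dependence curve time-sensitive, the sketch below keeps an exponentially weighted average of point-wise dependence estimates over a fixed grid. The exponential form, the fixed grid, and the class name `IncrementalPD` are assumptions made for this example; the abstract does not give the exact update rule.

```python
class IncrementalPD:
    """Exponentially smoothed partial dependence curve for one feature."""

    def __init__(self, feature_index, grid, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.feature_index = feature_index
        self.grid = list(grid)
        self.alpha = alpha  # larger alpha reacts faster but is noisier
        self.curve = None

    def update(self, model, x):
        """Fold one observation `x` into the curve using the current `model`."""
        point = []
        for value in self.grid:
            x_mod = list(x)
            x_mod[self.feature_index] = value  # vary only the explained feature
            point.append(model(x_mod))
        if self.curve is None:
            self.curve = point
        else:
            self.curve = [
                (1.0 - self.alpha) * old + self.alpha * new
                for old, new in zip(self.curve, point)
            ]
        return self.curve


if __name__ == "__main__":
    def model(x):
        # Toy stand-in for a black-box model.
        return 2.0 * x[0] + x[1]

    pd_curve = IncrementalPD(feature_index=0, grid=[0.0, 1.0, 2.0], alpha=0.5)
    pd_curve.update(model, [0.0, 1.0])
    print(pd_curve.update(model, [0.0, 3.0]))
```

Because `model` is passed on every call, the same object can follow a model that is itself being updated between observations.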

* This preprint has not undergone peer review or any post-submission improvements or corrections

Dynablox: Real-time Detection of Diverse Dynamic Objects in Complex Environments

Apr 20, 2023
Lukas Schmid, Olov Andersson, Aurelio Sulser, Patrick Pfreundschuh, Roland Siegwart

Real-time detection of moving objects is an essential capability for robots acting autonomously in dynamic environments. We thus propose Dynablox, a novel online mapping-based approach for robust moving object detection in complex environments. The central idea of our approach is to incrementally estimate high confidence free-space areas by modeling and accounting for sensing, state estimation, and mapping limitations during online robot operation. The spatio-temporally conservative free space estimate enables robust detection of moving objects without making any assumptions on the appearance of objects or environments. This allows deployment in complex scenes such as multi-storied buildings or staircases, and for diverse moving objects such as people carrying various items, doors swinging or even balls rolling around. We thoroughly evaluate our approach on real-world data sets, achieving 86% IoU at 17 FPS in typical robotic settings. The method outperforms a recent appearance-based classifier and approaches the performance of offline methods. We demonstrate its generality on a novel data set with rare moving objects in complex environments. We make our efficient implementation and the novel data set available as open-source.
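For readers unfamiliar with the reported metric, intersection over union (IoU) compares the set of points predicted as dynamic with the set labelled as dynamic. The sketch below is a generic set-based definition. Treating the evaluation as point-wise, and returning 1.0 when both sets are empty, are assumptions of this example rather than details confirmed by the abstract.

```python
def iou(predicted, ground_truth):
    """Intersection over union of two collections of point identifiers."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    union = predicted | ground_truth
    if not union:
        # Nothing predicted and nothing labelled: treated as perfect agreement.
        return 1.0
    return len(predicted & ground_truth) / len(union)


if __name__ == "__main__":
    predicted_dynamic = {1, 2, 3, 4}
    labelled_dynamic = {2, 3, 4, 5, 6}
    print(iou(predicted_dynamic, labelled_dynamic))  # 3 shared of 6 total -> 0.5
```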

* Code released at https://github.com/ethz-asl/dynablox
