In this work we present our real-time egocentric body segmentation algorithm. Our algorithm achieves a frame rate of 66 fps for an input resolution of 640x480, thanks to a shallow network inspired by Thundernet's architecture. In addition, we put strong emphasis on the variability of the training data. More concretely, we describe the creation process of our Egocentric Bodies (EgoBodies) dataset, composed of almost 10,000 images from three datasets, created both by synthetic methods and by real capture. We conduct experiments to understand the contribution of the individual datasets, compare the Thundernet model trained on EgoBodies against simpler and more complex previous approaches, and discuss their performance in a real-life setup in terms of segmentation quality and inference time. The trained semantic segmentation algorithm is already integrated in an end-to-end system for Mixed Reality (MR), making it possible for users to see their own bodies while immersed in an MR scene.
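To make the real-time claim concrete, the following minimal sketch converts the reported frame rate and resolution into a per-frame time budget and a pixel throughput. The function names are hypothetical and the calculation is plain arithmetic on the figures quoted above, not a measurement of the authors' system.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def pixel_throughput(width: int, height: int, fps: float) -> float:
    """Number of pixels that must be labelled per second at this frame rate."""
    return width * height * fps


if __name__ == "__main__":
    # Figures taken from the abstract: 66 fps at 640x480.
    print(f"{frame_budget_ms(66):.2f} ms per frame")
    print(f"{pixel_throughput(640, 480, 66) / 1e6:.1f} Mpixel/s")
```

At 66 fps the whole pipeline (capture, inference, and compositing) has roughly 15 ms per frame on average, which is why a shallow network matters here.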
The National Football League (NFL) and Amazon Web Services teamed up to develop the best sports injury surveillance and mitigation program via a Kaggle competition. Through this competition, the NFL wants to assign each helmet to a specific player, which would help accurately identify each player's "exposures" throughout a football play. We implement a computer-vision-based ML algorithm capable of assigning detected helmet impacts to the correct players using tracking information. This paper explains our approach to automatically tracking player helmets and their collisions. The approach will also allow the NFL to review previous plays and explore trends in exposure over time.
We propose a neural audio generative model, MDCTNet, operating in the perceptually weighted domain of an adaptive modified discrete cosine transform (MDCT). The architecture of the model captures correlations in both time and frequency directions with recurrent layers (RNNs). An audio coding system is obtained by training MDCTNet on a diverse set of fullband monophonic audio signals at 48 kHz sampling, conditioned by a perceptual audio encoder. In a subjective listening test with ten excerpts chosen to be balanced across content types, yet stressful for both codecs, the mean performance of the proposed system for 24 kb/s variable bitrate (VBR) is similar to that of Opus at twice the bitrate.
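For readers unfamiliar with the transform the model operates on, here is a minimal sketch of a plain (unwindowed, non-adaptive) MDCT and its inverse, written directly from the textbook definition. It is not the adaptive, perceptually weighted MDCT used by MDCTNet; the function names are hypothetical and the O(N^2) loops are for clarity only.

```python
import math


def mdct(block):
    """Forward MDCT: maps 2N time samples to N coefficients."""
    n_coef = len(block) // 2
    shift = 0.5 + n_coef / 2
    return [
        sum(x * math.cos(math.pi / n_coef * (n + shift) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_coef)
    ]


def imdct(coefs):
    """Inverse MDCT: maps N coefficients to 2N time-aliased samples."""
    n_coef = len(coefs)
    shift = 0.5 + n_coef / 2
    return [
        sum(c * math.cos(math.pi / n_coef * (n + shift) * (k + 0.5))
            for k, c in enumerate(coefs)) / n_coef
        for n in range(2 * n_coef)
    ]


def overlap_add_roundtrip(signal, n_coef):
    """Transform 50%-overlapping blocks and overlap-add the inverses.

    Samples covered by two blocks are reconstructed exactly, because the
    time-domain aliasing of neighbouring blocks cancels.
    """
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coef + 1, n_coef):
        rec = imdct(mdct(signal[start:start + 2 * n_coef]))
        for i, value in enumerate(rec):
            out[start + i] += value
    return out
```

Each block of 2N samples yields only N coefficients, so the transform is critically sampled despite the 50% overlap. Practical codecs additionally apply a window satisfying the Princen-Bradley condition, which this sketch omits.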
Over-approximating the reachable sets of dynamical systems is a fundamental problem in safety verification and robust control synthesis. The representation of these sets is a key factor that affects the computational complexity and the approximation error. In this paper, we develop a new approach for over-approximating the reachable sets of neural network dynamical systems using adaptive template polytopes. We use the singular value decomposition of linear layers along with the shape of the activation functions to adapt the geometry of the polytopes at each time step to the geometry of the true reachable sets. We then propose a branch-and-bound method to compute accurate over-approximations of the reachable sets by the inferred templates. We illustrate the utility of the proposed approach in the reachability analysis of linear systems driven by neural network controllers.
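As an illustration of what a template polytope is, the sketch below bounds the image of an axis-aligned box under a single affine layer y = Wx + b by evaluating the support function in a fixed set of template directions. This is only the simplest building block under stated assumptions: the function names are hypothetical, the directions are fixed by hand rather than adapted via singular value decomposition, and nonlinear activations and branch-and-bound are not handled.

```python
def support_of_affine_image(W, b, lower, upper, c):
    """Maximum of c . (W x + b) over the box lower <= x <= upper."""
    n_out, n_in = len(W), len(W[0])
    # g = W^T c, so that c . (W x) = g . x
    g = [sum(W[j][i] * c[j] for j in range(n_out)) for i in range(n_in)]
    offset = sum(c[j] * b[j] for j in range(n_out))
    # A linear function over a box is maximised coordinate by coordinate.
    return offset + sum(max(g[i] * lower[i], g[i] * upper[i])
                        for i in range(n_in))


def template_polytope(W, b, lower, upper, directions):
    """Offsets h_j such that {y : c_j . y <= h_j} contains the image set."""
    return [support_of_affine_image(W, b, lower, upper, c)
            for c in directions]


if __name__ == "__main__":
    W = [[1.0, 1.0], [0.0, 1.0]]
    b = [0.5, 0.0]
    axis_directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    print(template_polytope(W, b, [-1.0, -1.0], [1.0, 1.0], axis_directions))
```

For an affine map the bound is tight in every chosen direction, so the quality of the over-approximation depends entirely on how well the directions match the shape of the true set. That is the motivation for adapting the template at each time step.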
In the process of materials discovery, chemists currently need to perform many laborious, time-consuming, and often dangerous lab experiments. To accelerate this process, we propose a framework for robots to assist chemists by performing lab experiments autonomously. The solution allows a general-purpose robot to perform diverse chemistry experiments and efficiently make use of available lab tools. Our system can load high-level descriptions of chemistry experiments, perceive a dynamic workspace, and autonomously plan the required actions and motions to perform the given chemistry experiments with common tools found in the existing lab environment. Our architecture uses a modified PDDLStream solver for integrated task and constrained motion planning, which generates plans and motions that are guaranteed to be safe by preventing collisions and spillage. We present a modular framework that can scale to many different experiments, actions, and lab tools. In this work, we demonstrate the utility of our framework on three pouring skills and two foundational chemical experiments for materials synthesis: solubility and recrystallization. More experiments and updated evaluations can be found at https://ac-rad.github.io/arc-icra2023.
Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and does not scale to batch production. Although generative models have emerged that make design automation attainable, it remains non-trivial to produce designs that comply with designers' multimodal desires, i.e., constrained by background images and driven by foreground content. In this study, we propose \textit{LayoutDETR}, which inherits the high quality and realism of generative modeling while reformulating content-aware requirements as a detection problem: we learn to detect in a background image the reasonable locations, scales, and spatial relations for multimodal elements in a layout. Experiments validate that our solution yields new state-of-the-art performance for layout generation on public benchmarks and on our newly curated ads banner dataset. For practical usage, we build our solution into a graphical system that facilitates user studies. We demonstrate that our designs are subjectively preferred over baselines by significant margins. Our code, models, dataset, graphical system, and demos are available at https://github.com/salesforce/LayoutDETR.
Cooperative relays improve reliability and coverage in wireless networks by providing multiple paths for data transmission. Relaying will play an essential role in vehicular networks at higher frequency bands, where mobility and frequent signal blockages cause link outages. To ensure connectivity in a relay-aided vehicular network, the relay selection policy should be designed to efficiently find unblocked relays. Inspired by recent advances in beam management in mobile millimeter wave (mmWave) networks, this paper addresses the question: how can the best relay be selected with minimal overhead from beam management? To this end, we formulate a sequential decision problem to jointly optimize relay selection and beam management. We propose a joint relay selection and beam management policy based on deep reinforcement learning (DRL) using the Markov property of beam indices and beam measurements. The proposed DRL-based algorithm learns time-varying thresholds that adapt to the dynamic channel conditions and traffic patterns. Numerical experiments demonstrate that the proposed algorithm outperforms baselines without prior channel knowledge. Moreover, the DRL-based algorithm can maintain high spectral efficiency under fast-varying channels.
During the COVID-19 pandemic, the Church closed its physical doors for the first time in about 800 years, which is, arguably, a cataclysmic event. Other religions found themselves in a similar situation and were practically forced to move online, which is unprecedented. In this paper, we analyse this sudden change in religious activities in two ways: we create and deliver a questionnaire, and we analyse Twitter data, to understand people's perceptions of and participation in online religious activities. Importantly, we also analyse the temporal variations in this process over a period of 3 months: July-September 2020. In addition to the separate analysis of the two data sources, we discuss the implications of triangulating the results.
Pneumonia, a respiratory infection brought on by bacteria or viruses, affects a large number of people, especially in developing and impoverished countries where high levels of pollution, unclean living conditions, and overcrowding are frequently observed, along with insufficient medical infrastructure. Pleural effusion, a condition in which fluid accumulates around the lung and complicates breathing, can be brought on by pneumonia. Early detection of pneumonia is essential for ensuring curative care and boosting survival rates. The approach most commonly used to diagnose pneumonia is chest X-ray imaging. The purpose of this work is to develop a method for the automatic diagnosis of bacterial and viral pneumonia in digital X-ray images. This article first presents the authors' technique and then gives a comprehensive report on recent developments in the field of reliable diagnosis of pneumonia. In this study, we fine-tuned state-of-the-art deep convolutional neural networks to classify disease from images and tested their performance. Several deep learning architectures are compared empirically: VGG19, ResNet152V2, ResNeXt101, SE-ResNet152, MobileNetV2, and DenseNet201. The experimental data consist of two groups, sick and healthy X-ray images. To take appropriate action against the disease as soon as possible, rapid identification models are preferred. DenseNet201 showed no overfitting or performance degradation in our experiments, and its accuracy tends to increase with the number of epochs. Further, DenseNet201 achieves state-of-the-art performance with a significantly smaller number of parameters and within a reasonable computing time. This architecture outperforms the others in terms of testing accuracy, scoring 95%. Each architecture was trained using Keras with Theano as the backend.
This work proposes a novel singularity avoidance approach for real-time trajectory optimization based on known singular configurations. The focus of this work lies in analyzing kinematically singular configurations for three robots with different kinematic structures, i.e., the Comau Racer 7-1.4, the KUKA LBR iiwa R820, and the Franka Emika Panda, and exploiting these configurations in the form of tailored potential functions for singularity avoidance. Monte Carlo simulations of the proposed method and the commonly used manipulability maximization approach are performed for comparison. The numerical results show that the average computing time can be reduced and that trajectories shorter in both time and path length are obtained with the proposed approach.
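The two ideas contrasted above can be illustrated on a toy planar two-link arm, whose only singular configurations are the stretched and folded elbow (q2 = 0 or q2 = pi). The sketch below computes the Yoshikawa manipulability measure from the Jacobian and, separately, a hypothetical repulsive potential centred on a known singular angle. The potential's form, its parameters `gain` and `eps`, and all function names are assumptions for illustration; the tailored potentials for the three robots named in the abstract are not reproduced here.

```python
import math


def jacobian_2r(q1, q2, l1=1.0, l2=1.0):
    """Position Jacobian of a planar arm with two revolute joints."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def manipulability(J):
    """Yoshikawa measure sqrt(det(J J^T)); equals |det J| for a square J."""
    return abs(J[0][0] * J[1][1] - J[0][1] * J[1][0])


def singularity_potential(q2, q_sing=0.0, gain=1.0, eps=1e-6):
    """Hypothetical repulsive potential that grows near a known singular angle."""
    # Wrap the joint-angle difference into (-pi, pi] before squaring.
    d = math.atan2(math.sin(q2 - q_sing), math.cos(q2 - q_sing))
    return gain / (d * d + eps)


if __name__ == "__main__":
    for q2 in (0.05, 0.5, math.pi / 2):
        w = manipulability(jacobian_2r(0.3, q2))
        print(f"q2={q2:.2f}  manipulability={w:.3f}  "
              f"potential={singularity_potential(q2):.2f}")
```

For this arm the manipulability reduces to l1 * l2 * |sin(q2)|, so it vanishes exactly at the singular configurations. A potential built from known singular configurations needs no Jacobian evaluation, which is one plausible reason such an approach could be cheaper to compute.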