Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Time": models, code, and papers

Test-time Batch Normalization

May 20, 2022
Tao Yang, Shenglong Zhou, Yuwang Wang, Yan Lu, Nanning Zheng

Figure 1 for Test-time Batch Normalization

Figure 2 for Test-time Batch Normalization

Figure 3 for Test-time Batch Normalization

Figure 4 for Test-time Batch Normalization

Deep neural networks often suffer the data distribution shift between training and testing, and the batch statistics are observed to reflect the shift. In this paper, targeting of alleviating distribution shift in test time, we revisit the batch normalization (BN) in the training process and reveals two key insights benefiting test-time optimization: $(i)$ preserving the same gradient backpropagation form as training, and $(ii)$ using dataset-level statistics for robust optimization and inference. Based on the two insights, we propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing Entropy loss. We verify the effectiveness of our method on two typical settings with distribution shift, i.e., domain generalization and robustness tasks. Our GpreBN significantly improves the test-time performance and achieves the state of the art results.

Via

Access Paper or Ask Questions

Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM

Feb 14, 2023
Lin Yao, Ruihan Xu, Zhifeng Gao, Guolin Ke, Yuhang Wang

Figure 1 for Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM

Figure 2 for Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM

Figure 3 for Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM

Figure 4 for Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM

The central problem in cryo-electron microscopy (cryo-EM) is to recover the 3D structure from noisy 2D projection images which requires estimating the missing projection angles (poses). Recent methods attempted to solve the 3D reconstruction problem with the autoencoder architecture, which suffers from the latent vector space sampling problem and frequently produces suboptimal pose inferences and inferior 3D reconstructions. Here we present an improved autoencoder architecture called ACE (Asymmetric Complementary autoEncoder), based on which we designed the ACE-EM method for cryo-EM 3D reconstructions. Compared to previous methods, ACE-EM reached higher pose space coverage within the same training time and boosted the reconstruction performance regardless of the choice of decoders. With this method, the Nyquist resolution (highest possible resolution) was reached for 3D reconstructions of both simulated and experimental cryo-EM datasets. Furthermore, ACE-EM is the only amortized inference method that reached the Nyquist resolution.

Via

Access Paper or Ask Questions

From paintbrush to pixel: A review of deep neural networks in AI-generated art

Feb 14, 2023
Anne-Sofie Maerten, Derya Soydaner

Figure 1 for From paintbrush to pixel: A review of deep neural networks in AI-generated art

Figure 2 for From paintbrush to pixel: A review of deep neural networks in AI-generated art

Figure 3 for From paintbrush to pixel: A review of deep neural networks in AI-generated art

Figure 4 for From paintbrush to pixel: A review of deep neural networks in AI-generated art

This paper delves into the fascinating field of AI-generated art and explores the various deep neural network architectures and models that have been utilized to create it. From the classic convolutional networks to the cutting-edge diffusion models, we examine the key players in the field. We explain the general structures and working principles of these neural networks. Then, we showcase examples of milestones, starting with the dreamy landscapes of DeepDream and moving on to the most recent developments, including Stable Diffusion and DALL-E 2, which produce mesmerizing images. A detailed comparison of these models is provided, highlighting their strengths and limitations. Thus, we examine the remarkable progress that deep neural networks have made so far in a short period of time. With a unique blend of technical explanations and insights into the current state of AI-generated art, this paper exemplifies how art and computer science interact.

Via

Access Paper or Ask Questions

Data-Centric Governance

Feb 14, 2023
Sean McGregor, Jesse Hostetler

Artificial intelligence (AI) governance is the body of standards and practices used to ensure that AI systems are deployed responsibly. Current AI governance approaches consist mainly of manual review and documentation processes. While such reviews are necessary for many systems, they are not sufficient to systematically address all potential harms, as they do not operationalize governance requirements for system engineering, behavior, and outcomes in a way that facilitates rigorous and reproducible evaluation. Modern AI systems are data-centric: they act on data, produce data, and are built through data engineering. The assurance of governance requirements must also be carried out in terms of data. This work explores the systematization of governance requirements via datasets and algorithmic evaluations. When applied throughout the product lifecycle, data-centric governance decreases time to deployment, increases solution quality, decreases deployment risks, and places the system in a continuous state of assured compliance with governance requirements.

* 26 pages, 13 figures

Via

Access Paper or Ask Questions

Fixed-time Integral Sliding Mode Control for Admittance Control of a Robot Manipulator

Aug 09, 2022
Yuzhu Sun, Mien Van, Stephen McIlvanna, Sean McLoone, Dariusz Ceglarek

Figure 1 for Fixed-time Integral Sliding Mode Control for Admittance Control of a Robot Manipulator

Figure 2 for Fixed-time Integral Sliding Mode Control for Admittance Control of a Robot Manipulator

Figure 3 for Fixed-time Integral Sliding Mode Control for Admittance Control of a Robot Manipulator

Figure 4 for Fixed-time Integral Sliding Mode Control for Admittance Control of a Robot Manipulator

This paper proposes a novel fixed-time integral sliding mode controller for admittance control to enhance physical human-robot collaboration. The proposed method combines the benefits of compliance to external forces of admittance control and high robustness to uncertainties of integral sliding mode control (ISMC), such that the system can collaborate with a human partner in an uncertain environment effectively. Firstly, a fixed-time sliding surface is applied in the ISMC to make the tracking error of the system converge within a fixed-time regardless of the initial condition. Then, a fixed-time backstepping controller (BSP) is integrated into the ISMC as the nominal controller to realize global fixed-time convergence. Furthermore, to overcome the singularity problem, a non-singular fixed-time sliding surface is designed and integrated into the controller, which is useful for practical application. Finally, the proposed controller is validated for a two-link robot manipulator with uncertainties and external human forces. The results show that the proposed controller is superior in the sense of both tracking error and convergence time, and at the same time, can comply with human motion in a shared workspace.

Via

Access Paper or Ask Questions

Towards real-time 6D pose estimation of objects in single-view cone-beam X-ray

Nov 06, 2022
Christiaan G. A. Viviers, Joel de Bruijn, Lena Filatova, Peter H. N. de With, Fons van der Sommen

Deep learning-based pose estimation algorithms can successfully estimate the pose of objects in an image, especially in the field of color images. 6D Object pose estimation based on deep learning models for X-ray images often use custom architectures that employ extensive CAD models and simulated data for training purposes. Recent RGB-based methods opt to solve pose estimation problems using small datasets, making them more attractive for the X-ray domain where medical data is scarcely available. We refine an existing RGB-based model (SingleShotPose) to estimate the 6D pose of a marked cube from grayscale X-ray images by creating a generic solution trained on only real X-ray data and adjusted for X-ray acquisition geometry. The model regresses 2D control points and calculates the pose through 2D/3D correspondences using Perspective-n-Point(PnP), allowing a single trained model to be used across all supporting cone-beam-based X-ray geometries. Since modern X-ray systems continuously adjust acquisition parameters during a procedure, it is essential for such a pose estimation network to consider these parameters in order to be deployed successfully and find a real use case. With a 5-cm/5-degree accuracy of 93% and an average 3D rotation error of 2.2 degrees, the results of the proposed approach are comparable with state-of-the-art alternatives, while requiring significantly less real training examples and being applicable in real-time applications.

* Published at SPIE Medical Imaging 2022

Via

Access Paper or Ask Questions

TFN: An Interpretable Neural Network with Time-Frequency Transform Embedded for Intelligent Fault Diagnosis

Sep 05, 2022
Qian Chen, Xingjian Dong, Guowei Tu, Dong Wang, Baoxuan Zhao, Zhike Peng

Figure 1 for TFN: An Interpretable Neural Network with Time-Frequency Transform Embedded for Intelligent Fault Diagnosis

Figure 2 for TFN: An Interpretable Neural Network with Time-Frequency Transform Embedded for Intelligent Fault Diagnosis

Figure 3 for TFN: An Interpretable Neural Network with Time-Frequency Transform Embedded for Intelligent Fault Diagnosis

Figure 4 for TFN: An Interpretable Neural Network with Time-Frequency Transform Embedded for Intelligent Fault Diagnosis

Convolutional Neural Networks (CNNs) are widely used in fault diagnosis of mechanical systems due to their powerful feature extraction and classification capabilities. However, the CNN is a typical black-box model, and the mechanism of CNN's decision-making are not clear, which limits its application in high-reliability-required fault diagnosis scenarios. To tackle this issue, we propose a novel interpretable neural network termed as Time-Frequency Network (TFN), where the physically meaningful time-frequency transform (TFT) method is embedded into the traditional convolutional layer as an adaptive preprocessing layer. This preprocessing layer named as time-frequency convolutional (TFconv) layer, is constrained by a well-designed kernel function to extract fault-related time-frequency information. It not only improves the diagnostic performance but also reveals the logical foundation of the CNN prediction in the frequency domain. Different TFT methods correspond to different kernel functions of the TFconv layer. In this study, four typical TFT methods are considered to formulate the TFNs and their effectiveness and interpretability are proved through three mechanical fault diagnosis experiments. Experimental results also show that the proposed TFconv layer can be easily generalized to other CNNs with different depths. The code of TFN is available on https://github.com/ChenQian0618/TFN.

* 20 pages, 15 figures, 5 tables

Via

Access Paper or Ask Questions

Embodied Footprints: A Safety-guaranteed Collision Avoidance Model for Numerical Optimization-based Trajectory Planning

Feb 15, 2023
Bai Li, Youmin Zhang, Tantan Zhang, Tankut Acarman, Yakun Ouyang, Li Li, Hairong Dong, Dongpu Cao

Figure 1 for Embodied Footprints: A Safety-guaranteed Collision Avoidance Model for Numerical Optimization-based Trajectory Planning

Figure 2 for Embodied Footprints: A Safety-guaranteed Collision Avoidance Model for Numerical Optimization-based Trajectory Planning

Figure 3 for Embodied Footprints: A Safety-guaranteed Collision Avoidance Model for Numerical Optimization-based Trajectory Planning

Figure 4 for Embodied Footprints: A Safety-guaranteed Collision Avoidance Model for Numerical Optimization-based Trajectory Planning

Numerical optimization-based methods are among the prevalent trajectory planners for autonomous driving. In a numerical optimization-based planner, the nominal continuous-time trajectory planning problem is discretized into a nonlinear program (NLP) problem with finite constraints imposed on finite collocation points. However, constraint violations between adjacent collocation points may still occur. This study proposes a safety-guaranteed collision-avoidance modeling method to eliminate the collision risks between adjacent collocation points in using numerical optimization-based trajectory planners. A new concept called embodied box is proposed, which is formed by enlarging the rectangular footprint of the ego vehicle. If one can ensure that the embodied boxes at finite collocation points are collide-free, then the ego vehicle's footprint is collide-free at any a moment between adjacent collocation points. We find that the geometric size of an embodied box is a simple function of vehicle velocity and curvature. The proposed theory lays a foundation for numerical optimization-based trajectory planners in autonomous driving.

* 12 pages, 13 figures

Via

Access Paper or Ask Questions

Self-supervised Registration and Segmentation of the Ossicles with A Single Ground Truth Label

Feb 15, 2023
Yike Zhang, Jack Noble

Figure 1 for Self-supervised Registration and Segmentation of the Ossicles with A Single Ground Truth Label

Figure 2 for Self-supervised Registration and Segmentation of the Ossicles with A Single Ground Truth Label

Figure 3 for Self-supervised Registration and Segmentation of the Ossicles with A Single Ground Truth Label

Figure 4 for Self-supervised Registration and Segmentation of the Ossicles with A Single Ground Truth Label

AI-assisted surgeries have drawn the attention of the medical image research community due to their real-world impact on improving surgery success rates. For image-guided surgeries, such as Cochlear Implants (CIs), accurate object segmentation can provide useful information for surgeons before an operation. Recently published image segmentation methods that leverage machine learning usually rely on a large number of manually predefined ground truth labels. However, it is a laborious and time-consuming task to prepare the dataset. This paper presents a novel technique using a self-supervised 3D-UNet that produces a dense deformation field between an atlas and a target image that can be used for atlas-based segmentation of the ossicles. Our results show that our method outperforms traditional image segmentation methods and generates a more accurate boundary around the ossicles based on Dice similarity coefficient and point-to-point error comparison. The mean Dice coefficient is improved by 8.51% with our proposed method.

* conference

Via

Access Paper or Ask Questions

Ultrafast single-channel machine vision based on neuro-inspired photonic computing

Feb 15, 2023
Tomoya Yamaguchi, Kohei Arai, Tomoaki Niiyama, Atsushi Uchida, Satoshi Sunada

Figure 1 for Ultrafast single-channel machine vision based on neuro-inspired photonic computing

Figure 2 for Ultrafast single-channel machine vision based on neuro-inspired photonic computing

Figure 3 for Ultrafast single-channel machine vision based on neuro-inspired photonic computing

Figure 4 for Ultrafast single-channel machine vision based on neuro-inspired photonic computing

High-speed machine vision is increasing its importance in both scientific and technological applications. Neuro-inspired photonic computing is a promising approach to speed-up machine vision processing with ultralow latency. However, the processing rate is fundamentally limited by the low frame rate of image sensors, typically operating at tens of hertz. Here, we propose an image-sensor-free machine vision framework, which optically processes real-world visual information with only a single input channel, based on a random temporal encoding technique. This approach allows for compressive acquisitions of visual information with a single channel at gigahertz rates, outperforming conventional approaches, and enables its direct photonic processing using a photonic reservoir computer in a time domain. We experimentally demonstrate that the proposed approach is capable of high-speed image recognition and anomaly detection, and furthermore, it can be used for high-speed imaging. The proposed approach is multipurpose and can be extended for a wide range of applications, including tracking, controlling, and capturing sub-nanosecond phenomena.

* 30 pages, 12 figures

Via

Access Paper or Ask Questions