We present Any-Modality Augmented Language Model (AnyMAL), a unified model that reasons over diverse input modality signals (i.e. text, image, video, audio, IMU motion sensor), and generates textual responses. AnyMAL inherits the powerful text-based reasoning abilities of the state-of-the-art LLMs including LLaMA-2 (70B), and converts modality-specific signals to the joint textual space through a pre-trained aligner module. To further strengthen the multimodal LLM's capabilities, we fine-tune the model with a multimodal instruction set manually collected to cover diverse topics and tasks beyond simple QAs. We conduct comprehensive empirical analysis comprising both human and automatic evaluations, and demonstrate state-of-the-art performance on various multimodal tasks.
Anomaly detection is important for industrial automation and part quality assurance, and while humans can easily detect anomalies in components given a few examples, designing a generic automated system that can perform at human or above human capabilities remains a challenge. In this work, we present a simple new anomaly detection algorithm called FADS (feature-based anomaly detection system) which leverages pretrained convolutional neural networks (CNN) to generate a statistical model of nominal inputs by observing the activation of the convolutional filters. During inference the system compares the convolutional filter activation of the new input to the statistical model and flags activations that are outside the expected range of values and therefore likely an anomaly. By using a pretrained network, FADS demonstrates excellent performance similar to or better than other machine learning approaches to anomaly detection while at the same time FADS requires no tuning of the CNN weights. We demonstrate FADS ability by detecting process parameter changes on a custom dataset of additively manufactured lattices. The FADS localization algorithm shows that textural differences that are visible on the surface can be used to detect process parameter changes. In addition, we test FADS on benchmark datasets, such as the MVTec Anomaly Detection dataset, and report good results.