Topic:magic

A Multimodal Adaptive Graph-based Intelligent Classification Model for Fake News

Nov 18, 2024

Jun-hao Xu

Abstract:Numerous studies have been proposed to detect fake news focusing on multi-modalities based on machine and/or deep learning. However, studies focusing on graph-based structures using geometric deep learning are lacking. To address this challenge, we introduce the Multimodal Adaptive Graph-based Intelligent Classification (aptly referred to as MAGIC) for fake news detection. Specifically, the Encoder Representations from Transformers was used for text vectorization whilst ResNet50 was used for images. A comprehensive information interaction graph was built using the adaptive Graph Attention Network before classifying the multimodal input through the Softmax function. MAGIC was trained and tested on two fake news datasets, that is, Fakeddit (English) and Multimodal Fake News Detection (Chinese), with the model achieving an accuracy of 98.8% and 86.3%, respectively. Ablation experiments also revealed MAGIC to yield superior performance across both datasets. Findings show that a graph-based deep learning adaptive model is effective in detecting multimodal fake news, surpassing state-of-the-art methods.

* 8 pages

FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations

Nov 16, 2024

Hmrishav Bandyopadhyay, Yi-Zhe Song

Abstract:Sketch animations offer a powerful medium for visual storytelling, from simple flip-book doodles to professional studio productions. While traditional animation requires teams of skilled artists to draw key frames and in-between frames, existing automation attempts still demand significant artistic effort through precise motion paths or keyframe specification. We present FlipSketch, a system that brings back the magic of flip-book animation -- just draw your idea and describe how you want it to move! Our approach harnesses motion priors from text-to-video diffusion models, adapting them to generate sketch animations through three key innovations: (i) fine-tuning for sketch-style frame generation, (ii) a reference frame mechanism that preserves visual integrity of input sketch through noise refinement, and (iii) a dual-attention composition that enables fluid motion without losing visual consistency. Unlike constrained vector animations, our raster frames support dynamic sketch transformations, capturing the expressive freedom of traditional animation. The result is an intuitive system that makes sketch animation as simple as doodling and describing, while maintaining the artistic essence of hand-drawn animation.

* Code: https://github.com/hmrishavbandy/FlipSketch

One-Shot Manipulation Strategy Learning by Making Contact Analogies

Nov 14, 2024

Yuyao Liu, Jiayuan Mao, Joshua Tenenbaum, Tomás Lozano-Pérez, Leslie Pack Kaelbling

Abstract:We present a novel approach, MAGIC (manipulation analogies for generalizable intelligent contacts), for one-shot learning of manipulation strategies with fast and extensive generalization to novel objects. By leveraging a reference action trajectory, MAGIC effectively identifies similar contact points and sequences of actions on novel objects to replicate a demonstrated strategy, such as using different hooks to retrieve distant objects of different shapes and sizes. Our method is based on a two-stage contact-point matching process that combines global shape matching using pretrained neural features with local curvature analysis to ensure precise and physically plausible contact points. We experiment with three tasks including scooping, hanging, and hooking objects. MAGIC demonstrates superior performance over existing methods, achieving significant improvements in runtime speed and generalization to different object categories. Website: https://magic-2024.github.io/

* CoRL LEAP Workshop, 2024

1st-Order Magic: Analysis of Sharpness-Aware Minimization

Nov 03, 2024

Nalin Tiwary, Siddarth Aananth

Abstract:Sharpness-Aware Minimization (SAM) is an optimization technique designed to improve generalization by favoring flatter loss minima. To achieve this, SAM optimizes a modified objective that penalizes sharpness, using computationally efficient approximations. Interestingly, we find that more precise approximations of the proposed SAM objective degrade generalization performance, suggesting that the generalization benefits of SAM are rooted in these approximations rather than in the original intended mechanism. This highlights a gap in our understanding of SAM's effectiveness and calls for further investigation into the role of approximations in optimization.

* Nalin Tiwary and Siddarth Aananth share equal authorship in this work
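
For readers unfamiliar with SAM, the following is a minimal sketch of the commonly used first-order approximation of a SAM update, assuming a plain NumPy setting. It is not the authors' code; the names `sam_step`, `grad_fn`, `lr`, and `rho` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update using the usual first-order approximation.

    w       : current parameters (1-D array)
    grad_fn : hypothetical callable returning the loss gradient at a point
    lr      : learning rate (assumed value)
    rho     : radius of the perturbation neighborhood (assumed value)
    """
    g = grad_fn(w)
    # Approximate the worst-case perturbation inside the rho-ball
    # by a scaled step along the gradient direction.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed point.
    return w - lr * grad_fn(w + eps)

# Toy usage on a quadratic loss 0.5 * w^T A w (illustrative only).
A = np.diag([1.0, 10.0])
w = np.array([1.0, 1.0])
w_next = sam_step(w, lambda v: A @ v, lr=0.05, rho=0.05)
```

The perturbation `eps` is the approximation step: it replaces the inner maximization over the neighborhood with a single normalized gradient step, which is the kind of efficient approximation the abstract refers to.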

Power Plays: Unleashing Machine Learning Magic in Smart Grids

Oct 20, 2024

Abdur Rashid, Parag Biswas, Abdullah Al Masum, MD Abdullah Al Nasim, Kishor Datta Gupta

Abstract:The integration of machine learning into smart grid systems represents a transformative step in enhancing the efficiency, reliability, and sustainability of modern energy networks. By adding advanced data analytics, these systems can better manage the complexities of renewable energy integration, demand response, and predictive maintenance. Machine learning algorithms analyze vast amounts of data from smart meters, sensors, and other grid components to optimize energy distribution, forecast demand, and detect irregularities that could indicate potential failures. This enables more precise load balancing, reduces operational costs, and enhances the resilience of the grid against disturbances. Furthermore, the use of predictive models helps in anticipating equipment failures, thereby improving the reliability of the energy supply. As smart grids continue to evolve, the role of machine learning in managing decentralized energy sources and enabling real-time decision-making will become increasingly critical. However, the deployment of these technologies also raises challenges related to data privacy, security, and the need for robust infrastructure. To address these issues, this research focuses on realizing the full potential of smart grids, ensuring they meet growing energy demands while maintaining a focus on sustainability and efficiency using machine learning techniques. Furthermore, this research will help determine the smart grid's essentiality with the aid of machine learning. Multiple ML algorithms have been integrated along with their pros and cons. The future scope of these algorithms is also covered.

* 16 pages, 1 figure

Efficient Deep Learning Board: Training Feedback Is Not All You Need

Oct 17, 2024

Lina Gong, Qi Gao, Peng Li, Mingqiang Wei, Fei Wu

Abstract:Current automatic deep learning (i.e., AutoDL) frameworks rely on training feedback from actual runs, which often hinders their ability to provide quick and clear performance predictions for selecting suitable DL systems. To address this issue, we propose EfficientDL, an innovative deep learning board designed for automatic performance prediction and component recommendation. EfficientDL can quickly and precisely recommend twenty-seven system components and predict the performance of DL models without requiring any training feedback. The magic of no training feedback comes from our proposed comprehensive, multi-dimensional, fine-grained system component dataset, which enables us to develop a static performance prediction model and comprehensive optimized component recommendation algorithm (i.e., αβ-BO search), removing the dependency on actually running parameterized models during the traditional optimization search process. The simplicity and power of EfficientDL stem from its compatibility with most DL models. For example, EfficientDL operates seamlessly with mainstream models such as ResNet50, MobileNetV3, EfficientNet-B0, MaxViT-T, Swin-B, and DaViT-T, bringing competitive performance improvements. Besides, experimental results on the CIFAR-10 dataset reveal that EfficientDL outperforms existing AutoML tools in both accuracy and efficiency (approximately 20 times faster, with a 1.31% Top-1 accuracy improvement over cutting-edge methods). Source code, pretrained models, and datasets are available at https://github.com/OpenSELab/EfficientDL.

Learnable Optimization-Based Algorithms for Low-Dose CT Reconstruction

Oct 14, 2024

Daisy Chen

Abstract:Low-dose computed tomography (LDCT) aims to minimize the radiation exposure to patients while maintaining diagnostic image quality. However, traditional CT reconstruction algorithms often struggle with the ill-posed nature of the problem, resulting in severe image artifacts. Recent advances in optimization-based deep learning algorithms offer promising solutions to improve LDCT reconstruction. In this paper, we explore learnable optimization algorithms (LOA) for CT reconstruction, which integrate deep learning within variational models to enhance the regularization process. These methods, including LEARN++ and MAGIC, leverage dual-domain networks that optimize both image and sinogram data, significantly improving reconstruction quality. We also present proximal gradient descent and ADMM-inspired networks, which are efficient and theoretically grounded approaches. Our results demonstrate that these learnable methods outperform traditional techniques, offering enhanced artifact reduction, better detail preservation, and robust performance in clinical scenarios.
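
As background for the proximal gradient descent mentioned in the abstract, here is a minimal sketch of the classical, non-learned algorithm applied to an L1-regularized least-squares problem. This is a swapped-in textbook example (ISTA), not the paper's learnable networks or a CT reconstruction; the names `soft_threshold`, `proximal_gradient`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_gradient(A, b, lam=0.1, n_iter=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with ISTA.

    A, b   : assumed linear forward model and measurements
    lam    : regularization weight (assumed value)
    n_iter : number of iterations (assumed value)
    """
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)          # gradient of the data-fidelity term
        x = soft_threshold(x - step * grad, step * lam)  # proximal step
    return x

# Toy usage with an identity forward model (illustrative only).
x_hat = proximal_gradient(np.eye(3), np.array([1.0, 0.05, -2.0]), lam=0.1)
```

Learnable optimization approaches of the kind the abstract describes typically unroll iterations like these and replace the fixed proximal operator with a trained network; that substitution is not shown here.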

Multimodal 3D Fusion and In-Situ Learning for Spatially Aware AI

Oct 06, 2024

Chengyuan Xu, Radha Kumaran, Noah Stier, Kangyou Yu, Tobias Höllerer

Abstract:Seamless integration of virtual and physical worlds in augmented reality benefits from the system semantically "understanding" the physical environment. AR research has long focused on the potential of context awareness, demonstrating novel capabilities that leverage the semantics in the 3D environment for various object-level interactions. Meanwhile, the computer vision community has made leaps in neural vision-language understanding to enhance environment perception for autonomous tasks. In this work, we introduce a multimodal 3D object representation that unifies both semantic and linguistic knowledge with the geometric representation, enabling user-guided machine learning involving physical objects. We first present a fast multimodal 3D reconstruction pipeline that brings linguistic understanding to AR by fusing CLIP vision-language features into the environment and object models. We then propose "in-situ" machine learning, which, in conjunction with the multimodal representation, enables new tools and interfaces for users to interact with physical spaces and objects in a spatially and linguistically meaningful manner. We demonstrate the usefulness of the proposed system through two real-world AR applications on Magic Leap 2: a) spatial search in physical environments with natural language and b) an intelligent inventory system that tracks object changes over time. We also make our full implementation and demo data available at (https://github.com/cy-xu/spatially_aware_AI) to encourage further exploration and research in spatially aware AI.

* 10 pages, 6 figures, accepted to IEEE ISMAR 2024

MAGICS: Adversarial RL with Minimax Actors Guided by Implicit Critic Stackelberg for Convergent Neural Synthesis of Robot Safety

Sep 20, 2024

Justin Wang, Haimin Hu, Duy Phuong Nguyen, Jaime Fernández Fisac

Abstract:While robust optimal control theory provides a rigorous framework to compute robot control policies that are provably safe, it struggles to scale to high-dimensional problems, leading to increased use of deep learning for tractable synthesis of robot safety. Unfortunately, existing neural safety synthesis methods often lack convergence guarantees and solution interpretability. In this paper, we present Minimax Actors Guided by Implicit Critic Stackelberg (MAGICS), a novel adversarial reinforcement learning (RL) algorithm that guarantees local convergence to a minimax equilibrium solution. We then build on this approach to provide local convergence guarantees for a general deep RL-based robot safety synthesis algorithm. Through both simulation studies on OpenAI Gym environments and hardware experiments with a 36-dimensional quadruped robot, we show that MAGICS can yield robust control policies outperforming the state-of-the-art neural safety synthesis methods.

* Algorithmic Foundations of Robotics (WAFR) XVI

Notes on Sampled Gaussian Mechanism

Sep 06, 2024

Nikita P. Kalinin

Abstract:In these notes, we prove a recent conjecture posed in the paper by Räisä, O. et al. [Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimization (2024)]. Theorem 6.2 of the paper asserts that for the Sampled Gaussian Mechanism (a composition of subsampling and additive Gaussian noise), the effective noise level, $\sigma_{\text{eff}} = \frac{\sigma(q)}{q}$, decreases as a function of the subsampling rate $q$. Consequently, larger subsampling rates are preferred for better privacy-utility trade-offs. Our notes provide a rigorous proof of Conjecture 6.3, which was left unresolved in the original paper, thereby completing the proof of Theorem 6.2.
