"Time": models, code, and papers

Efficient Quantization-aware Training with Adaptive Coreset Selection

Jun 12, 2023
Xijie Huang, Zechun Liu, Shih-Yang Liu, Kwang-Ting Cheng

The expanding model size and computation of deep neural networks (DNNs) have increased the demand for efficient model deployment methods. Quantization-aware training (QAT) is a representative model compression method that leverages redundancy in weights and activations. However, most existing QAT methods require end-to-end training on the entire dataset, which suffers from long training time and high energy costs. Coreset selection, which aims to improve data efficiency by exploiting the redundancy of training data, has also been widely used for efficient training. In this work, we propose a new angle, through coreset selection, to improve the training efficiency of quantization-aware training. Based on the characteristics of QAT, we propose two metrics, error vector score and disagreement score, to quantify the importance of each sample during training. Guided by these two metrics of importance, we propose a quantization-aware adaptive coreset selection (ACS) method to select the data for the current training epoch. We evaluate our method on various networks (ResNet-18, MobileNetV2), datasets (CIFAR-100, ImageNet-1K), and under different quantization settings. Compared with previous coreset selection methods, our method significantly improves QAT performance with different dataset fractions. Our method achieves an accuracy of 68.39% with a 4-bit quantized ResNet-18 on the ImageNet-1K dataset using only a 10% subset, an absolute gain of 4.24% over the baseline.

* Code: https://github.com/HuangOwen/QAT-ACS
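
The abstract names the two importance metrics but does not define them, so the following is only a rough sketch under stated assumptions: the error vector score is taken to be the L2 distance between the quantized model's softmax output and the one-hot label, the disagreement score is taken to be the L2 distance between the quantized model's and a full-precision teacher's softmax outputs, and the two are mixed with a fixed weight `beta`. All function names and the mixing rule are hypothetical; the linked repository is the reference for the actual method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def error_vector_score(student_logits, label):
    """Assumed form: distance between quantized-model probabilities and the one-hot label."""
    probs = softmax(student_logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(len(probs))]
    return l2_distance(probs, one_hot)


def disagreement_score(student_logits, teacher_logits):
    """Assumed form: distance between quantized-model and teacher probabilities."""
    return l2_distance(softmax(student_logits), softmax(teacher_logits))


def select_coreset(student_logits, teacher_logits, labels, fraction, beta=0.5):
    """Return the sorted indices of the highest-scoring samples for this epoch."""
    n = len(labels)
    scores = [
        beta * error_vector_score(student_logits[i], labels[i])
        + (1.0 - beta) * disagreement_score(student_logits[i], teacher_logits[i])
        for i in range(n)
    ]
    k = max(1, int(fraction * n))  # always keep at least one sample
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

Because the scores depend on the current quantized model, calling `select_coreset` again at the start of each epoch yields a subset that changes as training progresses, which is the sense in which the selection is adaptive here.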

Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow

Jun 12, 2023
Wenqi Zhang, Yongliang Shen, Weiming Lu, Yueting Zhuang

Various industries such as finance, meteorology, and energy generate vast amounts of heterogeneous data every day. There is a natural demand for humans to manage, process, and display data efficiently. However, these data-related tasks demand labor-intensive effort and a high level of expertise. Considering that large language models (LLMs) have showcased promising capabilities in semantic understanding and reasoning, we advocate that the deployment of LLMs could autonomously manage and process massive amounts of data while displaying and interacting in a human-friendly manner. Based on this belief, we propose Data-Copilot, an LLM-based system that connects numerous data sources on one end and caters to diverse human demands on the other end. Acting like an experienced expert, Data-Copilot autonomously transforms raw data into visualization results that best match the user's intent. Specifically, Data-Copilot autonomously designs versatile interfaces (tools) for data management, processing, prediction, and visualization. When responding in real time, it automatically deploys a concise workflow by invoking the corresponding interfaces step by step for the user's request. The interface design and deployment processes are fully controlled by Data-Copilot itself, without human assistance. In addition, we create a Data-Copilot demo that links abundant data from different domains (stock, fund, company, economics, and live news) and accurately responds to diverse requests, serving as a reliable AI assistant.

Occlusion-Aware Path Planning for Collision Avoidance: Leveraging Potential Field Method with Responsibility-Sensitive Safety

Jun 12, 2023
Pengfei Lin, Ehsan Javanmardi, Jin Nakazato, Manabu Tsukada

Collision avoidance (CA) has always been the foremost task for autonomous vehicles (AVs) under safety criteria. Path planning is directly responsible for generating a safe path that accomplishes CA while satisfying other commands. Owing to its real-time computation and simple structure, the potential field (PF) has emerged as one of the mainstream path-planning algorithms. However, the current PF is primarily simulated in ideal CA scenarios, assuming complete obstacle information while disregarding occlusion issues where obstacles can be partially or entirely hidden from the AV's sensors. During the occlusion period, the occluded obstacles do not possess a PF. Once the occlusion is over, these obstacles can generate an instantaneous virtual force that impacts the ego vehicle. Therefore, we propose an occlusion-aware path planning (OAPP) method with a responsibility-sensitive safety (RSS)-based PF to tackle the occlusion problem for non-connected AVs. We first categorize the detected and occluded obstacles, and then we proceed to the RSS violation check. Finally, we generate different virtual forces from the PF for occluded and non-occluded obstacles. We compare the proposed OAPP method with other PF-based path planning methods via MATLAB/Simulink. The simulation results indicate that the proposed method can eliminate instantaneous lateral oscillation or sway and produce a smoother path than conventional PF methods.

* Submitted to IEEE ITSC 2023
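
To make the two building blocks mentioned in the abstract concrete, here is a minimal sketch assuming the standard published forms: the RSS minimum longitudinal gap between a rear and a front vehicle, and the classic Khatib-style repulsive force magnitude used in potential-field planners. The abstract does not say how the paper combines them for occluded obstacles, so this shows only the generic ingredients; all function and parameter names are illustrative.

```python
def rss_min_longitudinal_gap(v_rear, v_front, response_time,
                             a_max_accel, b_min_brake, b_max_brake):
    """Standard RSS safe following distance in metres (speeds in m/s, accelerations in m/s^2).

    The rear vehicle is assumed to accelerate at a_max_accel during its response
    time and then brake at b_min_brake, while the front vehicle brakes at b_max_brake.
    """
    v_after_response = v_rear + response_time * a_max_accel
    gap = (
        v_rear * response_time
        + 0.5 * a_max_accel * response_time ** 2
        + v_after_response ** 2 / (2.0 * b_min_brake)
        - v_front ** 2 / (2.0 * b_max_brake)
    )
    return max(gap, 0.0)


def violates_rss(actual_gap, v_rear, v_front, response_time,
                 a_max_accel, b_min_brake, b_max_brake):
    """True when the actual gap is smaller than the RSS minimum gap."""
    return actual_gap < rss_min_longitudinal_gap(
        v_rear, v_front, response_time, a_max_accel, b_min_brake, b_max_brake
    )


def repulsive_force(distance, influence_radius, gain):
    """Khatib-style repulsive force magnitude; zero outside the influence radius."""
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    if distance >= influence_radius:
        return 0.0
    return gain * (1.0 / distance - 1.0 / influence_radius) / distance ** 2
```

The repulsive force switches on as soon as an obstacle enters the influence radius and grows sharply at short range, which gives a sense of why an obstacle that appears suddenly from behind an occluder can produce the instantaneous virtual force the abstract describes.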

Inflated 3D Convolution-Transformer for Weakly-supervised Carotid Stenosis Grading with Ultrasound Videos

Jun 12, 2023
Xinrui Zhou, Yuhao Huang, Wufeng Xue, Xin Yang, Yuxin Zou, Qilong Ying, Yuanji Zhang, Jia Liu, Jie Ren, Dong Ni

Localization of the narrowest position of the vessel and corresponding vessel and remnant vessel delineation in carotid ultrasound (US) are essential for carotid stenosis grading (CSG) in clinical practice. However, the pipeline is time-consuming and difficult due to the ambiguous boundaries of plaque and temporal variation. To automate this procedure, a large number of manual delineations are usually required, which is not only laborious but also unreliable given the annotation difficulty. In this study, we present the first video classification framework for automatic CSG. Our contribution is three-fold. First, to avoid the requirement of laborious and unreliable annotation, we propose a novel and effective video classification network for weakly-supervised CSG. Second, to ease the model training, we adopt an inflation strategy for the network, where pre-trained 2D convolution weights can be adapted into the 3D counterpart in our network for an effective warm start. Third, to enhance the feature discrimination of the video, we propose a novel attention-guided multi-dimension fusion (AMDF) transformer encoder to model and integrate global dependencies within and across spatial and temporal dimensions, where two lightweight cross-dimensional attention mechanisms are designed. Our approach is extensively validated on a large clinically collected carotid US video dataset, demonstrating state-of-the-art performance compared with strong competitors.

* Accepted by MICCAI 2023
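
The inflation strategy in the abstract can be illustrated with a small sketch. It assumes the common I3D-style rule, in which a 2D kernel is repeated along a new temporal axis and divided by the temporal depth; the abstract does not confirm that the paper uses exactly this rule. Kernels are plain nested lists here to keep the example self-contained, and the helper names are illustrative.

```python
def inflate_kernel(kernel_2d, depth):
    """Repeat a 2D kernel `depth` times along a new temporal axis, scaled by 1/depth."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return [[[w / depth for w in row] for row in kernel_2d] for _ in range(depth)]


def conv2d_response(kernel_2d, patch):
    """Response of a 2D kernel on one patch of the same size."""
    return sum(
        w * x
        for k_row, p_row in zip(kernel_2d, patch)
        for w, x in zip(k_row, p_row)
    )


def conv3d_response(kernel_3d, clip):
    """Response of a 3D kernel on a clip with one patch per temporal slice."""
    return sum(conv2d_response(k, frame) for k, frame in zip(kernel_3d, clip))
```

With the 1/depth scaling, a clip made of identical frames produces the same response from the inflated kernel as the original 2D kernel does on a single frame, so the pre-trained behaviour is preserved at initialisation and only temporal variation has to be learned.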

TrojText: Test-time Invisible Textual Trojan Insertion

Mar 03, 2023
Yepeng Liu, Bo Feng, Qian Lou

In Natural Language Processing (NLP), intelligent neuron models can be susceptible to textual Trojan attacks. Such attacks occur when Trojan models behave normally for standard inputs but generate malicious output for inputs that contain a specific trigger. Syntactic-structure triggers, which are invisible, are becoming more popular for Trojan attacks because they are difficult to detect and defend against. However, these types of attacks require a large corpus of training data to generate poisoned samples with the necessary syntactic structures for Trojan insertion. Obtaining such data can be difficult for attackers, and the process of generating syntactic poisoned triggers and inserting Trojans can be time-consuming. This paper proposes a solution called TrojText, which aims to determine whether invisible textual Trojan attacks can be performed more efficiently and cost-effectively without training data. The proposed approach, called the Representation-Logit Trojan Insertion (RLI) algorithm, uses smaller sampled test data instead of large training data to achieve the desired attack. The paper also introduces two additional techniques, namely the accumulated gradient ranking (AGR) and Trojan Weights Pruning (TWP), to reduce the number of tuned parameters and the attack overhead. The TrojText approach was evaluated on three datasets (AG's News, SST-2, and OLID) using three NLP models (BERT, XLNet, and DeBERTa). The experiments demonstrated that the TrojText approach achieved a 98.35% classification accuracy for test sentences in the target class on the BERT model for the AG's News dataset. The source code for TrojText is available at https://github.com/UCF-ML-Research/TrojText.

* ICLR 2023 Camera Ready

Ti-MAE: Self-Supervised Masked Time Series Autoencoders

Jan 21, 2023
Zhe Li, Zhongwen Rao, Lujia Pan, Pengyun Wang, Zenglin Xu

Multivariate time series forecasting has been an increasingly popular topic in various applications and scenarios. Recently, contrastive learning and Transformer-based models have achieved good performance in many long-term series forecasting tasks. However, there are still several issues in existing methods. First, the training paradigms of contrastive learning and downstream prediction tasks are inconsistent, leading to inaccurate prediction results. Second, existing Transformer-based models, which resort to similar patterns in historical time series data for predicting future values, generally induce severe distribution shift problems and do not fully leverage the sequence information compared to self-supervised methods. To address these issues, we propose a novel framework named Ti-MAE, in which the input time series are assumed to follow an integrate distribution. In detail, Ti-MAE randomly masks out embedded time series data and learns an autoencoder to reconstruct them at the point level. Ti-MAE adopts mask modeling (rather than contrastive learning) as the auxiliary task and bridges the connection between existing representation learning and generative Transformer-based methods, reducing the difference between upstream and downstream forecasting tasks while maintaining the utilization of original time series data. Experiments on several public real-world datasets demonstrate that our framework of masked autoencoding could learn strong representations directly from the raw data, yielding better performance in time series forecasting and classification tasks.

* 20 pages, 7 figures
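
The masking step described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' code: it masks time steps uniformly at random and scores a reconstruction only at the masked positions, which is the usual masked-autoencoder convention and an assumption here. The encoder and decoder are omitted, and the names `random_mask` and `masked_mse` are hypothetical.

```python
import random


def random_mask(num_steps, mask_ratio, seed=0):
    """Split time-step indices into visible and masked sets, uniformly at random."""
    rng = random.Random(seed)
    num_masked = int(num_steps * mask_ratio)
    masked = sorted(rng.sample(range(num_steps), num_masked))
    masked_set = set(masked)
    visible = [t for t in range(num_steps) if t not in masked_set]
    return visible, masked


def masked_mse(series, reconstruction, masked):
    """Point-level mean squared error, computed only at the masked time steps."""
    if not masked:
        raise ValueError("at least one time step must be masked")
    return sum((series[t] - reconstruction[t]) ** 2 for t in masked) / len(masked)
```

In a full pipeline, only the embeddings at the `visible` indices would be passed to the encoder, and the decoder would be trained to minimise `masked_mse` on the held-out points.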

High-Throughput AI Inference for Medical Image Classification and Segmentation using Intelligent Streaming

May 24, 2023
Pranav Kulkarni, Sean Garin, Adway Kanhere, Eliot Siegel, Paul H. Yi, Vishwa S. Parekh

As the adoption of AI systems within the clinical setup grows, limitations in bandwidth could create communication bottlenecks when streaming imaging data, leading to delays in patient diagnosis and treatment. As such, healthcare providers and AI vendors will require greater computational infrastructure, therefore dramatically increasing costs. To that end, we developed intelligent streaming, a state-of-the-art framework to enable accelerated, cost-effective, bandwidth-optimized, and computationally efficient AI inference for clinical decision making at scale. For classification, intelligent streaming reduced the data transmission by 99.01% and decoding time by 98.58%, while increasing throughput by 27.43x. For segmentation, our framework reduced data transmission by 90.32%, decoding time by 90.26%, while increasing throughput by 4.20x. Our work demonstrates that intelligent streaming results in faster turnaround times, and reduced overall cost of data and transmission, without negatively impacting clinical decision making using AI systems.

* 10 pages, 2 figures, 4 tables, patent pending

Neural Lyapunov and Optimal Control

May 24, 2023
Daniel Layeghi, Steve Tonneau, Michael Mistry

Optimal control (OC) is an effective approach to controlling complex dynamical systems. However, traditional approaches to parameterising and learning controllers in optimal control have been ad hoc, collecting data and fitting it to neural networks. This can lead to learnt controllers ignoring constraints like optimality and time variability. We introduce a unified framework that simultaneously solves control problems while learning corresponding Lyapunov or value functions. Our method formulates OC-like mathematical programs based on the Hamilton-Jacobi-Bellman (HJB) equation. We leverage the HJB optimality constraint and its relaxation to learn time-varying value and Lyapunov functions, implicitly ensuring the inclusion of constraints. We show the effectiveness of our approach on linear and nonlinear control-affine problems. Additionally, we demonstrate significant reductions in planning horizons (up to a factor of 25) when incorporating the learnt functions into Model Predictive Controllers.
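
As a textbook illustration of the HJB optimality constraint the abstract refers to (not the paper's method, which learns time-varying functions with neural networks), consider the scalar, time-invariant, infinite-horizon problem with dynamics dx/dt = a*x + b*u and running cost q*x^2 + r*u^2. With the quadratic value function V(x) = p*x^2, the stationary HJB equation requires that the minimum over u of the running cost plus V'(x) times the dynamics be zero. The sketch below checks this numerically; all names are illustrative.

```python
import math


def riccati_gain(a, b, q, r):
    """Positive root p of the scalar Riccati equation b^2 p^2 / r - 2 a p - q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def optimal_control(x, p, b, r):
    """Minimiser of the HJB right-hand side for V(x) = p x^2."""
    return -b * p * x / r


def hjb_residual(x, u, p, a, b, q, r):
    """Running cost plus V'(x) times the dynamics, with V(x) = p x^2 so V'(x) = 2 p x."""
    return q * x * x + r * u * u + 2.0 * p * x * (a * x + b * u)
```

At the optimal control the residual vanishes, and for any other control it is strictly positive. Relaxing the equality to the inequality "residual at the chosen control is at most zero for a modified cost" is one common way to obtain a Lyapunov-type decrease condition; whether that matches the relaxation used in the paper is not stated in the abstract.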

Black-Box vs. Gray-Box: A Case Study on Learning Table Tennis Ball Trajectory Prediction with Spin and Impacts

May 24, 2023
Jan Achterhold, Philip Tobuschat, Hao Ma, Dieter Buechler, Michael Muehlebach, Joerg Stueckler

In this paper, we present a method for table tennis ball trajectory filtering and prediction. Our gray-box approach builds on a physical model. At the same time, we use data to learn parameters of the dynamics model, of an extended Kalman filter, and of a neural model that infers the ball's initial condition. We demonstrate superior prediction performance of our approach over two black-box approaches, which are not supplied with physical prior knowledge. We demonstrate that initializing the spin from parameters of the ball launcher using a neural network drastically improves long-time prediction performance over estimating the spin purely from measured ball positions. An accurate prediction of the ball trajectory is crucial for successful returns. We therefore evaluate the return performance with a pneumatic artificial muscular robot and achieve a return rate of 29/30 (96.7%).

* Accepted for publication at the 5th Annual Conference on Learning for Dynamics and Control (L4DC) 2023. With supplementary material

Temporal Dynamic Quantization for Diffusion Models

Jun 04, 2023
Junhyuk So, Jungwon Lee, Daehyun Ahn, Hyungjun Kim, Eunhyeok Park

The diffusion model has gained popularity in vision applications due to its remarkable generative performance and versatility. However, high storage and computation demands, resulting from the model size and iterative generation, hinder its use on mobile devices. Existing quantization techniques struggle to maintain performance even in 8-bit precision due to the diffusion model's unique property of temporal variation in activation. We introduce a novel quantization method that dynamically adjusts the quantization interval based on time step information, significantly improving output quality. Unlike conventional dynamic quantization techniques, our approach has no computational overhead during inference and is compatible with both post-training quantization (PTQ) and quantization-aware training (QAT). Our extensive experiments demonstrate substantial improvements in output quality with the quantized diffusion model across various datasets.
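
A minimal sketch of the idea of a time-step-dependent quantization interval, under loud assumptions: the abstract does not say how the interval is derived from the time step, so this example swaps in a simple calibration rule (maximum absolute activation per step divided by the largest integer level) and stores the result in a lookup table built offline. The function names and the calibration rule are hypothetical.

```python
def quantize(x, interval, num_bits=8):
    """Symmetric uniform quantizer: snap x to the grid and clamp to the signed range."""
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -qmax - 1
    level = max(qmin, min(qmax, round(x / interval)))
    return level * interval


def build_interval_table(max_abs_by_step, num_bits=8):
    """Assumed calibration rule: one interval per diffusion time step, computed offline."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max_abs / qmax for max_abs in max_abs_by_step]


def quantize_at_step(x, step, interval_table, num_bits=8):
    """Quantize an activation with the interval stored for the given time step."""
    return quantize(x, interval_table[step], num_bits)
```

Since the time step is known before each denoising iteration, the interval is a table lookup rather than a statistic computed from the activations at run time, which is consistent with the abstract's claim of no extra inference overhead. When activation ranges differ strongly between steps, a per-step interval keeps small activations from collapsing to zero the way they can under a single static interval.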
