"Time": models, code, and papers

MorpheusNet: Resource efficient sleep stage classifier for embedded on-line systems

Jan 14, 2024
Ali Kavoosi, Morgan P. Mitchell, Raveen Kariyawasam, John E. Fleming, Penny Lewis, Heidi Johansen-Berg, Hayriye Cagnan, Timothy Denison

Sleep Stage Classification (SSC) is a labor-intensive task, requiring experts to examine hours of electrophysiological recordings for manual classification. This is a limiting factor when it comes to leveraging sleep stages for therapeutic purposes. With increasing affordability and expansion of wearable devices, automating SSC may enable deployment of sleep-based therapies at scale. Deep Learning has gained increasing attention as a potential method to automate this process. Previous research has shown accuracy comparable to manual expert scores. However, previous approaches require a sizable amount of memory and computational resources. This constrains the ability to classify in real time and deploy models on the edge. To address this gap, we aim to provide a model capable of predicting sleep stages in real time, without requiring access to external computational resources (e.g., mobile phone, cloud). The algorithm is power efficient to enable use on embedded battery-powered systems. Our compact sleep stage classifier can be deployed on most off-the-shelf microcontrollers (MCU) with constrained hardware settings. This is possible because our approach has a small memory footprint and requires significantly fewer operations. The model was tested on three publicly available databases and achieved performance comparable to the state of the art, whilst reducing model complexity by orders of magnitude (up to 280 times smaller compared to the state of the art). We further optimized the model by quantizing its parameters to 8 bits, with an average drop of only 0.95% in accuracy. When implemented in firmware, the quantized model achieves a latency of 1.6 seconds on an Arm Cortex-M4 processor, allowing its use for on-line SSC-based therapies.

* This paper was presented at the 2023 IEEE conference on Systems, Man, and Cybernetics (SMC)
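
The 8-bit quantization step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes symmetric per-tensor scaling, which is one common scheme; the abstract does not say which scheme the authors used, and the function names and example weights here are hypothetical.

```python
# Minimal sketch of symmetric 8-bit quantization of model parameters.
# Assumption: one scale factor per tensor, values clamped to [-127, 127].
# This is an illustration, not the MorpheusNet implementation.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]

# Hypothetical weights, chosen only to show the round trip.
weights = [0.5, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The worst-case rounding error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each parameter then occupies one byte instead of four, which is where the memory saving on a microcontroller would come from; the accuracy cost depends on how much the rounding error matters to the model.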

Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI

Dec 21, 2023
Kai Huang, Wei Gao

With the wide adoption of AI applications, there is a pressing need to enable real-time neural network (NN) inference on small embedded devices, but deploying NNs and achieving high-performance NN inference on these devices is challenging due to their extremely weak capabilities. Although NN partitioning and offloading can contribute to such deployment, they are incapable of minimizing the local costs at embedded devices. Instead, we propose to address this challenge via agile NN offloading, which migrates the required computations in NN offloading from online inference to offline learning. In this paper, we present AgileNN, a new NN offloading technique that achieves real-time NN inference on weak embedded devices by leveraging eXplainable AI techniques to explicitly enforce feature sparsity during the training phase and minimize the online computation and communication costs. Experimental results show that AgileNN's inference latency is >6x lower than that of existing schemes, ensuring that sensory data on embedded devices can be consumed in a timely manner. It also reduces the local device's resource consumption by >8x, without impairing the inference accuracy.

* published at ACM MobiCom 2022. 14 pages

Optimizing Visible Light Communication Efficiency Through Reinforcement Learning-Based NOMA-CSK Integration

Jan 18, 2024
Serkan Vela, Gokce Hacioglu

In this paper, we explore the use of Non-Orthogonal Multiple Access (NOMA) and Color Shift Keying (CSK) for Visible Light Communication (VLC) systems. VLC is a wireless communication technology that uses visible light as the carrier signal to transmit information. It has several advantages over traditional radio frequency communication, including higher bandwidth, lower interference, and greater security. We first provide an introduction to NOMA and CSK and explain how they can be applied to VLC systems. NOMA is a technique that allows multiple users to share the same frequency channel by allocating different power levels to each user. This enables more users to connect to a single VLC transmitter simultaneously, thereby improving system capacity and spectral efficiency. CSK, on the other hand, is a modulation technique that uses different colors of light to represent digital information. By changing the color of the transmitted signal, information can be encoded and decoded at the receiver. Next, we discuss how NOMA and CSK can be combined in VLC systems by using different power levels to represent different users. This allows for more efficient use of the frequency spectrum, as multiple users can share the same channel at the same time. Additionally, we examine the potential benefits of using NOMA and CSK together in VLC systems to increase data rate. Finally, we discuss how reinforcement learning, a machine learning technique used to train agents to make decisions based on environmental feedback, can be used to optimize NOMA-CSK-VLC networks by allowing agents to learn and adapt to changing network conditions. Overall, our paper provides insights into the benefits of combining NOMA and CSK for VLC systems, highlighting the potential for improving communication efficiency and performance.
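
The power-level sharing described in the abstract above can be sketched as a noiseless two-user toy example of power-domain NOMA with successive interference cancellation (SIC). It is a textbook-style illustration, not the authors' system: the power split (0.8 / 0.2), the binary ±1 symbols, and all function names are assumptions, and it ignores VLC specifics such as the non-negative light intensity constraint and the CSK color dimension.

```python
import math

# Toy power-domain NOMA: two users share one channel use, separated by power.
# Assumptions: noiseless channel, symbols are +1 or -1, fixed power split.

def superpose(s_far, s_near, p_far=0.8, p_near=0.2):
    """Transmit both users' symbols at once, with more power for the far user."""
    return math.sqrt(p_far) * s_far + math.sqrt(p_near) * s_near

def sign(x):
    return 1 if x >= 0 else -1

def decode_far(y):
    """Far user: its high-power symbol dominates, so the rest is treated as noise."""
    return sign(y)

def decode_near(y, p_far=0.8):
    """Near user: decode the far user's symbol, subtract it, then decode its own (SIC)."""
    s_far_hat = sign(y)
    residual = y - math.sqrt(p_far) * s_far_hat
    return sign(residual)

# Check every symbol combination: both users should recover their own symbol.
results = []
for s_far in (1, -1):
    for s_near in (1, -1):
        y = superpose(s_far, s_near)
        results.append((decode_far(y), decode_near(y)) == (s_far, s_near))
print(all(results))
```

In a learning-based setup such as the one the abstract mentions, the power split would presumably be among the quantities an agent adjusts as channel conditions change; here it is fixed for clarity.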

StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Dec 19, 2023
Akio Kodaira, Chenfeng Xu, Toshiki Hazama, Takanori Yoshimoto, Kohei Ohno, Shogo Mitsuhori, Soichi Sugano, Hanying Cho, Zhijian Liu, Kurt Keutzer

We introduce StreamDiffusion, a real-time diffusion pipeline designed for interactive image generation. Existing diffusion models are adept at creating images from text or image prompts, yet they often fall short in real-time interaction. This limitation becomes particularly evident in scenarios involving continuous input, such as Metaverse, live video streaming, and broadcasting, where high throughput is imperative. To address this, we present a novel approach, Stream Batch, that transforms the original sequential denoising into a batched denoising process. Stream Batch eliminates the conventional wait-and-interact approach and enables fluid, high-throughput streams. To handle the frequency disparity between data input and model throughput, we design a novel input-output queue for parallelizing the streaming process. Moreover, the existing diffusion pipeline uses classifier-free guidance (CFG), which requires additional U-Net computation. To mitigate the redundant computations, we propose a novel residual classifier-free guidance (RCFG) algorithm that reduces the number of negative conditional denoising steps to only one or even zero. In addition, we introduce a stochastic similarity filter (SSF) to optimize power consumption. Our Stream Batch achieves around 1.5x speedup compared to the sequential denoising method at different denoising levels. The proposed RCFG leads to speeds up to 2.05x higher than the conventional CFG. Combining the proposed strategies with existing mature acceleration tools lets image-to-image generation reach up to 91.07 fps on one RTX4090, improving on the throughput of AutoPipeline developed by Diffusers by over 59.56x. Furthermore, our proposed StreamDiffusion also significantly reduces energy consumption, by 2.39x on one RTX3060 and 1.99x on one RTX4090.

* tech report, the code is available at https://github.com/cumulo-autumn/StreamDiffusion
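
To see why conventional classifier-free guidance costs extra U-Net computation, as the abstract above notes, here is a minimal sketch of the standard CFG combination rule. The `toy_denoiser` function is a hypothetical stand-in for a real noise predictor and only counts its own calls; this shows conventional CFG, not the paper's RCFG algorithm.

```python
# Sketch of conventional classifier-free guidance (CFG) for one denoising step.
# Assumption: toy_denoiser is a made-up placeholder for a U-Net noise predictor.

def toy_denoiser(x, condition, counter):
    """Pretend noise prediction; increments a counter on every call."""
    counter["calls"] += 1
    bias = 0.0 if condition is None else 0.1 * len(condition)
    return [0.5 * v + bias for v in x]

def cfg_noise(x, prompt, guidance_scale, counter):
    """Standard CFG: eps = eps_uncond + w * (eps_cond - eps_uncond)."""
    eps_uncond = toy_denoiser(x, None, counter)    # unconditional (negative) branch
    eps_cond = toy_denoiser(x, prompt, counter)    # conditional branch
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

counter = {"calls": 0}
latent = [1.0, -2.0]
eps = cfg_noise(latent, "cat", guidance_scale=2.0, counter=counter)
print(eps, counter["calls"])
```

Each guided step evaluates the denoiser twice, once per branch. Per the abstract, RCFG cuts the negative-branch evaluations to one or zero across the whole process; how it does so is described in the paper and is not reproduced here.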

Jan 17, 2024
Dunyuan Xu, Xi Wang, Jinyue Cai, Pheng-Ann Heng

Brain tumors are among the most fatal cancers worldwide and are very common in children and the elderly. Accurate identification of the type and grade of a tumor in the early stages plays an important role in choosing a precise treatment plan. The Magnetic Resonance Imaging (MRI) protocols of different sequences provide clinicians with important complementary information to identify tumor regions. However, manual assessment is time-consuming and error-prone due to the large amount of data and the diversity of brain tumor types. Hence, there is an unmet need for automated MRI-based brain tumor diagnosis. We observe that the predictive capability of uni-modality models is limited and their performance varies widely across modalities, and that commonly used modality fusion methods can introduce noise, which results in significant performance degradation. To overcome these challenges, we propose a novel cross-modality guidance-aided multi-modal learning approach with dual attention for the task of MRI brain tumor grading. To balance the tradeoff between model efficiency and efficacy, we employ ResNet Mix Convolution as the backbone network for feature extraction. In addition, dual attention is applied to capture the semantic interdependencies in the spatial and slice dimensions, respectively. To facilitate information interaction among modalities, we design a cross-modality guidance-aided module in which the primary modality guides the secondary modalities during training, which effectively leverages the complementary information of different MRI modalities while alleviating the impact of possible noise.
