Xiangxi Mo

Context-Aware Streaming Perception in Dynamic Environments

Aug 16, 2022
Gur-Eyal Sela, Ionel Gog, Justin Wong, Kumar Krishna Agrawal, Xiangxi Mo, Sukrit Kalra, Peter Schafhalter, Eric Leong, Xin Wang, Bharathan Balaji, Joseph Gonzalez, Ion Stoica

Work on efficient vision maximizes accuracy under a latency budget, evaluating accuracy offline, one image at a time. However, real-time vision applications such as autonomous driving operate in streaming settings, where the ground truth changes between the start and finish of inference, resulting in a significant accuracy drop. A recent work therefore proposed maximizing streaming accuracy on average. In this paper, we propose maximizing streaming accuracy for every environment context. We posit that scenario difficulty influences the initial (offline) accuracy difference, while obstacle displacement in the scene drives the subsequent accuracy degradation. Our method, Octopus, uses these scenario properties to select, at test time, the configuration that maximizes streaming accuracy. It improves tracking performance (S-MOTA) by 7.4% over the conventional static approach. Moreover, this improvement comes in addition to, and not instead of, advances in offline accuracy.

* 26 pages, 10 figures, to be published in ECCV 2022 
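To make the selection mechanism concrete, here is a minimal sketch of context-aware configuration selection in the spirit of the abstract. All names and numbers (Config, predicted_streaming_accuracy, the staleness penalty model) are illustrative assumptions, not the paper's actual API or accuracy model.

```python
from dataclasses import dataclass

@dataclass
class Config:
    name: str
    offline_accuracy: float  # accuracy with latency ignored
    latency_ms: float        # end-to-end inference latency

# Candidate perception configurations (numbers are made up).
CONFIGS = [
    Config("fast", 0.62, 25.0),
    Config("balanced", 0.70, 60.0),
    Config("accurate", 0.76, 120.0),
]

def predicted_streaming_accuracy(cfg, difficulty, displacement):
    """Offline accuracy, discounted by scenario difficulty and by how far
    obstacles move while the (stale) result is being computed."""
    offline = cfg.offline_accuracy * (1.0 - 0.2 * difficulty)
    staleness = displacement * (cfg.latency_ms / 1000.0)
    return offline - staleness

def select_config(difficulty, displacement):
    """Pick the configuration that maximizes predicted streaming accuracy
    for the current environment context."""
    return max(CONFIGS,
               key=lambda c: predicted_streaming_accuracy(c, difficulty, displacement))

# Fast-moving scenes favor low-latency configs; slow, hard scenes favor accurate ones.
print(select_config(difficulty=0.2, displacement=5.0).name)  # -> fast
print(select_config(difficulty=0.8, displacement=0.5).name)  # -> accurate
```

The design point mirrors the abstract: slower configurations win on offline accuracy but pay a staleness penalty that grows with obstacle displacement, so the best choice flips with the scenario.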

Pay Attention to Convolution Filters: Towards Fast and Accurate Fine-Grained Transfer Learning

Jun 12, 2019
Xiangxi Mo, Ruizhe Cheng, Tianyi Fang

We propose an efficient transfer learning method for adapting an ImageNet pre-trained Convolutional Neural Network (CNN) to fine-grained image classification tasks. Conventional transfer learning methods typically face a trade-off between training time and accuracy. By adding an "attention module" to each convolutional filter of the pre-trained network, we are able to rank and adjust the importance of each convolutional signal in an end-to-end pipeline. In this report, we show that our method can adapt a pre-trained ResNet50 to a fine-grained transfer learning task within a few epochs, achieving accuracy above conventional transfer learning methods and close to that of models trained from scratch. Our model also offers interpretable results: the ranking of the convolutional signals shows which convolution channels are utilized and amplified to achieve better classification results, and which signals should be treated as noise for the specific transfer learning task and could be pruned to reduce model size.
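As a rough illustration of the mechanism described above, here is a minimal PyTorch sketch of a per-channel attention module that gates the outputs of a frozen pre-trained convolution. The module name (FilterAttention) and the training setup are assumptions for illustration, not the authors' released code.

```python
# Minimal sketch (assumed, not the paper's code): one learnable gate per
# output channel of a frozen pre-trained conv layer, trained end-to-end.
import torch
import torch.nn as nn

class FilterAttention(nn.Module):
    """Learns one gate per output channel, so each convolutional signal
    can be amplified or suppressed during transfer learning."""
    def __init__(self, num_channels: int):
        super().__init__()
        # One importance score per channel, initialized to neutral (1.0).
        self.gates = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width); rescale each channel map.
        return x * self.gates.view(1, -1, 1, 1)

# Freeze a pre-trained conv block; only the gates (and, in practice,
# a new classifier head) would be trained on the fine-grained task.
conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
for p in conv.parameters():
    p.requires_grad = False
attn = FilterAttention(num_channels=64)

x = torch.randn(1, 3, 224, 224)
out = attn(conv(x))  # gated feature maps, shape (1, 64, 112, 112)

# After training, sorting attn.gates ranks which channels the task
# actually uses; near-zero gates mark channels that could be pruned.
```

This also shows where the interpretability claim comes from: the learned gate values directly rank channel importance for the target task.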


The OoO VLIW JIT Compiler for GPU Inference

Jan 31, 2019
Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E. Gonzalez, Ion Stoica

Current trends in Machine Learning (ML) inference on hardware-accelerated devices (e.g., GPUs, TPUs) point to alarmingly low utilization. As ML inference is increasingly time-bounded by tight latency SLOs, increasing data parallelism is not an option. The need for better efficiency motivates GPU multiplexing. Furthermore, existing GPU programming abstractions force programmers to micro-manage GPU resources in an early-binding, context-free fashion. We propose a VLIW-inspired Out-of-Order (OoO) Just-in-Time (JIT) compiler that coalesces and reorders execution kernels at runtime for throughput-optimal device utilization while satisfying latency SLOs. We quantify the inefficiencies of space-only and time-only multiplexing alternatives and demonstrate an achievable 7.7x opportunity gap through spatial coalescing.
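To give a flavor of the scheduling idea, the toy sketch below reorders queued kernels by SLO deadline (out-of-order execution) and greedily packs kernels into shared "wide" slots while spare GPU capacity remains (spatial coalescing). Everything here, from the Kernel fields to the fractional capacity model, is a simplified assumption; the paper's compiler operates on real GPU kernels and streams, not this abstraction.

```python
import heapq
import time
from dataclasses import dataclass, field

@dataclass(order=True)
class Kernel:
    deadline: float                             # absolute SLO deadline; heap priority
    name: str = field(compare=False)
    gpu_fraction: float = field(compare=False)  # share of the GPU the kernel occupies

def schedule(kernels):
    """Greedy out-of-order scheduler: take the earliest-deadline kernel,
    then coalesce further kernels into the same slot while they fit."""
    heapq.heapify(kernels)
    plan = []
    while kernels:
        head = heapq.heappop(kernels)
        slot, used, deferred = [head], head.gpu_fraction, []
        while kernels:
            k = heapq.heappop(kernels)
            if used + k.gpu_fraction <= 1.0:
                slot.append(k)      # fits alongside the current slot: coalesce
                used += k.gpu_fraction
            else:
                deferred.append(k)  # does not fit: defer to a later slot
        for k in deferred:
            heapq.heappush(kernels, k)
        plan.append(slot)
    return plan

now = time.time()
queue = [
    Kernel(now + 0.010, "resnet_conv1", 0.5),
    Kernel(now + 0.050, "bert_layer3", 0.6),
    Kernel(now + 0.012, "resnet_conv2", 0.4),
]
for i, slot in enumerate(schedule(queue)):
    print(f"slot {i}:", [k.name for k in slot])
# slot 0: ['resnet_conv1', 'resnet_conv2']  (coalesced, earliest deadlines)
# slot 1: ['bert_layer3']
```

The two small kernels share a slot because together they fit within the device, which is the intuition behind the 7.7x opportunity gap from spatial coalescing.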
