David Liu

Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning

May 25, 2022

Chong Ma, Lin Zhao, Yuzhong Chen, Lu Zhang, Zhenxiang Xiao, Haixing Dai, David Liu, Zihao Wu, Zhengliang Liu, Sheng Wang(+8 more)

Abstract: Learning harmful shortcuts such as spurious correlations and biases prevents deep neural networks from learning meaningful and useful representations, thus jeopardizing the generalizability and interpretability of the learned representation. The situation becomes even more serious in medical imaging, where clinical data (e.g., MR images with pathology) are scarce while reliability, generalizability and transparency of the learned model are essential. To address this problem, we propose to infuse human experts' intelligence and domain knowledge into the training of deep neural networks. The core idea is to use the visual attention information from expert radiologists to proactively guide the deep model to focus on regions with potential pathology and avoid being trapped in learning harmful shortcuts. To do so, we propose a novel eye-gaze-guided vision transformer (EG-ViT) for diagnosis with limited medical image data. We mask the input image patches that fall outside the radiologists' regions of interest and add a residual connection in the last encoder layer of EG-ViT to maintain the correlations of all patches. Experiments on two public datasets, INbreast and SIIM-ACR, demonstrate that our EG-ViT model can effectively learn and transfer experts' domain knowledge and achieve much better performance than baselines. Meanwhile, it successfully rectifies harmful shortcut learning and significantly improves the model's interpretability. In general, EG-ViT takes advantage of both human experts' prior knowledge and the power of deep neural networks. This work opens new avenues for advancing current artificial intelligence paradigms by infusing human intelligence.
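
The patch-masking step described in the abstract can be illustrated with a minimal sketch. It assumes the gaze data is available as a 2D heatmap aligned with the image, and that a patch is kept when its peak gaze value exceeds a threshold. The function names, the threshold rule, and the zero fill value are all hypothetical choices for illustration; the abstract does not state the paper's actual masking criterion, and the residual connection in the last encoder layer is not shown.

```python
def patch_keep_mask(gaze_map, patch_size, threshold=0.0):
    """Return one boolean per patch: True if the patch received gaze attention.

    gaze_map is a 2D list of non-negative gaze intensities (hypothetical input
    format). A patch counts as attended when its peak value exceeds threshold,
    which is an assumed rule, not the paper's.
    """
    height = len(gaze_map)
    width = len(gaze_map[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    mask = []
    for top in range(0, height, patch_size):
        row = []
        for left in range(0, width, patch_size):
            peak = max(
                gaze_map[y][x]
                for y in range(top, top + patch_size)
                for x in range(left, left + patch_size)
            )
            row.append(peak > threshold)
        mask.append(row)
    return mask


def apply_patch_mask(image, mask, patch_size, fill=0.0):
    """Return a copy of image with every non-attended patch replaced by fill."""
    return [
        [
            pixel if mask[y // patch_size][x // patch_size] else fill
            for x, pixel in enumerate(row)
        ]
        for y, row in enumerate(image)
    ]
```

In a vision transformer the same boolean mask could instead be used to drop or zero the corresponding patch tokens; masking pixels is simply the easiest variant to show without a deep learning framework.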

Feature-Align Network with Knowledge Distillation for Efficient Denoising

Mar 18, 2021

Lucas D. Young, Fitsum A. Reda, Rakesh Ranjan, Jon Morton, Jun Hu, Yazhu Ling, Xiaoyu Xiang, David Liu, Vikas Chandra

Abstract: We propose an efficient neural network for RAW image denoising. Although neural network-based denoising has been extensively studied for image restoration, little attention has been given to efficient denoising for compute-limited and power-sensitive devices, such as smartphones and smartwatches. In this paper, we present a novel architecture and a suite of training techniques for high-quality denoising on mobile devices. Our work is distinguished by three main contributions. (1) A Feature-Align layer that modulates the activations of an encoder-decoder architecture with the input noisy images. This auto-modulation layer enforces attention to spatially varying noise that tends to be "washed away" by successive application of convolutions and non-linearities. (2) A novel Feature Matching Loss that allows knowledge distillation from large denoising networks in the form of a perceptual content loss. (3) Empirical analysis of our efficient model trained to specialize on different noise subranges. This opens an additional avenue for model size reduction by sacrificing memory for compute. Extensive experimental validation shows that our efficient model produces high-quality denoising results that compete with state-of-the-art large networks, while using significantly fewer parameters and MACs. On the Darmstadt Noise Dataset benchmark, we achieve a PSNR of 48.28 dB, while using 263 times fewer MACs and 17.6 times fewer parameters than the state-of-the-art network, which achieves 49.12 dB.
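
To put the reported PSNR figures in context, the sketch below implements the standard PSNR definition, 10 * log10(MAX^2 / MSE), and converts a PSNR gap in dB into the corresponding ratio of mean squared errors. The function names and the flat-list signal format are assumptions for illustration, and this is not the benchmark's evaluation code. Because benchmark PSNR is typically averaged over images, the MSE ratio should be read as a rough indication only.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("signals must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_from_psnr_gap(gap_db):
    """MSE ratio implied by a PSNR gap, assuming the same peak value."""
    return 10.0 ** (gap_db / 10.0)


# Gap between the two PSNR values quoted in the abstract.
gap = 49.12 - 48.28
ratio = mse_ratio_from_psnr_gap(gap)
```

Under these assumptions, the 0.84 dB gap corresponds to the smaller model having roughly 1.2 times the mean squared error of the larger network.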

RAWLSNET: Altering Bayesian Networks to Encode Rawlsian Fair Equality of Opportunity

Mar 16, 2021

David Liu, Zohair Shafi, William Fleisher, Tina Eliassi-Rad, Scott Alfeld

Abstract: We present RAWLSNET, a system for altering Bayesian Network (BN) models to satisfy the Rawlsian principle of fair equality of opportunity (FEO). RAWLSNET's BN models generate aspirational data distributions: data generated to reflect an ideally fair, FEO-satisfying society. FEO states that everyone with the same talent and willingness to use it should have the same chance of achieving advantageous social positions (e.g., employment), regardless of their background circumstances (e.g., socioeconomic status). Satisfying FEO requires alterations to social structures such as school assignments. Our paper describes RAWLSNET, a method that takes as input a BN representation of an FEO application and alters the BN's parameters so as to satisfy FEO when possible, and minimize deviation from FEO otherwise. We also offer guidance for applying RAWLSNET, including on recognizing proper applications of FEO. We demonstrate the use of our system with publicly available data sets. RAWLSNET's altered BNs offer the novel capability of generating aspirational data for FEO-relevant tasks. Aspirational data are free from the biases of real-world data, and thus are useful for recognizing and detecting sources of unfairness in machine learning algorithms besides biased data.
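
The FEO condition stated in the abstract can be read as a constraint on conditional probabilities: for each talent level, the chance of reaching the advantageous position should not depend on background. The sketch below only checks that condition on a small table; it does not implement RAWLSNET's parameter alteration. The table layout, key names, and tolerance are hypothetical, and "talent" here stands in for talent together with willingness to use it.

```python
from collections import defaultdict


def feo_gaps(success_prob):
    """Per-talent spread in success probability across backgrounds.

    success_prob maps (talent, background) to the probability of reaching
    the advantageous position (hypothetical table layout).
    """
    by_talent = defaultdict(list)
    for (talent, _background), probability in success_prob.items():
        by_talent[talent].append(probability)
    return {talent: max(ps) - min(ps) for talent, ps in by_talent.items()}


def satisfies_feo(success_prob, tolerance=1e-9):
    """True if, for every talent level, all backgrounds have the same chance."""
    return all(gap <= tolerance for gap in feo_gaps(success_prob).values())


# Toy table: high-talent individuals fare better with a high-SES background.
unfair = {
    ("high", "low_ses"): 0.5,
    ("high", "high_ses"): 0.75,
    ("low", "low_ses"): 0.25,
    ("low", "high_ses"): 0.25,
}
```

The per-talent gaps also give a simple measure of deviation from FEO, which loosely mirrors the abstract's goal of minimizing deviation when exact satisfaction is not possible.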

* 12 pages

Evolving Antennas for Ultra-High Energy Neutrino Detection

May 15, 2020

Julie Rolla, Amy Connolly, Kai Staats, Stephanie Wissel, Dean Arakaki, Ian Best, Adam Blenk, Brian Clark, Maximillian Clowdus, Suren Gourapura(+9 more)

Abstract: Evolutionary algorithms borrow from biology the concepts of mutation and selection in order to evolve optimized solutions to known problems. The GENETIS collaboration is developing genetic algorithms for designing antennas that are more sensitive to ultra-high energy neutrino-induced radio pulses than current designs. There are three aspects of this investigation. The first is to evolve simple wire antennas to test the concept and different algorithms. Second, optimized antenna response patterns are evolved for a given array geometry. Finally, antennas themselves are evolved using neutrino sensitivity as a measure of fitness. This is achieved by integrating the XFdtd finite-difference time-domain modeling program with simulations of neutrino experiments.
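
The mutation-and-selection loop mentioned in the abstract can be sketched generically as follows. This is a toy illustration, not the GENETIS algorithm: the genome is a plain list of floats, selection keeps the top half of the population, mutation adds Gaussian noise, and the fitness function is a hypothetical stand-in for the antenna simulation and neutrino sensitivity calculation described above.

```python
import random


def evolve(fitness, genome_length, population_size=20, generations=50,
           mutation_scale=0.1, seed=0):
    """Evolve a list of floats to maximize fitness; return (best, history)."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1.0, 1.0) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    best_history = []
    for _ in range(generations):
        # Selection: rank by fitness and keep the top half unchanged.
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        survivors = population[: population_size // 2]
        # Mutation: each child is a noisy copy of a randomly chosen survivor.
        children = [
            [gene + rng.gauss(0.0, mutation_scale) for gene in rng.choice(survivors)]
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    best_history.append(fitness(population[0]))
    return population[0], best_history


def toy_fitness(genome, target=(0.5, -0.25, 0.0)):
    """Hypothetical fitness: negative squared distance to a fixed target."""
    return -sum((g - t) ** 2 for g, t in zip(genome, target))
```

Because survivors are carried over unmutated, the best fitness never decreases from one generation to the next. In a real design loop, each fitness evaluation would be an expensive simulation, so population size and generation count become the main cost drivers.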

* 8 pages including references, 6 figures, presented at 36th International Cosmic Ray Conference (ICRC 2019)

Automatic Vertebra Labeling in Large-Scale 3D CT using Deep Image-to-Image Network with Message Passing and Sparsity Regularization

May 17, 2017

Dong Yang, Tao Xiong, Daguang Xu, Qiangui Huang, David Liu, S. Kevin Zhou, Zhoubing Xu, JinHyeong Park, Mingqing Chen, Trac D. Tran(+3 more)

Abstract: Automatic localization and labeling of vertebrae in 3D medical images plays an important role in many clinical tasks, including pathological diagnosis, surgical planning and postoperative assessment. However, unusual conditions in pathological cases, such as abnormal spine curvature, bright imaging artifacts caused by metal implants, and a limited field of view, increase the difficulty of accurate localization. In this paper, we propose an automatic and fast algorithm to localize and label the vertebra centroids in 3D CT volumes. First, we deploy a deep image-to-image network (DI2IN) to initialize vertebra locations, employing a convolutional encoder-decoder architecture together with multi-level feature concatenation and deep supervision. Next, the centroid probability maps from DI2IN are iteratively evolved with message passing schemes based on the mutual relations of vertebra centroids. Finally, the localization results are refined with sparsity regularization. The proposed method is evaluated on a public dataset of 302 spine CT volumes with various pathologies. Our method outperforms other state-of-the-art methods in terms of localization accuracy. The run time is around 3 seconds on average per case. To further boost performance, we retrain the DI2IN on an additional 1000+ 3D CT volumes from different patients. To the best of our knowledge, this is the first time more than 1000 3D CT volumes with expert annotation have been used in experiments for anatomic landmark detection tasks. Our experimental results show that training with such a large dataset significantly improves performance, and the overall identification rate reaches 90%, for the first time to our knowledge.

Visual Search at Pinterest

Mar 08, 2017

Yushi Jing, David Liu, Dmitry Kislyuk, Andrew Zhai, Jiajing Xu, Jeff Donahue, Sarah Tavel

Abstract: We demonstrate that, with the availability of distributed computation platforms such as Amazon Web Services and open-source tools, it is possible for a small engineering team to build, launch and maintain a cost-effective, large-scale visual search system. We also demonstrate, through a comprehensive set of live experiments at Pinterest, that content recommendation powered by visual search improves user engagement. By sharing our implementation details and the experience gained from launching a commercial visual search engine from scratch, we hope visual search will be more widely incorporated into today's commercial applications.

* in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Human Curation and Convnets: Powering Item-to-Item Recommendations on Pinterest

Nov 12, 2015

Dmitry Kislyuk, Yuchen Liu, David Liu, Eric Tzeng, Yushi Jing

Abstract: This paper presents Pinterest Related Pins, an item-to-item recommendation system that combines collaborative filtering with content-based ranking. We demonstrate that signals derived from user curation, the activity of users organizing content, are highly effective when used in conjunction with content-based ranking. This paper also demonstrates the effectiveness of visual features, such as image or object representations learned from convnets, in improving the user engagement rate of our item-to-item recommendation system.
