Haibo Su

Where to Look and How to Describe: Fashion Image Retrieval with an Attentional Heterogeneous Bilinear Network

Oct 26, 2020

Haibo Su, Peng Wang, Lingqiao Liu, Hui Li, Zhen Li, Yanning Zhang

Abstract: Fashion products typically combine a variety of styles across different clothing parts. To distinguish images of different fashion products, we need to extract both appearance (i.e., "how to describe") and localization (i.e., "where to look") information, as well as their interactions. To this end, we propose a biologically inspired framework for image-based fashion product retrieval, which mimics the hypothesized two-stream visual processing system of the human brain. The proposed attentional heterogeneous bilinear network (AHBN) consists of two branches: a deep CNN branch to extract fine-grained appearance attributes and a fully convolutional branch to extract landmark localization information. A joint channel-wise attention mechanism is then applied to the extracted heterogeneous features to focus on important channels, followed by a compact bilinear pooling layer to model the interaction of the two streams. Our proposed framework achieves satisfactory performance on three image-based fashion product retrieval benchmarks.

* 11 pages, 8 figures
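
As a rough illustration of the compact bilinear pooling step mentioned in the abstract above, the sketch below uses the Tensor Sketch construction: each feature vector is count-sketched with a random hash and sign, and the two sketches are combined by circular convolution, which equals a count sketch of their outer product. This is not the authors' implementation. The function names, the plain-list vectors, and the tiny dimensions are assumptions made for illustration, and the channel-wise attention step is omitted.

```python
import random

def make_sketch_params(dim, out_dim, rng):
    """Draw a random bucket h[i] in [0, out_dim) and sign s[i] in {-1, +1} per input channel."""
    h = [rng.randrange(out_dim) for _ in range(dim)]
    s = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    return h, s

def count_sketch(x, h, s, out_dim):
    """Project x into out_dim buckets: bucket h[i] accumulates s[i] * x[i]."""
    y = [0.0] * out_dim
    for i, v in enumerate(x):
        y[h[i]] += s[i] * v
    return y

def circular_convolve(a, b):
    """Direct O(d^2) circular convolution; practical code would typically use an FFT."""
    d = len(a)
    out = [0.0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out

def compact_bilinear(x, y, params_x, params_y, out_dim):
    """Approximate the outer product of x and y with an out_dim-dimensional vector."""
    sx = count_sketch(x, params_x[0], params_x[1], out_dim)
    sy = count_sketch(y, params_y[0], params_y[1], out_dim)
    return circular_convolve(sx, sy)

if __name__ == "__main__":
    rng = random.Random(0)
    appearance = [0.2, -1.0, 0.5, 0.7]   # hypothetical appearance-branch features
    landmarks = [1.0, 0.0, -0.3]         # hypothetical localization-branch features
    d = 8
    px = make_sketch_params(len(appearance), d, rng)
    py = make_sketch_params(len(landmarks), d, rng)
    print(compact_bilinear(appearance, landmarks, px, py, d))
```

The output length is set by `out_dim` rather than by the product of the two input dimensions, which is the reason for preferring a compact form over a full bilinear (outer-product) feature.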

To Tune or Not To Tune? How About the Best of Both Worlds?

Jul 09, 2019

Ran Wang, Haibo Su, Chunye Wang, Kailin Ji, Jupeng Ding

Abstract: The introduction of pre-trained language models has revolutionized natural language research communities. However, researchers still know relatively little about their theoretical and empirical properties. In this regard, Peters et al. perform several experiments which demonstrate that it is better to adapt BERT with a light-weight task-specific head, rather than building a complex one on top of the pre-trained language model, and freeze the parameters in the said language model. However, there is another option to adopt. In this paper, we propose a new adaptation method in which we first train the task model with the BERT parameters frozen and then fine-tune the entire model together. Our experimental results show that our adaptation method achieves a 4.7% accuracy improvement on the semantic similarity task, a 0.99% accuracy improvement on the sequence labeling task, and a 0.72% accuracy improvement on the text classification task.
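
The two-stage schedule described in the abstract above (train the task model with the pre-trained parameters frozen, then fine-tune everything together) can be sketched on a toy problem. The example below is a hypothetical stand-in, not the paper's setup: a single scalar `enc` plays the role of the pre-trained encoder, a scalar `head` plays the role of the task model, and the data, learning rate, and step counts are arbitrary assumptions.

```python
def predict(enc, head, x):
    """Toy model: a 'pre-trained' scalar encoder followed by a scalar task head."""
    return head * (enc * x)

def mse_and_grads(enc, head, data):
    """Mean squared error and its gradients with respect to enc and head."""
    n = len(data)
    loss = g_enc = g_head = 0.0
    for x, y in data:
        err = predict(enc, head, x) - y
        loss += err * err / n
        g_enc += 2.0 * err * head * x / n
        g_head += 2.0 * err * enc * x / n
    return loss, g_enc, g_head

def train(enc, head, data, steps, lr, freeze_encoder):
    """Plain gradient descent; the encoder is left untouched when frozen."""
    for _ in range(steps):
        _, g_enc, g_head = mse_and_grads(enc, head, data)
        if not freeze_encoder:
            enc -= lr * g_enc
        head -= lr * g_head
    return enc, head

def two_stage_adapt(enc, head, data, stage1_steps=50, stage2_steps=50, lr=0.05):
    # Stage 1: fit only the task head on top of the frozen encoder.
    enc, head = train(enc, head, data, stage1_steps, lr, freeze_encoder=True)
    # Stage 2: unfreeze the encoder and fine-tune both parts together.
    enc, head = train(enc, head, data, stage2_steps, lr, freeze_encoder=False)
    return enc, head

if __name__ == "__main__":
    data = [(0.5, 1.5), (1.0, 3.0), (1.5, 4.5)]  # toy targets: y = 3x
    enc, head = two_stage_adapt(enc=1.0, head=0.0, data=data)
    print(enc, head, mse_and_grads(enc, head, data)[0])
```

With a real pre-trained model, the same idea is usually expressed by excluding the encoder's parameters from gradient updates during the first stage and re-enabling them for the second.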
