Yanmei Wang

Vision and Language Integration for Domain Generalization

Apr 17, 2025

Yanmei Wang, Xiyao Liu, Fupeng Chu, Zhi Han

Abstract: Domain generalization trains on source domains to uncover a domain-invariant feature space, so that the model generalizes robustly to unknown target domains. However, domain gaps make it hard to find a reliable common image feature space, because images lack suitable basic units. Unlike images in vision space, language has comprehensive expressive elements that convey semantics effectively. Inspired by the semantic completeness of language and the intuitiveness of images, we propose VLCA, which combines language space and vision space and connects multiple image domains by using the semantic space as a bridge domain. Specifically, in language space, we exploit the completeness of language's basic units to capture the semantic relations between categories through word vector distances. Then, in vision space, we exploit the intuitiveness of image features to explore the common pattern of same-class sample features through low-rank approximation. Finally, the language representation is aligned with the vision representation through the multimodal space of text and image. Experiments demonstrate the effectiveness of the proposed method.
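
As a rough illustration of two steps the abstract describes, the sketch below computes pairwise cosine distances between class word vectors and a low-rank approximation of a same-class feature matrix. The word vectors, feature matrix, rank, and function names are all made-up placeholders. This is not the authors' VLCA implementation, and truncated SVD is only one common way to obtain a low-rank approximation; the abstract does not say which one the paper uses.

```python
import numpy as np

def class_relation_distances(word_vecs):
    """Pairwise cosine distances between class word vectors (one per row)."""
    unit = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

def low_rank_approx(features, rank):
    """Rank-`rank` approximation of `features` via truncated SVD."""
    u, s, vt = np.linalg.svd(features, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

rng = np.random.default_rng(0)

# Hypothetical 4-dimensional word vectors for three class names.
word_vecs = rng.normal(size=(3, 4))
dist = class_relation_distances(word_vecs)

# Hypothetical features: 20 samples of one class, 8 dimensions,
# generated as a rank-2 signal plus small noise.
basis = rng.normal(size=(2, 8))
features = rng.normal(size=(20, 2)) @ basis + 0.01 * rng.normal(size=(20, 8))
common = low_rank_approx(features, rank=2)
```

Here `dist` plays the role of the category relations in language space, and `common` plays the role of the shared same-class pattern in vision space. How the two are aligned in a joint text-image space is not covered by this sketch.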

Review helps learn better: Temporal Supervised Knowledge Distillation

Jul 03, 2023

Dongwei Wang, Zhi Han, Yanmei Wang, Xiai Chen, Baichen Liu, Yandong Tang

Abstract: Reviewing plays an important role in learning. Knowledge acquired at a given time point may draw heavily on previous experience, so the growth of knowledge should show a strong relationship along the temporal dimension. In our research, we find that during network training, the evolution of feature maps follows a temporal sequence property, and that proper temporal supervision may further improve training performance. Inspired by this observation, we design a novel knowledge distillation method. Specifically, we extract spatiotemporal features from different training phases of the student with a convolutional long short-term memory network (Conv-LSTM). We then train the student network against a dynamic target rather than static teacher network features. This process refines old knowledge in the student network and uses it to assist current learning. Extensive experiments across various network architectures and different tasks (image classification and object detection) verify the effectiveness and advantages of our method over existing knowledge distillation methods.

* Under review in NIPS 2023
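
The idea of supervising the student with a dynamic target built from its own earlier feature maps can be sketched very loosely as follows. This is not the paper's method: the Conv-LSTM is replaced here by a fixed exponential weighting of past snapshots, purely as a stand-in, and the feature shapes, `decay` value, and function names are hypothetical.

```python
import numpy as np

def dynamic_target(history, decay=0.5):
    """Blend past student feature maps, weighting the newest most.

    Stand-in for the Conv-LSTM in the abstract: a fixed exponential
    weighting instead of a learned recurrent model.
    """
    n = len(history)
    weights = np.array([decay ** (n - 1 - i) for i in range(n)])
    weights /= weights.sum()
    # Weighted sum over the snapshot axis; result has one snapshot's shape.
    return np.tensordot(weights, np.stack(history), axes=1)

def distill_loss(current, target):
    """Mean squared error between current features and the dynamic target."""
    return float(np.mean((current - target) ** 2))

rng = np.random.default_rng(0)

# Hypothetical student feature maps (channels, height, width)
# saved at three earlier training phases, oldest first.
history = [rng.normal(size=(4, 8, 8)) for _ in range(3)]
current = rng.normal(size=(4, 8, 8))

target = dynamic_target(history)
loss = distill_loss(current, target)
```

In a real training loop the target would be recomputed as new snapshots arrive, which is what makes it dynamic rather than a fixed set of teacher features.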
