Daizong Liu

Wangxuan Institute of Computer Technology, Peking University

A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends

Jul 10, 2024

A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions

Jun 09, 2024

Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding

Nov 06, 2023

Dense Object Grounding in 3D Scenes

Sep 05, 2023

3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack

Aug 15, 2023

From Region to Patch: Attribute-Aware Foreground-Background Contrastive Learning for Fine-Grained Fashion Retrieval

May 17, 2023

Transform-Equivariant Consistency Learning for Temporal Sentence Grounding

May 06, 2023

Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection

Apr 19, 2023

You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos

Mar 16, 2023

Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos

Mar 15, 2023