Yezhou Yang

Arizona State University

TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives

Nov 04, 2024
Via arXiv

VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks

Oct 17, 2024
Via arXiv

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions

Oct 17, 2024
Via arXiv

Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?

Oct 17, 2024
Via arXiv

TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning

Sep 30, 2024
Via arXiv

Latent Space Energy-based Neural ODEs

Sep 05, 2024
Via arXiv

Roundabout Dilemma Zone Data Mining and Forecasting with Trajectory Prediction and Graph Neural Networks

Sep 01, 2024
Via arXiv

Recent Event Camera Innovations: A Survey

Aug 27, 2024
Via arXiv

SynTraC: A Synthetic Dataset for Traffic Signal Control from Traffic Monitoring Cameras

Aug 18, 2024
Via arXiv

Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation

Aug 12, 2024
Via arXiv