Yue Yang

Institute for Transport Studies, University of Leeds, Leeds LS2 9JT, UK

Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification

Jul 11, 2024

BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations

Jul 03, 2024

Generative prediction of flow field based on the diffusion model

Jun 30, 2024

PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models

Jun 17, 2024

Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

Jun 13, 2024

A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis

May 23, 2024

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Apr 24, 2024

DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

Mar 31, 2024

Augmented Reality Demonstrations for Scalable Robot Imitation Learning

Mar 20, 2024

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Mar 20, 2024