Hanrong Ye

MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs

Jul 01, 2024
Via arXiv

X-VILA: Cross-Modality Alignment for Large Language Model

May 29, 2024
Via arXiv

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Mar 22, 2024
Via arXiv

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis

Nov 06, 2023
Via arXiv

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts

Jul 28, 2023
Via arXiv

Contrastive Multi-Task Dense Prediction

Jul 16, 2023
Via arXiv

InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding

Jun 08, 2023
Via arXiv

Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation

Apr 06, 2023
Via arXiv

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding

Mar 15, 2022
Via arXiv

Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person Re-Identification

Jun 05, 2020
Via arXiv