Fan Yang

refer to the report for detailed contributions

InstructEngine: Instruction-driven Text-to-Image Alignment

Apr 14, 2025

Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models

Apr 09, 2025

DSU-Net: An Improved U-Net Model Based on DINOv2 and SAM2 with Multi-scale Cross-model Feature Enhancement

Mar 31, 2025

EdgeInfinite: A Memory-Efficient Infinite-Context Transformer for Edge Devices

Mar 28, 2025

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning

Mar 27, 2025

A Universal Model Combining Differential Equations and Neural Networks for Ball Trajectory Prediction

Mar 25, 2025

MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification

Mar 16, 2025

LLaVA-MLB: Mitigating and Leveraging Attention Bias for Training-Free Video LLMs

Mar 14, 2025

TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs

Mar 13, 2025

Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding

Mar 12, 2025