Wenjing Yang

Detecting Unobserved Confounders: A Kernelized Regression Approach

Jan 01, 2026

BeyondFacial: Identity-Preserving Personalized Generation Beyond Facial Close-ups

Nov 15, 2025

Breaking the Gradient Barrier: Unveiling Large Language Models for Strategic Classification

Nov 10, 2025

ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs

Jun 12, 2025

OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data

May 29, 2025

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

May 27, 2025

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

May 27, 2025

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Apr 14, 2025

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

Mar 31, 2025

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Mar 26, 2025