
Ying Shen

UPN

LaTtE-Flow: Layerwise Timestep-Expert Flow-based Transformer

Jun 08, 2025

R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation

May 29, 2025

DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?

May 22, 2025

LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates

Mar 20, 2025

MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions

Mar 12, 2025

A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models

Feb 22, 2025

Refine Knowledge of Large Language Models via Adaptive Contrastive Learning

Feb 11, 2025

Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data

Feb 05, 2025

HumanVBench: Exploring Human-Centric Video Understanding Capabilities of MLLMs with Synthetic Benchmark Data

Dec 23, 2024

Intent-driven In-context Learning for Few-shot Dialogue State Tracking

Dec 04, 2024