Zihao Wang

Michael Pokorny

Transformers for Complex Query Answering over Knowledge Hypergraphs

Apr 23, 2025
Via arXiv

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Apr 14, 2025
Via arXiv

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Apr 04, 2025
Via arXiv

Generative Evaluation of Complex Reasoning in Large Language Models

Apr 03, 2025
Via arXiv

A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives: Data, Methods, and Challenges

Apr 01, 2025
Via arXiv

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Mar 20, 2025
Via arXiv

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Mar 11, 2025
Via arXiv

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025
Via arXiv

UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering

Feb 26, 2025
Via arXiv

Does Editing Provide Evidence for Localization?

Feb 19, 2025
Via arXiv