Ming Zhang

Masked Face Recognition under Different Backbones

Jan 23, 2026

Can Deep Research Agents Find and Organize? Evaluating the Synthesis Gap with Expert Taxonomies

Jan 18, 2026

Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control

Jan 08, 2026

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

Jan 04, 2026

OxygenREC: An Instruction-Following Generative Framework for E-commerce Recommendation

Dec 31, 2025

VSA: Visual-Structural Alignment for UI-to-Code

Dec 23, 2025

Modular Layout Synthesis (MLS): Front-end Code via Structure Normalization and Constrained Generation

Dec 22, 2025

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Nov 06, 2025

A Survey on Efficient Large Language Model Training: From Data-centric Perspectives

Oct 29, 2025

Automated Genomic Interpretation via Concept Bottleneck Models for Medical Robotics

Oct 02, 2025