Jie Zhu

Can Textual Reasoning Improve the Performance of MLLMs on Fine-grained Visual Classification?

Jan 11, 2026
Via arXiv

MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization

Jan 08, 2026
Via arXiv

On the Holistic Approach for Detecting Human Image Forgery

Jan 08, 2026
Via arXiv

Forge-and-Quench: Enhancing Image Generation for Higher Fidelity in Unified Multimodal Models

Jan 08, 2026
Via arXiv

Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

Aug 21, 2025
Via arXiv

A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition

Jul 31, 2025
Via arXiv

Auditing Data Provenance in Real-world Text-to-Image Diffusion Models for Privacy and Copyright Protection

Jun 13, 2025
Via arXiv

Gradient Boosting Decision Tree with LSTM for Investment Prediction

May 29, 2025
Via arXiv

Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning

May 22, 2025
Via arXiv

A Unified and Scalable Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability

May 15, 2025
Via arXiv