Di Zhang

EVLM: An Efficient Vision-Language Model for Visual Understanding

Jul 19, 2024

PlacidDreamer: Advancing Harmony in Text-to-3D Generation

Jul 19, 2024

LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

Jul 03, 2024

Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model

Jun 28, 2024

Long-Term Prediction Accuracy Improvement of Data-Driven Medium-Range Global Weather Forecast

Jun 26, 2024

Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector

Jun 17, 2024

Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B

Jun 11, 2024

VideoTetris: Towards Compositional Text-to-Video Generation

Jun 06, 2024

Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs

May 24, 2024

Learning Multi-dimensional Human Preference for Text-to-Image Generation

May 23, 2024