Yazhe Niu

Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework

May 22, 2025

Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning

Apr 15, 2025

Empowering LLMs in Decision Games through Algorithmic Data Synthesis

Mar 18, 2025

Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM

Mar 10, 2025

Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective

Dec 02, 2024

Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Nov 29, 2024

PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements

Jul 22, 2024

UniZero: Generalized and Efficient Planning with Scalable Latent World Models

Jun 15, 2024

ReZero: Boosting MCTS-based Algorithms by Just-in-Time and Speedy Reanalyze

Apr 28, 2024

A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning

Dec 12, 2023