Jian Hu

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Aug 11, 2025

Uncertainty-quantified Rollout Policy Adaptation for Unlabelled Cross-domain Temporal Grounding

Aug 08, 2025

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

May 30, 2025

ViSMaP: Unsupervised Hour-long Video Summarisation by Meta-Prompting

Apr 22, 2025

LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs

Apr 20, 2025

V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning

Mar 14, 2025

CoS: Chain-of-Shot Prompting for Long Video Understanding

Feb 10, 2025

INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation

Jan 30, 2025

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Jan 04, 2025

Ultra-slender Coaxial Antagonistic Tubular Robot for Ambidextrous Manipulation

Dec 25, 2024