Michael Qizhe Shieh

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

Apr 06, 2026

Gym-V: A Unified Vision Environment System for Agentic Vision Research

Mar 17, 2026

In-Context Reinforcement Learning for Tool Use in Large Language Models

Mar 09, 2026

ImageEdit-R1: Boosting Multi-Agent Image Editing via Reinforcement Learning

Mar 09, 2026

LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards

Mar 02, 2026

Gradually Compacting Large Language Models for Reasoning Like a Boiling Frog

Feb 04, 2026

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Oct 30, 2025

The Emergence of Abstract Thought in Large Language Models Beyond Any Language

Jun 11, 2025

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Apr 17, 2025

Efficient Process Reward Model Training via Active Learning

Apr 14, 2025