Miao Liu

Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

Mar 24, 2026

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

The Effectiveness of Approximate Regularized Replay for Efficient Supervised Fine-Tuning of Large Language Models

Dec 26, 2025

In the Eye of MLLM: Benchmarking Egocentric Video Intent Understanding with Gaze-Guided Prompting

Sep 09, 2025

Q-function Decomposition with Intervention Semantics with Factored Action Spaces

Apr 30, 2025

Learning Predictive Visuomotor Coordination

Mar 30, 2025

A Generalist Hanabi Agent

Mar 17, 2025

ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos

Mar 16, 2025

X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding

Jan 12, 2025

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Jan 08, 2025