Jitao Sang

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

Oct 14, 2025

ReInAgent: A Context-Aware GUI Agent Enabling Human-in-the-Loop Mobile Task Navigation

Oct 09, 2025

Efficient Video-to-Audio Generation via Multiple Foundation Models Mapper

Sep 05, 2025

NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models

Jun 15, 2025

Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation

May 21, 2025

Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective

Mar 14, 2025

Debiased Prompt Tuning in Vision-Language Model without Annotations

Mar 11, 2025

Agent models: Internalizing Chain-of-Action Generation into Reasoning models

Mar 09, 2025

Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration

Feb 25, 2025

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Dec 22, 2024