Yifei Zhou

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Jun 14, 2024

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

May 17, 2024

Autonomous Evaluation and Refinement of Digital Agents

Apr 10, 2024

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

Feb 29, 2024

Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees

Nov 14, 2023

Distribution Normalization: An "Effortless" Test-Time Augmentation for Contrastively Learned Visual-language Models

Feb 22, 2023

$BT^2$: Backward-compatible Training with Basis Transformation

Nov 08, 2022

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

Oct 13, 2022

Evaluating Point Cloud Quality via Transformational Complexity

Oct 10, 2022

GAPX: Generalized Autoregressive Paraphrase-Identification X

Oct 05, 2022