Tony Lee

VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model

Feb 15, 2026
Via arXiv

Reliable and Responsible Foundation Models: A Comprehensive Survey

Feb 04, 2026
Via arXiv

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

RoboReward: General-Purpose Vision-Language Reward Models for Robotics

Jan 08, 2026
Via arXiv

MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks

May 26, 2025
Via arXiv

Semantic Retrieval at Walmart

Dec 05, 2024
Via arXiv

Image2Struct: Benchmarking Structure Extraction for Vision-Language Models

Oct 29, 2024
Via arXiv

VHELM: A Holistic Evaluation of Vision Language Models

Oct 09, 2024
Via arXiv

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

Oct 09, 2024
Via arXiv

Relevance Filtering for Embedding-based Retrieval

Aug 09, 2024
Via arXiv