Percy Liang

Shammie

Generative Agent Simulations of 1,000 People

Nov 15, 2024
Via arXiv

Vocal Sandbox: Continual Learning and Adaptation for Situated Human-Robot Collaboration

Nov 04, 2024
Via arXiv

Image2Struct: Benchmarking Structure Extraction for Vision-Language Models

Oct 29, 2024
Via arXiv

Model Equality Testing: Which Model Is This API Serving?

Oct 26, 2024
Via arXiv

VideoAgent: Self-Improving Video Generation

Oct 15, 2024
Via arXiv

Language model developers should report train-test overlap

Oct 10, 2024
Via arXiv

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

Oct 09, 2024
Via arXiv

VHELM: A Holistic Evaluation of Vision Language Models

Oct 09, 2024
Via arXiv

Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective

Oct 07, 2024
Via arXiv

Instruction Following without Instruction Tuning

Sep 21, 2024
Via arXiv