Peng Li

DJI Innovations Inc

MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models

Mar 29, 2024
Via arXiv

Random-coupled Neural Network

Mar 26, 2024
Via arXiv

ReAct Meets ActRe: Autonomous Annotation of Agent Trajectories for Contrastive Self-Training

Mar 25, 2024
Via arXiv

StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models

Mar 13, 2024
Via arXiv

ToolRerank: Adaptive and Hierarchy-Aware Reranking for Tool Retrieval

Mar 11, 2024
Via arXiv

Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models

Feb 27, 2024
Via arXiv

Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

Feb 27, 2024
Via arXiv

Budget-Constrained Tool Learning with Planning

Feb 25, 2024
Via arXiv

DEEM: Dynamic Experienced Expert Modeling for Stance Detection

Feb 23, 2024
Via arXiv

CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models

Feb 21, 2024
Via arXiv