Fangyu Lei

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Jul 15, 2024

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Apr 11, 2024

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

Mar 01, 2024

MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

Feb 20, 2024

Competition-Level Problems are Effective LLM Evaluators

Dec 05, 2023

Assessing Knowledge Editing in Language Models via Relation Perspective

Nov 15, 2023

TableQAKit: A Comprehensive and Practical Toolkit for Table-based Question Answering

Oct 23, 2023

S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models

Oct 23, 2023

MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models

Oct 08, 2023

HRoT: Hybrid prompt strategy and Retrieval of Thought for Table-Text Hybrid Question Answering

Sep 22, 2023