
Da Yin

Violet

OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation

Jul 26, 2024
Via arXiv

Consent in Crisis: The Rapid Decline of the AI Data Commons

Jul 24, 2024
Via arXiv

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Jun 18, 2024
Via arXiv

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Mar 04, 2024
Via arXiv

PST-Bench: Tracing and Benchmarking the Source of Publications

Feb 25, 2024
Via arXiv

Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs

Nov 09, 2023
Via arXiv

Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts

Oct 16, 2023
Via arXiv

The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code

May 30, 2023
Via arXiv

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

May 23, 2023
Via arXiv

KPEval: Towards Fine-grained Semantic-based Evaluation of Keyphrase Extraction and Generation Systems

Mar 27, 2023
Via arXiv