Jiayang Cheng

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

Oct 08, 2025
Via arXiv

Top Ten Challenges Towards Agentic Neural Graph Databases

Jan 24, 2025
Via arXiv

ActPlan-1K: Benchmarking the Procedural Planning Ability of Visual Language Models in Household Activities

Oct 04, 2024
Via arXiv

Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory

Aug 19, 2024
Via arXiv

RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation

Aug 15, 2024
Via arXiv

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

Jun 25, 2024
Via arXiv

CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning

Jan 14, 2024
Via arXiv

Self-Consistent Narrative Prompts on Abductive Natural Language Inference

Sep 15, 2023
Via arXiv

ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations

May 11, 2023
Via arXiv

DiscoPrompt: Path Prediction Prompt Tuning for Implicit Discourse Relation Recognition

May 06, 2023
Via arXiv