Jinxiang Xia

EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies

Feb 11, 2026
Via arXiv

Structural Entropy Guided Unsupervised Graph Out-Of-Distribution Detection

Mar 05, 2025
Via arXiv

FullStack Bench: Evaluating LLMs as Full Stack Coders

Dec 03, 2024
Via arXiv