Ruoxi Ning

From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens

Oct 02, 2025
Via arXiv

Logical Reasoning in Large Language Models: A Survey

Feb 13, 2025
Via arXiv

NovelQA: A Benchmark for Long-Range Novel Question Answering

Mar 18, 2024
Via arXiv

GLoRE: Evaluating Logical Reasoning of Large Language Models

Oct 13, 2023
Via arXiv

Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4

Apr 20, 2023
Via arXiv