Xuanjing Huang

Multi-Programming Language Sandbox for LLMs

Oct 30, 2024
Via arXiv

ElectionSim: Massive Population Election Simulation Powered by Large Language Model Driven Agents

Oct 28, 2024
Via arXiv

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Oct 27, 2024
Via arXiv

AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios

Oct 25, 2024
Via arXiv

Distill Visual Chart Reasoning Ability from LLMs to MLLMs

Oct 24, 2024
Via arXiv

Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs

Oct 20, 2024
Via arXiv

Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs

Oct 15, 2024
Via arXiv

RMB: Comprehensively Benchmarking Reward Models in LLM Alignment

Oct 13, 2024
Via arXiv

AI-Press: A Multi-Agent News Generating and Feedback Simulation System Powered by Large Language Models

Oct 10, 2024
Via arXiv

Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing

Sep 25, 2024
Via arXiv