Jeff Schneider

Carnegie Mellon University

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

Sep 02, 2024

Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization

Aug 08, 2024

Soft-QMIX: Integrating Maximum Entropy For Monotonic Value Function Factorization

Jun 20, 2024

Planning with Adaptive World Models for Autonomous Driving

Jun 15, 2024

What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

May 22, 2024

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024

Full Shot Predictions for the DIII-D Tokamak via Deep Recurrent Networks

Apr 18, 2024

Tractable Joint Prediction and Planning over Discrete Behavior Modes for Urban Driving

Mar 12, 2024

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following

Feb 09, 2024

Decentralized Multi-Agent Active Search and Tracking when Targets Outnumber Agents

Jan 09, 2024