Jun Wang

IBM T. J. Watson Research Center

ALISE: Accelerating Large Language Model Serving with Speculative Scheduling

Oct 31, 2024
Via arXiv

UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function

Oct 28, 2024
Via arXiv

Lightweight Neural App Control

Oct 23, 2024
Via arXiv

Elucidating the design space of language models for image generation

Oct 21, 2024
Via arXiv

SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

Oct 19, 2024
Via arXiv

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

Oct 18, 2024
Via arXiv

FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection

Oct 15, 2024
Via arXiv

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Oct 12, 2024
Via arXiv

Efficient Reinforcement Learning with Large Language Model Priors

Oct 10, 2024
Via arXiv

Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization

Oct 08, 2024
Via arXiv