Hao Sun

PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization

Dec 19, 2024

Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes

Dec 18, 2024

Real-time Identity Defenses against Malicious Personalization of Diffusion Models

Dec 13, 2024

LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements

Dec 09, 2024

Constructing optimal treatment length strategies to maximize quality-adjusted lifetimes

Dec 06, 2024

Detailed Object Description with Controllable Dimensions

Nov 28, 2024

StreamAdapter: Efficient Test Time Adaptation from Contextual Streams

Nov 14, 2024

Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation

Nov 10, 2024

Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives

Nov 07, 2024

Token-level Proximal Policy Optimization for Query Generation

Nov 01, 2024