Yiming Yang

Kuaishou Technology

A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest

Nov 17, 2023

AutoMix: Automatically Mixing Language Models

Oct 19, 2023

SALMON: Self-Alignment with Principle-Following Reward Models

Oct 09, 2023

Functional Interpolation for Relative Positions Improves Long Context Transformers

Oct 06, 2023

Aligning Large Multimodal Models with Factually Augmented RLHF

Sep 25, 2023

Accelerating Diffusion-based Combinatorial Optimization Solvers by Progressive Distillation

Aug 22, 2023

Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation

Aug 07, 2023

Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs

Jul 22, 2023

PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification

May 24, 2023

Policy Representation via Diffusion Probability Model for Reinforcement Learning

May 22, 2023