Ruihua Song

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

Jan 31, 2024
Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, Yuyue Wang, Xihua Wang, Shinji Watanabe, Ruihua Song

What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning

Nov 02, 2023
Yifan Du, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Jinpeng Wang, Chuyuan Wang, Mingchen Cai, Ruihua Song, Ji-Rong Wen

Parrot: Enhancing Multi-Turn Chat Models by Learning to Ask Questions

Oct 11, 2023
Yuchong Sun, Che Liu, Jinwen Huang, Ruihua Song, Fuzheng Zhang, Di Zhang, Zhongyuan Wang, Kun Gai

ViCo: Engaging Video Comment Generation with Human Preference Rewards

Aug 22, 2023
Yuchong Sun, Bei Liu, Xu Chen, Ruihua Song, Jianlong Fu

Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots

Jun 25, 2023
Jiange Yang, Wenhui Tan, Chuhao Jin, Bei Liu, Jianlong Fu, Ruihua Song, Limin Wang

RecAgent: A Novel Simulation Paradigm for Recommender Systems

Jun 05, 2023
Lei Wang, Jingsen Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Ji-Rong Wen

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

May 30, 2023
Chuhao Jin, Wenhui Tan, Jiange Yang, Bei Liu, Ruihua Song, Limin Wang, Jianlong Fu

ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios

May 20, 2023
Yuyue Wang, Huan Xiao, Yihan Wu, Ruihua Song

TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat

Jan 14, 2023
Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu
