Ruihua Song

Multi-task Manipulation Policy Modeling with Visuomotor Latent Diffusion

Mar 12, 2024
Wenhui Tan, Bei Liu, Junbo Zhang, Ruihua Song, Jianlong Fu

Via arXiv

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

Jan 31, 2024
Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, Yuyue Wang, Xihua Wang, Shinji Watanabe, Ruihua Song

Via arXiv

What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning

Nov 02, 2023
Yifan Du, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Jinpeng Wang, Chuyuan Wang, Mingchen Cai, Ruihua Song, Ji-Rong Wen

Via arXiv

Parrot: Enhancing Multi-Turn Chat Models by Learning to Ask Questions

Oct 11, 2023
Yuchong Sun, Che Liu, Jinwen Huang, Ruihua Song, Fuzheng Zhang, Di Zhang, Zhongyuan Wang, Kun Gai

Via arXiv

ViCo: Engaging Video Comment Generation with Human Preference Rewards

Aug 22, 2023
Yuchong Sun, Bei Liu, Xu Chen, Ruihua Song, Jianlong Fu

Via arXiv

Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots

Jun 25, 2023
Jiange Yang, Wenhui Tan, Chuhao Jin, Bei Liu, Jianlong Fu, Ruihua Song, Limin Wang

Via arXiv

RecAgent: A Novel Simulation Paradigm for Recommender Systems

Jun 05, 2023
Lei Wang, Jingsen Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Ji-Rong Wen

Via arXiv

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

May 30, 2023
Chuhao Jin, Wenhui Tan, Jiange Yang, Bei Liu, Ruihua Song, Limin Wang, Jianlong Fu

Via arXiv

ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios

May 20, 2023
Yuyue Wang, Huan Xiao, Yihan Wu, Ruihua Song

Via arXiv