Title: Joint decoding method for controllable contextual speech recognition based on Speech LLM

Aug 12, 2025

Yangui Fang, Jing Peng, Yu Xi, Xu Li, Haoyu Li, Chengwei Zhang, Guohui Zhong, Kai Yu

Abstract: Contextual speech recognition refers to the ability to favor specific content during recognition based on contextual information. Recently, leveraging the contextual understanding capabilities of Speech LLMs to achieve contextual biasing by injecting contextual information through prompts has emerged as a research hotspot. However, injecting information directly through prompts relies on the model's internal attention mechanism, which makes it impossible to explicitly control the extent of information injection. To address this limitation, we propose a joint decoding method to control the contextual information. This approach enables explicit control over the injected contextual information and achieves superior recognition performance. Additionally, our method can be used for sensitive word suppression. Furthermore, experimental results show that even a Speech LLM that was not pre-trained on long contextual data can acquire long-context capabilities through our method.
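
The abstract does not give the scoring rule behind the joint decoding method. The sketch below shows one plausible reading, assumed here only for illustration: at each decoding step, the next-token log-probabilities from a pass with the context in the prompt and from a pass without it are combined by log-linear interpolation, with an explicit weight. The function names `log_softmax` and `joint_step`, the `weight` parameter, and the toy logits are all hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def joint_step(logits_ctx, logits_plain, weight):
    """Pick the next token from two passes of the same model.

    ASSUMPTION (not from the paper): the score is
        log p_plain + weight * (log p_ctx - log p_plain)
    so weight = 0 ignores the context, weight = 1 reproduces ordinary
    prompt-based biasing, and weight < 0 pushes away from tokens that
    the context favors (a possible form of suppression).
    """
    lp_ctx = log_softmax(logits_ctx)
    lp_plain = log_softmax(logits_plain)
    scores = [p + weight * (c - p) for c, p in zip(lp_ctx, lp_plain)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy three-token vocabulary; the logits are made up for illustration.
vocab = ["kai", "kay", "key"]
logits_plain = [1.0, 2.0, 0.5]  # pass without the context in the prompt
logits_ctx = [2.5, 2.0, 0.5]    # pass with a context list containing "kai"

for w in (0.0, 1.0, -1.0):
    idx, _ = joint_step(logits_ctx, logits_plain, w)
    print(f"weight={w:+.1f} -> {vocab[idx]}")
```

In this toy setup the weight alone decides whether the context-favored token wins, which is the kind of explicit control the abstract contrasts with relying on the model's internal attention. The paper's actual formulation may differ.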
