Siyi Zhou

Information Suppression in Large Language Models: Auditing, Quantifying, and Characterizing Censorship in DeepSeek

Jun 14, 2025

Peiran Qiu, Siyi Zhou, Emilio Ferrara

Abstract: This study examines information suppression mechanisms in DeepSeek, an open-source large language model (LLM) developed in China. We propose an auditing framework and use it to analyze the model's responses to 646 politically sensitive prompts by comparing its final output with its intermediate chain-of-thought (CoT) reasoning. Our audit reveals evidence of semantic-level information suppression in DeepSeek: sensitive content often appears within the model's internal reasoning but is omitted or rephrased in the final output. Specifically, DeepSeek suppresses references to transparency, government accountability, and civic mobilization, while occasionally amplifying language aligned with state propaganda. This study underscores the need for systematic auditing of alignment, content moderation, information suppression, and censorship practices implemented in widely adopted AI models, to ensure transparency, accountability, and equitable access to unbiased information obtained through these systems.
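
The comparison between reasoning trace and final answer can be illustrated with a deliberately crude sketch. This is not the paper's framework, which works at the semantic level; it only flags long words that occur in the CoT text but not in the final output. The function names, the six-letter threshold, and the two example strings are all invented for illustration.

```python
import re


def content_terms(text: str) -> set[str]:
    """Lowercased alphabetic tokens of six or more letters (a rough proxy for content words)."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= 6}


def suppressed_terms(cot: str, final: str) -> set[str]:
    """Terms present in the reasoning trace but absent from the final answer."""
    return content_terms(cot) - content_terms(final)


# Made-up strings, not model outputs from the study.
cot = "The protest raised questions about government accountability and transparency."
final = "The event raised questions that are still discussed."
print(sorted(suppressed_terms(cot, final)))
```

A lexical difference like this misses rephrasing, which the abstract names explicitly, so a real audit would need a semantic comparison rather than token matching.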

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Feb 08, 2025

Wei Deng, Siyi Zhou, Jingchen Shu, Jinchao Wang, Lu Wang

Abstract: Recently, large language model (LLM) based text-to-speech (TTS) systems have gradually become mainstream in the industry due to their high naturalness and powerful zero-shot voice cloning capabilities. Here, we introduce the IndexTTS system, which is mainly based on the XTTS and Tortoise models, and add some novel improvements. Specifically, in Chinese scenarios, we adopt a hybrid modeling method that combines characters and pinyin, making the pronunciations of polyphonic characters and long-tail characters controllable. We also perform a comparative analysis of Vector Quantization (VQ) and Finite-Scalar Quantization (FSQ) with respect to codebook utilization for acoustic speech tokens. To further enhance the effect and stability of voice cloning, we introduce a conformer-based speech conditional encoder and replace the speech code decoder with BigVGAN2. Compared with XTTS, IndexTTS achieves significant improvements in naturalness, content consistency, and zero-shot voice cloning. Compared with popular open-source TTS systems such as Fish-Speech, CosyVoice2, FireRedTTS, and F5-TTS, IndexTTS has a relatively simple training process, more controllable usage, and faster inference speed. Moreover, its performance surpasses that of these systems. Our demos are available at https://index-tts.github.io.
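
The hybrid character and pinyin input can be pictured with a small sketch. It assumes one token per character and a hypothetical uppercase pinyin-plus-tone token format such as `HANG2`; the abstract does not specify the actual token scheme, and `hybrid_tokens` is an invented helper, not part of IndexTTS.

```python
def hybrid_tokens(text: str, pinyin_overrides: dict[int, str]) -> list[str]:
    """Return one token per character, using a pinyin token wherever an override is given.

    pinyin_overrides maps a character index to the pinyin token that should
    replace the character at that position (hypothetical format, e.g. "HANG2").
    """
    return [pinyin_overrides.get(i, ch) for i, ch in enumerate(text)]


# 行 is polyphonic (xing2 or hang2); pin it to hang2 for 银行 ("bank").
print(hybrid_tokens("银行", {1: "HANG2"}))
```

Characters without an override pass through unchanged, so pinyin is only needed where the default reading would be wrong or the character is rare.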
