Zhongju Wang

3DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking Avatars

Feb 12, 2026

Zhongju Wang, Zhenhong Sun, Beier Wang, Yifu Wang, Daoyi Dong, Huadong Mo, Hongdong Li

Abstract: Audio-driven 3D talking avatar generation is increasingly important in virtual communication, digital humans, and interactive media, where avatars must preserve identity, synchronize lip motion with speech, express emotion, and exhibit lifelike spatial dynamics, which together define a broader objective of expressivity. However, achieving this remains challenging due to insufficient training data with limited subject identities, narrow audio representations, and restricted explicit controllability. In this paper, we propose 3DXTalker, which generates expressive 3D talking avatars through data-curated identity modeling, audio-rich representations, and spatial dynamics controllability. 3DXTalker enables scalable identity modeling via a 2D-to-3D data curation pipeline and disentangled representations, alleviating data scarcity and improving identity generalization. We then introduce frame-wise amplitude and emotional cues beyond standard speech embeddings to improve lip synchronization and enable nuanced expression modulation. These cues are unified by a flow-matching-based transformer for coherent facial dynamics. Moreover, 3DXTalker enables natural head-pose motion generation while supporting stylized control via prompt-based conditioning. Extensive experiments show that 3DXTalker integrates lip synchronization, emotional expression, and head-pose dynamics within a unified framework and achieves superior performance in 3D talking avatar generation.

BERT-based Chinese Text Classification for Emergency Domain with a Novel Loss Function

Apr 09, 2021

Zhongju Wang, Long Wang, Chao Huang, Xiong Luo

Abstract: This paper proposes an automatic Chinese text categorization method for solving the emergency event report classification problem. Since bidirectional encoder representations from transformers (BERT) has achieved great success in the natural language processing domain, it is employed to derive emergency text features in this study. To overcome the data imbalance problem in the distribution of emergency event categories, a novel loss function is proposed to improve the performance of the BERT-based model. Meanwhile, to avoid the impact of extreme learning rates, the AdaBound optimization algorithm, which achieves a gradual, smooth transition from Adam to SGD, is employed to learn the parameters of the model. To verify the feasibility and effectiveness of the proposed method, a Chinese emergency text dataset collected from the Internet is employed. Compared with benchmarking methods, the proposed method achieves the best performance in terms of accuracy, weighted-precision, weighted-recall, and weighted-F1 values. Therefore, it is promising to employ the proposed method for real applications in smart emergency management systems.

* 16 pages, 6 figures
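
The abstract above mentions that AdaBound moves gradually from Adam to SGD. A minimal sketch of that idea, assuming the bound schedule from the original AdaBound formulation, is shown here: the per-parameter Adam step size is clipped into an interval that starts very wide and narrows toward a single final learning rate. The function names and the default values `final_lr=0.1` and `gamma=1e-3` are illustrative assumptions and are not taken from this paper.

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Return (lower, upper) bounds on the step size at a given step (step >= 1).

    Early on the interval is very wide, so the update behaves like Adam.
    As step grows, both bounds converge to final_lr, so the update
    behaves like SGD with a fixed learning rate.
    """
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return lower, upper


def clipped_step_size(adam_step_size, step, final_lr=0.1, gamma=1e-3):
    """Clip an Adam-style per-parameter step size into the current bounds."""
    lower, upper = adabound_bounds(step, final_lr, gamma)
    return min(max(adam_step_size, lower), upper)


if __name__ == "__main__":
    # Print how the interval narrows as training proceeds.
    for t in (1, 1_000, 1_000_000):
        print(t, adabound_bounds(t))
```

With these assumed defaults, an extreme step size such as 5.0 passes through unchanged at the first step but is pulled close to `final_lr` late in training, which is the sense in which extreme learning rates are avoided.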
