Abstract: Large language models have driven significant performance breakthroughs in named entity recognition (NER) for high-resource languages, yet substantial room for improvement remains for low- and medium-resource languages. Existing multilingual NER methods suffer from severe language interference during multi-language adaptation, which manifests as feature conflicts between languages and as competitive suppression of low-resource language features by high-resource languages. Training a dedicated model for each language mitigates this interference, but it does not scale and incurs excessive computational cost in real-world applications. To address this issue, we propose RetrieveAll, a universal multilingual NER framework based on dynamic LoRA. The framework decouples task-specific features across languages and adapts dynamically and efficiently. Furthermore, we introduce a cross-granularity knowledge augmentation method that fully exploits the intrinsic potential of the data without relying on external resources. By using a hierarchical prompting mechanism to guide knowledge injection, this approach advances the paradigm from "prompt-guided inference" to "prompt-driven learning." Experimental results show that RetrieveAll outperforms existing baselines; on the PAN-X dataset, it achieves an average F1 improvement of 12.1%.
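To make the idea of per-language LoRA adaptation concrete, the following is a minimal sketch, not the RetrieveAll implementation. It assumes one low-rank adapter pair (A, B) per language on top of a frozen weight matrix, with the adapter selected by an explicit language key; the class name `DynamicLoRALinear`, the `scale` parameter, and the selection-by-key mechanism are all hypothetical, since the abstract does not describe how adapters are chosen.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class DynamicLoRALinear:
    """Frozen linear layer with swappable low-rank adapters (hypothetical sketch)."""

    def __init__(self, weight, scale=1.0):
        self.weight = weight   # frozen base weight, shape d_out x d_in
        self.scale = scale     # plays the role of alpha / r in LoRA
        self.adapters = {}     # language key -> (A: r x d_in, B: d_out x r)

    def add_adapter(self, lang, A, B):
        self.adapters[lang] = (A, B)

    def forward(self, x, lang=None):
        # Base output: W x
        y = matvec(self.weight, x)
        # Add the low-rank update B (A x) only if an adapter exists for this language.
        if lang in self.adapters:
            A, B = self.adapters[lang]
            delta = matvec(B, matvec(A, x))
            y = [yi + self.scale * di for yi, di in zip(y, delta)]
        return y


layer = DynamicLoRALinear(weight=[[1, 0], [0, 1]])
layer.add_adapter("bo", A=[[1, 0]], B=[[0], [1]])  # rank-1 adapter for one language
base_out = layer.forward([2, 3])                   # no adapter applied
adapted_out = layer.forward([2, 3], lang="bo")     # language-specific update applied
```

Because the base weight is shared and only the small (A, B) pairs differ, adapters for different languages never overwrite each other, which is one plausible way to read the abstract's claim about decoupling features across languages.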
Abstract: Multi-level Tibetan spelling correction addresses errors at both the character and syllable levels within a unified model. Existing methods focus mainly on single-level correction and do not integrate the two levels effectively. Moreover, no open-source datasets or augmentation methods are tailored to this task in Tibetan. To address these gaps, we propose a data augmentation approach that uses unlabeled text to generate multi-level corruptions, and we introduce TiSpell, a semi-masked model capable of correcting both character- and syllable-level errors. Although syllable-level correction is more challenging because it relies on global context, our semi-masked strategy simplifies this process. We synthesize nine types of corruptions on clean sentences to create a robust training set. Experiments on both simulated and real-world data demonstrate that TiSpell, trained on our dataset, outperforms baseline models and matches the performance of state-of-the-art approaches, confirming its effectiveness.
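The corruption-based augmentation described above can be sketched as follows. This is an illustrative sketch only: the abstract does not list the nine corruption types, so the four operations here (`delete_char`, `duplicate_char`, `delete_syllable`, `swap_syllables`) are hypothetical stand-ins, and "character" is approximated as a single Unicode code point. The only Tibetan-specific fact used is that syllables are delimited by the tsheg mark (U+0F0B).

```python
import random

TSHEG = "\u0f0b"  # Tibetan syllable delimiter (tsheg)


def split_syllables(sentence):
    """Split a sentence into non-empty syllables on the tsheg mark."""
    return [s for s in sentence.split(TSHEG) if s]


def delete_char(syllables, rng):
    """Character-level: drop one code point from a random syllable."""
    out = list(syllables)
    i = rng.randrange(len(out))
    if len(out[i]) > 1:
        j = rng.randrange(len(out[i]))
        out[i] = out[i][:j] + out[i][j + 1:]
    return out


def duplicate_char(syllables, rng):
    """Character-level: repeat one code point in a random syllable."""
    out = list(syllables)
    i = rng.randrange(len(out))
    j = rng.randrange(len(out[i]))
    out[i] = out[i][:j] + out[i][j] + out[i][j:]
    return out


def delete_syllable(syllables, rng):
    """Syllable-level: drop one whole syllable."""
    out = list(syllables)
    if len(out) > 1:
        del out[rng.randrange(len(out))]
    return out


def swap_syllables(syllables, rng):
    """Syllable-level: swap two adjacent syllables."""
    out = list(syllables)
    if len(out) > 1:
        i = rng.randrange(len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


CORRUPTIONS = [delete_char, duplicate_char, delete_syllable, swap_syllables]


def make_pair(clean, rng):
    """Return a (corrupted, clean) training pair from one clean sentence."""
    syllables = split_syllables(clean)
    if not syllables:
        return clean, clean
    op = rng.choice(CORRUPTIONS)
    return TSHEG.join(op(syllables, rng)), clean
```

Calling `make_pair(sentence, random.Random(seed))` over a corpus of clean, unlabeled sentences would yield supervised pairs without manual annotation, which is the general pattern the abstract describes.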
Abstract: Large language models (LLMs) have made tremendous progress in recent years, but low-resource languages such as Tibetan remain significantly underrepresented in their evaluation. Although Tibetan is spoken by over seven million people, it has been largely neglected in the development and assessment of LLMs. To address this gap, we present TLUE (A Tibetan Language Understanding Evaluation Benchmark), the first large-scale benchmark for assessing LLMs' capabilities in Tibetan. TLUE comprises two major components: (1) a comprehensive multi-task understanding benchmark spanning 5 domains and 67 subdomains, and (2) a safety benchmark covering 7 subdomains. We evaluate a diverse set of state-of-the-art LLMs. Experimental results demonstrate that most LLMs perform below the random baseline, highlighting the considerable challenges LLMs face in processing Tibetan. TLUE provides an essential foundation for driving future research and progress in Tibetan language understanding and underscores the need for greater inclusivity in LLM development.
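To illustrate what "below the random baseline" means, here is a minimal sketch that assumes the benchmark items are multiple-choice questions with a known number of options each; the abstract does not state the item format, and the function names and toy data below are hypothetical. Under uniform random guessing, the expected accuracy on an item with k options is 1/k.

```python
def accuracy(predictions, answers):
    """Fraction of items where the predicted option matches the gold answer."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def random_baseline(option_counts):
    """Expected accuracy of uniform guessing, given each item's number of options."""
    return sum(1 / k for k in option_counts) / len(option_counts)


# Toy example: four 4-option questions (illustrative data, not from TLUE).
gold = ["A", "C", "C", "A"]
model_predictions = ["B", "C", "D", "D"]

model_acc = accuracy(model_predictions, gold)
baseline = random_baseline([4, 4, 4, 4])
below_baseline = model_acc < baseline
```

A model whose accuracy does not exceed this value shows no measurable evidence of understanding the questions under this metric.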