Multilingual Text Classification


Multilingual text classification is the task of assigning text documents written in different languages to predefined categories.
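As a rough illustration of the task defined above, the sketch below trains one classifier on documents in several languages at once. It is not taken from any of the papers listed on this page. It assumes character trigram features, so that no language-specific tokenizer is needed, and a multinomial Naive Bayes model with add-one smoothing. The names `char_ngrams` and `NgramNaiveBayes` and the toy sentiment data are hypothetical and exist only for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Hypothetical multinomial Naive Bayes over character n-grams (illustration only)."""

    def __init__(self, n=3):
        self.n = n
        self.label_counts = Counter()                # documents seen per label
        self.feature_counts = defaultdict(Counter)   # n-gram counts per label
        self.vocab = set()                           # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.label_counts[label] += 1
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # Log prior plus add-one smoothed log likelihood of each n-gram.
            score = math.log(doc_count / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: the same two categories expressed in three languages.
train = [
    ("I love this movie, it is wonderful", "positive"),
    ("This film is terrible and boring", "negative"),
    ("Me encanta esta película, es maravillosa", "positive"),
    ("Esta película es terrible y aburrida", "negative"),
    ("Ich liebe diesen Film, er ist wunderbar", "positive"),
    ("Dieser Film ist schrecklich und langweilig", "negative"),
]
texts, labels = zip(*train)
clf = NgramNaiveBayes(n=3).fit(texts, labels)

# One model, one label set, inputs in any of the training languages.
print(clf.predict("Una película maravillosa"))
```

Six sentences are far too few for meaningful accuracy. The point is only the shape of the problem: a single shared label set and a single model applied to inputs in several languages.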

Reasoning under Ambiguity: Uncertainty-Aware Multilingual Emotion Classification under Partial Supervision

Feb 05, 2026
Via arXiv

Multilingual Extraction and Recognition of Implicit Discourse Relations in Speech and Text

Feb 04, 2026
Via arXiv

MOSLD-Bench: Multilingual Open-Set Learning and Discovery Benchmark for Text Categorization

Jan 19, 2026
Via arXiv

BYOL: Bring Your Own Language Into LLMs

Jan 15, 2026
Via arXiv

VoxCog: Towards End-to-End Multilingual Cognitive Impairment Classification through Dialectal Knowledge

Jan 12, 2026
Via arXiv

Qalb: Largest State-of-the-Art Urdu Large Language Model for 230M Speakers with Systematic Continued Pre-training

Jan 13, 2026
Via arXiv

X-MuTeST: A Multilingual Benchmark for Explainable Hate Speech Detection and A Novel LLM-consulted Explanation Framework

Jan 06, 2026
Via arXiv

Low-Resource, High-Impact: Building Corpora for Inclusive Language Technologies

Dec 16, 2025
Via arXiv

Correcting Mean Bias in Text Embeddings: A Refined Renormalization with Training-Free Improvements on MMTEB

Nov 14, 2025
Via arXiv

Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks

Nov 10, 2025
Via arXiv