Farhan Samir

WikiGap: Promoting Epistemic Equity by Surfacing Knowledge Gaps Between English Wikipedia and other Language Editions

May 30, 2025

ZIPA: A family of efficient models for multilingual phone recognition

May 29, 2025

Is It Bad to Work All the Time? Cross-Cultural Evaluation of Social Norm Biases in GPT-4

May 23, 2025

Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia

Oct 05, 2024

Efficiently Identifying Low-Quality Language Subsets in Multilingual Datasets: A Case Study on a Large-Scale Multilingual Audio Dataset

Oct 05, 2024

Open-vocabulary keyword spotting in any language through multilingual contrastive speech-phoneme pretraining

Nov 14, 2023

Understanding compositional data augmentation in automatic morphological inflection

May 23, 2023

Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Underdocumented Languages

Mar 17, 2022

Quantifying Cognitive Factors in Lexical Decline

Oct 12, 2021