Haiyue Song

OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

Mar 30, 2026
Via arXiv

Is Human Annotation Necessary? Iterative MBR Distillation for Error Span Detection in Machine Translation

Mar 16, 2026
Via arXiv

Minimum Bayes Risk Decoding for Error Span Detection in Reference-Free Automatic Machine Translation Evaluation

Dec 19, 2025
Via arXiv

PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation

Dec 15, 2025
Via arXiv

CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation

May 30, 2025
Via arXiv

IteRABRe: Iterative Recovery-Aided Block Reduction

Mar 08, 2025
Via arXiv

Pralekha: An Indic Document Alignment Evaluation Benchmark

Nov 28, 2024
Via arXiv

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

Jun 10, 2024
Via arXiv

Enhancing Personality Recognition in Dialogue by Data Augmentation and Heterogeneous Conversational Graph Networks

Jan 11, 2024
Via arXiv

Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts

Nov 07, 2023
Via arXiv