Yao Dou

Evaluating LLMs on Chinese Idiom Translation

Aug 14, 2025

Measuring, Modeling, and Helping People Account for Privacy Risks in Online Self-Disclosures with AI

Dec 19, 2024

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Oct 15, 2024

Improving Minimum Bayes Risk Decoding with Multi-Prompt

Jul 22, 2024

GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation

May 21, 2024

Reducing Privacy Risks in Online Self-Disclosures with Language Models

Nov 16, 2023

Automatic and Human-AI Interactive Text Generation

Oct 05, 2023

Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation

Aug 15, 2023

Dancing Between Success and Failure: Edit-level Simplification Evaluation using SALSA

May 23, 2023

LENS: A Learnable Evaluation Metric for Text Simplification

Dec 19, 2022