Jiayuan Ma

Scaling Reinforcement Learning for Content Moderation with Large Language Models

Dec 23, 2025
Via arXiv

Lost in Pronunciation: Detecting Chinese Offensive Language Disguised by Phonetic Cloaking Replacement

Jul 10, 2025
Via arXiv

Detecting Conversational Mental Manipulation with Intent-Aware Prompting

Dec 11, 2024
Via arXiv

Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms

Nov 11, 2022
Via arXiv