Zhengfu He

Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT

Feb 19, 2024
Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu

Via arXiv

Can AI Assistants Know What They Don't Know?

Jan 28, 2024
Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, Shimin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu

Via arXiv

DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

Nov 30, 2022
Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu

Via arXiv

Multi-Task Pre-Training of Modular Prompt for Few-Shot Learning

Oct 14, 2022
Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang

Via arXiv

BBTv2: Pure Black-Box Optimization Can Be Comparable to Gradient Descent for Few-Shot Learning

May 23, 2022
Tianxiang Sun, Zhengfu He, Hong Qian, Xuanjing Huang, Xipeng Qiu

Via arXiv