Zhipeng Yu

A Small and Fast BERT for Chinese Medical Punctuation Restoration

Aug 24, 2023
Tongtao Ling, Chen Liao, Zhipeng Yu, Lei Chen, Shilei Huang, Yi Liu

In clinical dictation, the output of automatic speech recognition (ASR) lacks explicit punctuation marks, which can lead to misunderstanding of dictated reports. Producing a precise and understandable clinical report from ASR therefore requires automatic punctuation restoration. With a practical deployment scenario in mind, we propose a fast and light pre-trained model for Chinese medical punctuation restoration based on the 'pretraining and fine-tuning' paradigm. In this work, we distill pre-trained models by incorporating supervised contrastive learning and a novel auxiliary pre-training task (Punctuation Mark Prediction) to make them well-suited for punctuation restoration. Our experiments on various distilled models reveal that our model can achieve 95% of the performance of the state-of-the-art Chinese RoBERTa at 10% of its model size.

* 5 pages, 2 figures 
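
Punctuation restoration, as described in the abstract above, is often framed as per-character sequence labeling: each character is tagged with the punctuation mark that should follow it, or with "no punctuation". The sketch below illustrates only that data framing, not the authors' model or training procedure. The label names, the `PUNCT_LABELS` table, and both helper functions are hypothetical and chosen for illustration.

```python
# Hypothetical label inventory; the paper's actual label set is not stated here.
PUNCT_LABELS = {",": "COMMA", "。": "PERIOD", "?": "QUESTION", "、": "ENUM_COMMA"}


def to_labeled_sequence(text):
    """Turn punctuated text into (character, label) pairs.

    Each non-punctuation character gets the label of the punctuation mark
    that immediately follows it, or "O" if none does.
    """
    pairs = []
    for ch in text:
        if ch in PUNCT_LABELS:
            if pairs:
                # Attach the mark to the preceding character.
                # Leading punctuation (no preceding character) is dropped.
                pairs[-1] = (pairs[-1][0], PUNCT_LABELS[ch])
        else:
            pairs.append((ch, "O"))
    return pairs


def restore(chars, labels):
    """Rebuild punctuated text from characters and their predicted labels."""
    inverse = {name: mark for mark, name in PUNCT_LABELS.items()}
    out = []
    for ch, label in zip(chars, labels):
        out.append(ch)
        if label != "O":
            out.append(inverse[label])
    return "".join(out)


if __name__ == "__main__":
    sample = "患者发热,咳嗽三天。"
    pairs = to_labeled_sequence(sample)
    chars = [c for c, _ in pairs]
    labels = [l for _, l in pairs]
    print(labels)
    print(restore(chars, labels))
```

In this framing, a model only has to predict one label per input character from unpunctuated ASR text, and `restore` turns those predictions back into readable text.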

Towards Prompt-robust Face Privacy Protection via Adversarial Decoupling Augmentation Framework

May 06, 2023
Ruijia Wu, Yuhang Wang, Huafeng Shi, Zhipeng Yu, Yichao Wu, Ding Liang

Denoising diffusion models have shown remarkable potential in various generation tasks. The open-source large-scale text-to-image model, Stable Diffusion, has become prevalent because it can generate realistic artistic or facial images with personalization through fine-tuning on a limited number of new samples. However, this has raised privacy concerns, as adversaries can acquire facial images online and fine-tune text-to-image models for malicious editing, leading to baseless scandals, defamation, and disruption to victims' lives. Prior research efforts have focused on deriving adversarial loss from conventional training processes for facial privacy protection through adversarial perturbations. However, existing algorithms face two issues: 1) they neglect the image-text fusion module, which is the vital module of text-to-image diffusion models, and 2) their defensive performance is unstable against different attacker prompts. In this paper, we propose the Adversarial Decoupling Augmentation Framework (ADAF), which addresses these issues by targeting the image-text fusion module to enhance the defensive performance of facial privacy protection algorithms. ADAF introduces multi-level text-related augmentations for defense stability against various attacker prompts. Concretely, considering the vision, text, and common unit space, we propose Vision-Adversarial Loss, Prompt-Robust Augmentation, and Attention-Decoupling Loss. Extensive experiments on CelebA-HQ and VGGFace2 demonstrate ADAF's promising performance, surpassing existing algorithms.

* 8 pages, 6 figures 

HMDO: Markerless Multi-view Hand Manipulation Capture with Deformable Objects

Jan 18, 2023
Wei Xie, Zhipeng Yu, Zimeng Zhao, Binghui Zuo, Yangang Wang

We construct the first markerless deformable interaction dataset recording interactive motions of hands and deformable objects, called HMDO (Hand Manipulation with Deformable Objects). Using a multi-view capture system that we built, the dataset captures deformable interactions from multiple perspectives, with various object shapes and diverse interactive forms. Our motivation is the current lack of hand and deformable object interaction datasets, as 3D hand and deformable object reconstruction is challenging. Mainly because of mutual occlusion, the interaction area is difficult to observe, the visual features of the hand and the object are entangled, and the deformation of the interaction area is difficult to reconstruct. To tackle this challenge, we propose a method to annotate our captured data. Our key idea is to use estimated hand features to guide the object's global pose estimation, and then to optimize the deformation process of the object by analyzing the relationship between the hand and the object. A comprehensive evaluation shows that the proposed method can reconstruct interactive motions of hands and deformable objects with high quality. HMDO currently consists of 21600 frames over 12 sequences. In the future, this dataset could boost research on learning-based reconstruction of deformable interaction scenes.

Speckle-based optical cryptosystem and its application for human face recognition via deep learning

Jan 26, 2022
Qi Zhao, Huanhao Li, Zhipeng Yu, Chi Man Woo, Tianting Zhong, Shengfu Cheng, Yuanjin Zheng, Honglin Liu, Jie Tian, Puxiang Lai

Face recognition has recently become ubiquitous in many scenarios for authentication or security purposes. Meanwhile, there are increasing concerns about the privacy of face images, which are sensitive biometric data that should be carefully protected. Software-based cryptosystems are widely adopted nowadays to encrypt face images, but their security level is limited by insufficient digital secret key length or computing power. Hardware-based optical cryptosystems can generate far longer secret keys and enable encryption at light speed, but most reported optical methods, such as double random phase encryption, are less compatible with other systems due to system complexity. In this study, a plain yet highly efficient speckle-based optical cryptosystem is proposed and implemented. A scattering ground glass is exploited to generate physical secret keys of gigabit length and to encrypt face images via seemingly random optical speckles at light speed. Face images can then be decrypted from the random speckles by a well-trained decryption neural network, such that face recognition can be realized with up to 98% accuracy. The proposed cryptosystem has wide applicability, and it may open a new avenue for high-security complex information encryption and decryption by utilizing optical speckles.
