Extensive work has been devoted to improving the safety mechanism of Large Language Models (LLMs). However, in specific scenarios, LLMs still generate harmful responses when faced with malicious instructions, a phenomenon referred to as "Jailbreak Attack". In our research, we introduce a novel jailbreak attack method (\textbf{RADIAL}), which consists of two steps: 1) Inherent Response Tendency Analysis: we analyze the inherent affirmation and rejection tendency of LLMs to react to real-world instructions. 2) Real-World Instructions-Driven Jailbreak: based on our analysis, we strategically choose several real-world instructions and embed malicious instructions into them to amplify the LLM's potential to generate harmful responses. On three open-source human-aligned LLMs, our method achieves excellent jailbreak attack performance for both Chinese and English malicious instructions. Besides, we guided detailed ablation experiments and verified the effectiveness of our core idea "Inherent Response Tendency Analysis". Our exploration also exposes the vulnerability of LLMs to being induced into generating more detailed harmful responses in subsequent rounds of dialogue.
Uplift modeling has shown very promising results in online marketing. However, most existing works are prone to the robustness challenge in some practical applications. In this paper, we first present a possible explanation for the above phenomenon. We verify that there is a feature sensitivity problem in online marketing using different real-world datasets, where the perturbation of some key features will seriously affect the performance of the uplift model and even cause the opposite trend. To solve the above problem, we propose a novel robustness-enhanced uplift modeling framework with adversarial feature desensitization (RUAD). Specifically, our RUAD can more effectively alleviate the feature sensitivity of the uplift model through two customized modules, including a feature selection module with joint multi-label modeling to identify a key subset from the input features and an adversarial feature desensitization module using adversarial training and soft interpolation operations to enhance the robustness of the model against this selected subset of features. Finally, we conduct extensive experiments on a public dataset and a real product dataset to verify the effectiveness of our RUAD in online marketing. In addition, we also demonstrate the robustness of our RUAD to the feature sensitivity, as well as the compatibility with different uplift models.
Existing differentiable channel pruning methods often attach scaling factors or masks behind channels to prune filters with less importance, and assume uniform contribution of input samples to filter importance. Specifically, the effects of instance complexity on pruning performance are not yet fully investigated. In this paper, we propose a simple yet effective differentiable network pruning method CAP based on instance complexity-aware filter importance scores. We define instance complexity related weight for each sample by giving higher weights to hard samples, and measure the weighted sum of sample-specific soft masks to model non-uniform contribution of different inputs, which encourages hard samples to dominate the pruning process and the model performance to be well preserved. In addition, we introduce a new regularizer to encourage polarization of the masks, such that a sweet spot can be easily found to identify the filters to be pruned. Performance evaluations on various network architectures and datasets demonstrate CAP has advantages over the state-of-the-arts in pruning large networks. For instance, CAP improves the accuracy of ResNet56 on CIFAR-10 dataset by 0.33% aftering removing 65.64% FLOPs, and prunes 87.75% FLOPs of ResNet50 on ImageNet dataset with only 0.89% Top-1 accuracy loss.
In the current salient object detection network, the most popular method is using U-shape structure. However, the massive number of parameters leads to more consumption of computing and storage resources which are not feasible to deploy on the limited memory device. Some others shallow layer network will not maintain the same accuracy compared with U-shape structure and the deep network structure with more parameters will not converge to a global minimum loss with great speed. To overcome all of these disadvantages, we proposed a new deep convolution network architecture with three contributions: (1) using smaller convolution neural networks (CNNs) to compress the model in our improved salient object features compression and reinforcement extraction module (ISFCREM) to reduce parameters of the model. (2) introducing channel attention mechanism in ISFCREM to weigh different channels for improving the ability of feature representation. (3) applying a new optimizer to accumulate the long-term gradient information during training to adaptively tune the learning rate. The results demonstrate that the proposed method can compress the model to 1/3 of the original size nearly without losing the accuracy and converging faster and more smoothly on six widely used datasets of salient object detection compared with the others models. Our code is published in https://gitee.com/binzhangbinzhangbin/code-a-novel-attention-based-network-for-fast-salient-object-detection.git
Purpose: Segmentation of organs-at-risk (OARs) is a bottleneck in current radiation oncology pipelines and is often time consuming and labor intensive. In this paper, we propose an atlas-based semi-supervised registration algorithm to generate accurate segmentations of OARs for which there are ground truth contours and rough segmentations of all other OARs in the atlas. To the best of our knowledge, this is the first study to use learning-based registration methods for the segmentation of head and neck patients and demonstrate its utility in clinical applications. Methods: Our algorithm cascades rigid and deformable deformation blocks, and takes on an atlas image (M), set of atlas-space segmentations (S_A), and a patient image (F) as inputs, while outputting patient-space segmentations of all OARs defined on the atlas. We train our model on 475 CT images taken from public archives and Stanford RadOnc Clinic (SROC), validate on 5 CT images from SROC, and test our model on 20 CT images from SROC. Results: Our method outperforms current state of the art learning-based registration algorithms and achieves an overall dice score of 0.789 on our test set. Moreover, our method yields a performance comparable to manual segmentation and supervised segmentation, while solving a much more complex registration problem. Whereas supervised segmentation methods only automate the segmentation process for a select few number of OARs, we demonstrate that our methods can achieve similar performance for OARs of interest, while also providing segmentations for every other OAR on the provided atlas. Conclusions: Our proposed algorithm has significant clinical applications and could help reduce the bottleneck for segmentation of head and neck OARs. Further, our results demonstrate that semi-supervised diffeomorphic registration can be accurately applied to both registration and segmentation problems.