Wei Zhou
Are Large Language Models Good Fact Checkers: A Preliminary Study

Nov 29, 2023
Han Cao, Lingwei Wei, Mengyang Chen, Wei Zhou, Songlin Hu

Recently, Large Language Models (LLMs) have drawn significant attention for their outstanding reasoning capabilities and extensive knowledge, positioning them to handle various natural language processing tasks better than other language models. In this paper, we present a preliminary investigation into the potential of LLMs for fact-checking. This study aims to comprehensively evaluate various LLMs on specific fact-checking subtasks, systematically assessing their capabilities and comparing their performance against pre-trained and state-of-the-art low-parameter models. Experiments demonstrate that LLMs achieve competitive performance compared to other small models in most scenarios. However, they struggle with Chinese fact verification and with the full fact-checking pipeline, due to language inconsistencies and hallucinations. These findings underscore the need for further research to make LLMs proficient, reliable fact-checkers, revealing both their potential capability and the challenges they face in fact-checking tasks.
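Concretely, the claim-verification subtask can be cast as a prompt-and-parse loop around an LLM. The template, label set, and parsing rule below are illustrative assumptions, not the study's actual protocol:

```python
# Illustrative sketch: framing claim verification as an LLM prompt.
# The prompt template and labels are assumptions, not the paper's.

LABELS = ("SUPPORTED", "REFUTED", "NOT ENOUGH INFO")

def build_verification_prompt(claim: str, evidence: list[str]) -> str:
    """Compose a zero-shot claim-verification prompt."""
    evidence_block = "\n".join(f"- {e}" for e in evidence)
    return (
        "Given the evidence, classify the claim as one of "
        f"{', '.join(LABELS)}.\n"
        f"Evidence:\n{evidence_block}\n"
        f"Claim: {claim}\n"
        "Answer:"
    )

def parse_verdict(model_output: str) -> str:
    """Map a free-form LLM response onto the fixed label set."""
    upper = model_output.upper()
    for label in LABELS:
        if label in upper:
            return label
    return "NOT ENOUGH INFO"  # fall back when the response is off-format
```

The fallback label in `parse_verdict` reflects one of the observed failure modes: hallucinated or off-format responses that do not name any label.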


Double-Flow-based Steganography without Embedding for Image-to-Image Hiding

Nov 25, 2023
Bingbing Song, Derui Wang, Tianwei Zhang, Renyang Liu, Yu Lin, Wei Zhou

As an emerging concept, steganography without embedding (SWE) hides a secret message without directly embedding it into a cover. Thus, SWE has the unique advantage of being immune to typical steganalysis methods and can better protect the secret message from being exposed. However, existing SWE methods are generally criticized for their poor payload capacity and low fidelity of recovered secret messages. In this paper, we propose a novel steganography-without-embedding technique, named DF-SWE, which addresses the aforementioned drawbacks and produces diverse and natural stego images. Specifically, DF-SWE employs a reversible circulation of double flow to build a reversible bijective transformation between the secret image and the generated stego image. Hence, it provides a way to generate stego images directly from secret images without a cover image. By leveraging the invertible property, DF-SWE can invert a secret image from a generated stego image in a nearly lossless manner, increasing the fidelity of extracted secret images. To the best of our knowledge, DF-SWE is the first SWE method that can hide large images and multiple images into one image of the same size, significantly enhancing the payload capacity. According to the experimental results, the payload capacity of DF-SWE reaches 24-72 BPP, 8000-16000 times that of its competitors, while producing diverse images that minimize the risk of exposure. Importantly, DF-SWE can be applied to the steganography of secret images in various domains without requiring training data from the corresponding domains. This domain-agnostic property suggests that DF-SWE can 1) be applied to hiding private data and 2) be deployed in resource-limited systems.
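The invertibility DF-SWE relies on can be illustrated with a toy additive coupling layer, the basic building block of flow models: the forward map is bijective by construction, so the generated vector can be inverted back to the secret exactly. The real method applies deep flows to images; this 1-D sketch only shows the principle:

```python
import numpy as np

# Toy sketch of the invertibility behind flow-based SWE: an additive
# coupling layer is bijective by construction, so a "stego" vector can be
# inverted back to the "secret" vector exactly. DF-SWE itself uses deep
# double-flow networks over images; this is only the underlying principle.

def coupling_forward(x: np.ndarray) -> np.ndarray:
    a, b = np.split(x, 2)
    return np.concatenate([a, b + np.tanh(a)])  # shift b by f(a)

def coupling_inverse(y: np.ndarray) -> np.ndarray:
    a, b_shifted = np.split(y, 2)
    return np.concatenate([a, b_shifted - np.tanh(a)])  # subtract same f(a)

secret = np.random.default_rng(0).normal(size=8)
stego = coupling_forward(secret)
recovered = coupling_inverse(stego)
assert np.allclose(secret, recovered)  # exact (near-lossless) recovery
```

Stacking many such layers, with learned networks in place of `tanh`, yields an expressive yet still exactly invertible transformation.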


Can Large Language Models Understand Content and Propagation for Misinformation Detection: An Empirical Study

Nov 21, 2023
Mengyang Chen, Lingwei Wei, Han Cao, Wei Zhou, Songlin Hu

Large Language Models (LLMs) have garnered significant attention for their powerful ability in natural language understanding and reasoning. In this paper, we present a comprehensive empirical study of the performance of LLMs on misinformation detection tasks. This study stands as the pioneering investigation into the understanding capabilities of multiple LLMs regarding both content and propagation across social media platforms. Our empirical studies on five misinformation detection datasets show that LLMs with diverse prompts achieve comparable performance in text-based misinformation detection but exhibit notably constrained capabilities in comprehending propagation structure compared to existing models in propagation-based misinformation detection. We further design four instruction-tuned strategies to enhance LLMs for both content- and propagation-based misinformation detection. These strategies encourage LLMs to actively learn effective features from multiple instances or hard instances, and to eliminate irrelevant propagation structures, thereby achieving better detection performance. Extensive experiments further demonstrate that LLMs better capture content and propagation structure under these strategies and achieve promising detection performance. These findings highlight the potential of LLMs to detect misinformation.
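As a rough sketch of the structure-pruning idea behind one of the strategies above, the snippet below keeps only the branches of a propagation tree that lead to relevant nodes before the tree is serialized into a prompt. The tree representation and relevance criterion are assumptions for illustration:

```python
# Illustrative sketch: pruning irrelevant parts of a propagation structure
# before serializing it for an LLM prompt. The dict-of-children tree format
# and the externally supplied relevance set are assumptions, not the
# paper's actual strategy implementation.

def prune_propagation(tree: dict, relevant: set) -> dict:
    """tree: {node: [children]}; keep nodes on a path to a relevant node."""
    keep = set()

    def visit(node):
        hit = node in relevant
        for child in tree.get(node, []):
            hit |= visit(child)  # keep ancestors of any relevant node
        if hit:
            keep.add(node)
        return hit

    visit("root")
    return {n: [c for c in tree.get(n, []) if c in keep] for n in keep}
```

After pruning, only the retained nodes need to be flattened into the prompt, shrinking the context the LLM must reason over.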


CT-GAT: Cross-Task Generative Adversarial Attack based on Transferability

Nov 05, 2023
Minxuan Lv, Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu


Neural network models are vulnerable to adversarial examples, and adversarial transferability further increases the risk of adversarial attacks. Current methods based on transferability often rely on substitute models, which can be impractical and costly in real-world scenarios because the victim model's training data and structural details are unavailable. In this paper, we propose a novel approach that directly constructs adversarial examples by extracting transferable features across various tasks. Our key insight is that adversarial transferability can extend across different tasks. Specifically, we train a sequence-to-sequence generative model named CT-GAT on adversarial sample data collected from multiple tasks to acquire universal adversarial features and generate adversarial examples for different tasks. We conduct experiments on ten distinct datasets, and the results demonstrate that our method achieves superior attack performance at small cost.

* Accepted to EMNLP 2023 main conference. Corrected the header error in Figure 3.

Enhancing English Writing Proficiency in China's Polytechnic Students: An In-Depth Literature Review on the Application of the Input Hypothesis

Nov 04, 2023
Wei Zhou

Having good English writing skills is extremely important for students in polytechnic institutions. However, many students in technical schools have difficulty reaching high levels of skill. The Input Hypothesis, proposed by Stephen Krashen, suggests that people learn languages well when they receive input that is slightly harder than what they already know but still understandable. This paper studies how the Input Hypothesis can help polytechnic students improve their English writing skills. The study includes real-life observations and experiments from previous research. We examine data from polytechnic students receiving special writing instruction to see whether the Input Hypothesis actually helps improve their writing skills. The paper can better inform polytechnic students, faculty members, support staff, and even members of the larger community about the attributions, processes, and possible outcomes of second language development for polytechnic students.

Keywords: English writing skills, Polytechnic students, Input hypothesis, Comprehensible input

* 12 pages 

InfoEntropy Loss to Mitigate Bias of Learning Difficulties for Generative Language Models

Nov 01, 2023
Zhenpeng Su, Xing Wu, Xue Bai, Zijia Lin, Hui Chen, Guiguang Ding, Wei Zhou, Songlin Hu

Generative language models are usually pretrained on large text corpora by predicting the next token (i.e., sub-word/word/phrase) given the previous ones. Recent works have demonstrated the impressive performance of large generative language models on downstream tasks. However, existing generative language models generally neglect an inherent challenge in text corpora during training, i.e., the imbalance between frequent tokens and infrequent ones. This imbalance can lead a language model to be dominated by common and easy-to-learn tokens, thereby overlooking the infrequent and difficult-to-learn ones. To alleviate this, we propose an Information Entropy Loss (InfoEntropy Loss) function. During training, it dynamically assesses the learning difficulty of a to-be-learned token according to the information entropy of the corresponding predicted probability distribution over the vocabulary. It then scales the training loss adaptively, leading the model to focus more on the difficult-to-learn tokens. On the Pile dataset, we train generative language models at scales of 436M, 1.1B, and 6.7B parameters. Experiments reveal that models incorporating the proposed InfoEntropy Loss gain consistent performance improvements on downstream benchmarks.
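A minimal sketch of such an entropy-weighted objective, assuming a simple multiplicative scaling (the paper's exact formulation may differ): each token's cross-entropy is scaled by the entropy of the model's predicted distribution, so uncertain, difficult-to-learn tokens weigh more.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max(axis=-1, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def info_entropy_loss(logits: np.ndarray, targets: np.ndarray) -> float:
    """Token-level cross-entropy scaled by predictive entropy.

    logits: (seq, vocab); targets: (seq,) integer token ids.
    Assumed multiplicative scaling: tokens whose predicted distribution
    has high entropy (model is unsure) contribute more to the loss.
    """
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)           # (seq,)
    ce = -np.log(p[np.arange(len(targets)), targets] + 1e-12)  # (seq,)
    return float((entropy * ce).mean())
```

On confidently predicted tokens both factors shrink, so the weighted loss collapses toward zero faster than plain cross-entropy, shifting gradient mass to the hard tokens.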


MeaeQ: Mount Model Extraction Attacks with Efficient Queries

Oct 21, 2023
Chengwei Dai, Minxuan Lv, Kun Li, Wei Zhou


We study model extraction attacks in natural language processing (NLP) where attackers aim to steal victim models by repeatedly querying the open Application Programming Interfaces (APIs). Recent works focus on limited-query budget settings and adopt random sampling or active learning-based sampling strategies on publicly available, unannotated data sources. However, these methods often result in selected queries that lack task relevance and data diversity, leading to limited success in achieving satisfactory results with low query costs. In this paper, we propose MeaeQ (Model extraction attack with efficient Queries), a straightforward yet effective method to address these issues. Specifically, we initially utilize a zero-shot sequence inference classifier, combined with API service information, to filter task-relevant data from a public text corpus instead of a problem domain-specific dataset. Furthermore, we employ a clustering-based data reduction technique to obtain representative data as queries for the attack. Extensive experiments conducted on four benchmark datasets demonstrate that MeaeQ achieves higher functional similarity to the victim model than baselines while requiring fewer queries. Our code is available at
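The clustering-based reduction step can be sketched as a small k-means over query embeddings, taking the point nearest each centroid as a representative query. The clustering algorithm and embedding source below are assumptions, not necessarily those used by MeaeQ:

```python
import numpy as np

def select_representative_queries(embeddings: np.ndarray, k: int,
                                  iters: int = 20, seed: int = 0) -> np.ndarray:
    """Pick up to k representative rows: run a small k-means, then return
    the index of the point nearest each centroid. Illustrative sketch only;
    MeaeQ's actual clustering and embedding choices may differ.
    """
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center, then recompute means
        d = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = embeddings[assign == j]
            if len(members):  # leave empty clusters' centers untouched
                centers[j] = members.mean(axis=0)
    d = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
    return np.unique(d.argmin(axis=0))  # nearest-to-centroid point indices
```

The returned indices identify the texts actually sent to the victim API, so the query budget scales with `k` rather than with the size of the filtered corpus.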

* Accepted by EMNLP 2023 main conference 

How Biomimetic Morphing Dorsal Fin Affects the Swimming Performance of a Free-swimming Tuna Robot

Oct 19, 2023
Hongbing Huang, Zhonglu Lin, Wei Zheng, Jinhu Zhang, Wei Zhou, Yu Zhang

It is well known that tuna in the ocean can dynamically morph their median fins to achieve optimal hydrodynamic performance, e.g. linear acceleration and maneuverability. In this study, building on previous work on the median fin's hydrodynamic effects under tethered conditions, we explore the hydrodynamic function of the tuna's morphing dorsal fin in free-swimming conditions, which better approximates real-life situations. Here, we developed a tuna-inspired robotic fish platform that can swim independently in three dimensions, equipped with a biomimetic morphing dorsal fin magnetically attached to the robotic fish. Using this free-swimming platform, we investigated how the erected dorsal fin affects the speed, cost of transport (COT), and yaw angle of the robotic fish at different frequencies and amplitudes. The erected dorsal fin plays a positive role in improving the yaw stability of the robotic fish, but shows little influence on speed and COT in our tests; this remains to be investigated further.
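For reference, cost of transport is conventionally the power spent per unit weight per unit speed; the paper's exact definition (e.g. electrical vs. mechanical power) is not restated here:

```python
def cost_of_transport(power_w: float, mass_kg: float, speed_m_s: float,
                      g: float = 9.81) -> float:
    """Dimensionless COT = P / (m * g * v).

    Conventional definition only; whether the study uses electrical or
    mechanical power, and which speed measure, is an assumption here.
    """
    return power_w / (mass_kg * g * speed_m_s)
```

A lower COT at the same speed means the fin configuration moves the robot more economically, which is why COT is reported alongside raw speed.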

* 10 pages, 5 figures, 2 tables 

Can LSH (Locality-Sensitive Hashing) Be Replaced by Neural Network?

Oct 15, 2023
Renyang Liu, Jun Zhao, Xing Chu, Yu Liang, Wei Zhou, Jing He

With the rapid development of GPU (Graphics Processing Unit) technologies and neural networks, we can explore more appropriate data structures and algorithms. Recent progress shows that neural networks can partly replace traditional data structures. In this paper, we propose a novel DNN (Deep Neural Network)-based learned locality-sensitive hashing, called LLSH, to efficiently and flexibly map high-dimensional data to a low-dimensional space. LLSH replaces traditional LSH (Locality-Sensitive Hashing) function families with parallel multi-layer neural networks, which reduces time and memory consumption while guaranteeing query accuracy. The proposed LLSH demonstrates the feasibility of replacing hash indexes with learning-based neural networks and opens a new door for developers to design and configure data organization more precisely to improve information-searching performance. Extensive experiments on different types of datasets show the superiority of the proposed method in query accuracy, time consumption, and memory usage.
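The core idea, replacing a hash function family with a network that maps high-dimensional vectors to short binary codes, can be sketched as follows. LLSH uses trained multi-layer networks; here random weights stand in for learned ones, which already preserves locality in the manner of classic random-projection LSH:

```python
import numpy as np

# Sketch of the LLSH idea: a single linear layer followed by sign
# binarization maps 128-d vectors to 16-bit codes. Random weights stand in
# for LLSH's trained multi-layer networks, so this behaves like classic
# random-projection LSH: nearby inputs get similar codes.

rng = np.random.default_rng(1)
W = rng.normal(size=(128, 16))  # 128-d input -> 16-bit code

def hash_code(x: np.ndarray) -> np.ndarray:
    return (x @ W > 0).astype(np.uint8)

x = rng.normal(size=128)
near = x + 0.001 * rng.normal(size=128)  # slight perturbation of x
far = rng.normal(size=128)               # unrelated vector
d_near = int((hash_code(x) != hash_code(near)).sum())  # Hamming distance
d_far = int((hash_code(x) != hash_code(far)).sum())
```

Training the projection (and adding layers) is what lets LLSH tune the code distribution to the data rather than relying on random directions.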
