Zhiyang Teng

Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis

May 25, 2023
Xuming Hu, Zhijiang Guo, Zhiyang Teng, Irwin King, Philip S. Yu

Multimodal relation extraction (MRE) is the task of identifying the semantic relationship between two entities based on the context of a sentence-image pair. Existing retrieval-augmented approaches have mainly focused on modeling retrieved textual knowledge, which may not suffice to accurately identify complex relations. To improve prediction, this research proposes retrieving both textual and visual evidence based on the object, the sentence, and the whole image. We further develop a novel approach to synthesize object-level, image-level, and sentence-level information for better reasoning within and across modalities. Extensive experiments and analyses show that the proposed method effectively selects and compares evidence across modalities and significantly outperforms state-of-the-art models.
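
As a rough illustration of the cross-modal retrieval idea (not the authors' model), the sketch below ranks candidate textual and visual evidence against a query sentence using off-the-shelf CLIP embeddings and cosine similarity; the query, candidate texts, and stand-in images are placeholders.

```python
# Rough cross-modal retrieval sketch (not the paper's model): rank candidate
# evidence texts and images against a query sentence via CLIP embeddings.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

query = "Steve Jobs presents the iPhone at Macworld."     # sentence from the input pair
candidate_texts = ["Apple CEO unveils a new phone.", "A dog plays in the park."]
# Solid-color stand-ins for retrieved evidence images (placeholders only).
candidate_images = [Image.new("RGB", (224, 224), c) for c in ("red", "blue")]

with torch.no_grad():
    q = model.get_text_features(**processor(text=[query], return_tensors="pt", padding=True))
    t = model.get_text_features(**processor(text=candidate_texts, return_tensors="pt", padding=True))
    v = model.get_image_features(**processor(images=candidate_images, return_tensors="pt"))

q = q / q.norm(dim=-1, keepdim=True)
for name, feats in (("textual evidence", t), ("visual evidence", v)):
    feats = feats / feats.norm(dim=-1, keepdim=True)
    scores = (q @ feats.T).squeeze(0)                      # cosine similarities
    print(name, "ranking:", scores.argsort(descending=True).tolist())
```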

* Accepted to ACL 2023 

LogicLLM: Exploring Self-supervised Logic-enhanced Training for Large Language Models

May 24, 2023
Fangkai Jiao, Zhiyang Teng, Shafiq Joty, Bosheng Ding, Aixin Sun, Zhengyuan Liu, Nancy F. Chen

Existing efforts to improve the logical reasoning ability of language models have predominantly relied on supervised fine-tuning, which hinders generalization to new domains and/or tasks. The development of Large Language Models (LLMs) has demonstrated the capacity to compress abundant knowledge into a single proxy, enabling them to tackle multiple tasks effectively. Our preliminary experiments, nevertheless, show that LLMs exhibit limited capability in logical reasoning: their performance on logical reasoning benchmarks lags far behind existing state-of-the-art baselines. In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training and activating it via in-context learning, an approach we term LogicLLM. Specifically, we devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter sizes ranging from 3 billion to 13 billion. The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM. In addition, we conduct extensive ablation studies to analyze the key factors in designing logic-oriented proxy tasks.

* 11 pages 

Non-Autoregressive Document-Level Machine Translation (NA-DMT): Exploring Effective Approaches, Challenges, and Opportunities

May 22, 2023
Guangsheng Bao, Zhiyang Teng, Yue Zhang

Non-autoregressive translation (NAT) models have been extensively investigated for sentence-level machine translation (MT), demonstrating comparable quality and superior translation speed compared with autoregressive translation (AT) models. However, the multi-modality and alignment issues of NAT models become more prominent as input and output lengths increase, leading to unexpected complications in document-level MT. In this paper, we conduct a comprehensive examination of typical NAT models in the context of document-level MT tasks. Experiments reveal that, although NAT models significantly accelerate text generation on documents, they do not perform as effectively on documents as they do on sentences. To bridge this performance gap, we introduce a novel design that underscores the importance of sentence-level alignment for non-autoregressive document-level machine translation (NA-DMT). This design substantially reduces the performance discrepancy, but NA-DMT models remain far from perfect and require further research to fully optimize their performance. We discuss the related opportunities and challenges and provide our code at https://github.com/baoguangsheng/nat-on-doc to stimulate further research in this field.

* 6 pages, 6 tables 

LogiCoT: Logical Chain-of-Thought Instruction-Tuning Data Collection with GPT-4

May 20, 2023
Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, Yue Zhang

Generative Pre-trained Transformer 4 (GPT-4) demonstrates impressive chain-of-thought reasoning ability. Recent work on self-instruction tuning, such as Alpaca, has focused on enhancing the general proficiency of models. These instructions enable models to achieve performance comparable to GPT-3.5 on general tasks such as open-domain text generation and paraphrasing, but they fall short of helping models handle complex reasoning tasks. To bridge this gap, this paper presents LogiCoT, a new instruction-tuning dataset for logical chain-of-thought reasoning with GPT-4. We elaborate on the process of harvesting instructions for prompting GPT-4 to generate chain-of-thought rationales. LogiCoT serves as an instruction set for teaching models logical reasoning and eliciting general reasoning skills.
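
The sketch below illustrates, under stated assumptions, how chain-of-thought rationales from a teacher model could be packaged into instruction-tuning records; `ask_teacher` is a hypothetical stub rather than an actual GPT-4 client, and the record layout is illustrative, not LogiCoT's released schema.

```python
# Hypothetical sketch of harvesting chain-of-thought instruction data from a
# teacher model; `ask_teacher` is a stub, not an actual GPT-4 client, and the
# record layout is illustrative rather than LogiCoT's released schema.
import json

def ask_teacher(prompt: str) -> str:
    """Stand-in for a GPT-4 API call that would return a generated rationale."""
    return "Step 1: ... Step 2: ... Therefore the conclusion follows."

def build_record(question: str, answer: str) -> dict:
    instruction = ("Solve the following logical reasoning problem step by step, "
                   f"then state the final answer.\n\nProblem: {question}")
    rationale = ask_teacher(instruction)
    # Instruction-tuning triple: instruction, optional input, target output.
    return {"instruction": instruction, "input": "",
            "output": f"{rationale}\nAnswer: {answer}"}

seed = [{"question": "If it rains, the ground gets wet. It rained. Did the ground get wet?",
         "answer": "Yes"}]
records = [build_record(ex["question"], ex["answer"]) for ex in seed]
print(json.dumps(records, indent=2))
```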

Target-Side Augmentation for Document-Level Machine Translation

May 08, 2023
Guangsheng Bao, Zhiyang Teng, Yue Zhang

Document-level machine translation faces the challenge of data sparsity due to its long inputs and the small amount of training data, which increases the risk of learning spurious patterns. To address this challenge, we propose a target-side augmentation method that introduces a data augmentation (DA) model to generate many potential translations for each source document. By learning from this wider range of translations, an MT model can fit a smoothed distribution and thereby reduce the risk of data sparsity. We demonstrate that the DA model, which estimates the posterior distribution, substantially improves MT performance, outperforming the previous best system by 2.30 s-BLEU on News and achieving new state-of-the-art results on the News and Europarl benchmarks. Our code is available at https://github.com/baoguangsheng/target-side-augmentation.
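
For intuition, here is a generic target-side augmentation sketch that samples several alternative translations per source with an off-the-shelf MT model via stochastic decoding; it is not the paper's DA model, which estimates the posterior distribution, but it shows how multiple potential targets per source can be generated.

```python
# Generic target-side augmentation sketch (not the paper's DA model): sample
# several alternative translations per source with an off-the-shelf MT model,
# then pair each sample with its source as additional training data.
import torch
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"
tok = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

source = "The committee approved the proposal after a long debate."
inputs = tok([source], return_tensors="pt")

with torch.no_grad():
    samples = model.generate(
        **inputs,
        do_sample=True,           # stochastic decoding yields diverse targets
        top_p=0.9,
        num_return_sequences=4,   # several potential translations per source
        max_new_tokens=64,
    )

augmented_pairs = [(source, tok.decode(s, skip_special_tokens=True)) for s in samples]
for src, tgt in augmented_pairs:
    print(src, "=>", tgt)
```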

* Accepted by ACL2023 main conference 

Token-level Fitting Issues of Seq2seq Models

May 08, 2023
Guangsheng Bao, Zhiyang Teng, Yue Zhang

Sequence-to-sequence (seq2seq) models have been widely used for natural language processing, computer vision, and other deep learning tasks. We find that seq2seq models trained with early stopping suffer from fitting issues at the token level: when training is stopped, some tokens in the vocabulary are overfitted while others are underfitted. Experiments show that these phenomena are pervasive across different models, even in fine-tuned large pre-trained models. We identify three major factors that influence token-level fitting: token frequency, part-of-speech, and prediction discrepancy. Further, we find that external factors such as language, model size, domain, data scale, and pretraining also influence the fitting of tokens.
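
A minimal diagnostic in this spirit (assumed setup, not the paper's code) is to compare per-token mean loss on training versus validation data: a large train-validation gap suggests overfitting for that token, while high loss on both suggests underfitting.

```python
# Diagnostic sketch: compare per-token mean loss on training vs. validation
# data to spot tokens that overfit or underfit under early stopping.
from collections import defaultdict

def per_token_mean(records):
    """records: iterable of (token, loss) pairs collected during evaluation."""
    totals, counts = defaultdict(float), defaultdict(int)
    for token, loss in records:
        totals[token] += loss
        counts[token] += 1
    return {t: totals[t] / counts[t] for t in totals}, dict(counts)

def fitting_report(train_records, valid_records):
    train_loss, train_freq = per_token_mean(train_records)
    valid_loss, _ = per_token_mean(valid_records)
    report = []
    for token in train_loss.keys() & valid_loss.keys():
        gap = valid_loss[token] - train_loss[token]   # large gap -> overfitting
        report.append((token, train_freq[token], train_loss[token], valid_loss[token], gap))
    # Sorted by gap: the top entries overfit; tokens with high loss on both sets underfit.
    return sorted(report, key=lambda r: r[-1], reverse=True)

# Toy usage with made-up per-token losses:
train = [("the", 0.1), ("the", 0.2), ("serendipity", 0.05), ("cat", 0.4)]
valid = [("the", 0.15), ("serendipity", 2.3), ("cat", 0.5)]
for token, freq, tr, va, gap in fitting_report(train, valid):
    print(f"{token:>12}  freq={freq}  train={tr:.2f}  valid={va:.2f}  gap={gap:+.2f}")
```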

* 12 pages, 17 figures, and 7 tables 

Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4

Apr 20, 2023
Hanmeng Liu, Ruoxi Ning, Zhiyang Teng, Jian Liu, Qiji Zhou, Yue Zhang

Harnessing logical reasoning ability is a comprehensive natural language understanding endeavor. With the release of Generative Pretrained Transformer 4 (GPT-4), highlighted as "advanced" at reasoning tasks, we are eager to learn how GPT-4 performs on various logical reasoning tasks. This report analyses multiple logical reasoning datasets, including popular benchmarks such as LogiQA and ReClor as well as newly released datasets such as AR-LSAT. We test multiple-choice reading comprehension and natural language inference tasks with benchmarks requiring logical reasoning. We further construct a logical reasoning out-of-distribution dataset to investigate the robustness of ChatGPT and GPT-4, and compare the performance of the two models. Experiment results show that ChatGPT performs significantly better than the RoBERTa fine-tuning method on most logical reasoning benchmarks. With early access to the GPT-4 API, we are able to conduct intensive experiments on the GPT-4 model. The results show that GPT-4 yields even higher performance on most logical reasoning datasets. Both ChatGPT and GPT-4 do relatively well on well-known datasets such as LogiQA and ReClor, but their performance drops significantly on newly released and out-of-distribution datasets. Logical reasoning remains challenging for ChatGPT and GPT-4, especially on out-of-distribution and natural language inference datasets. We release the prompt-style logical reasoning datasets as a benchmark suite named LogiEval.
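
A minimal prompt-style evaluation loop in this spirit might look as follows; the item format and the `ask_model` stub are assumptions for illustration, not the released LogiEval suite.

```python
# Prompt-style evaluation sketch: turn a multiple-choice logical reasoning item
# into a prompt, query a model, and score the predicted letter against the gold answer.
import re

def build_prompt(context: str, question: str, options: list[str]) -> str:
    letters = "ABCD"
    opts = "\n".join(f"{letters[i]}. {o}" for i, o in enumerate(options))
    return (f"Passage: {context}\nQuestion: {question}\nOptions:\n{opts}\n"
            "Answer with a single letter.")

def extract_choice(reply: str) -> str:
    match = re.search(r"\b([ABCD])\b", reply)
    return match.group(1) if match else ""

def accuracy(items, ask_model):
    correct = 0
    for item in items:
        reply = ask_model(build_prompt(item["context"], item["question"], item["options"]))
        correct += extract_choice(reply) == item["answer"]
    return correct / len(items)

# Toy usage with a stub in place of a ChatGPT/GPT-4 API call:
items = [{"context": "All managers attended the meeting. Kim did not attend.",
          "question": "What follows?",
          "options": ["Kim is a manager", "Kim is not a manager",
                      "Kim attended", "None of the above"],
          "answer": "B"}]
print(accuracy(items, ask_model=lambda prompt: "B"))
```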

YATO: Yet Another deep learning based Text analysis Open toolkit

Sep 28, 2022
Zeqiang Wang, Yile Wang, Jiageng Wu, Zhiyang Teng, Jie Yang

We introduce YATO, an open-source toolkit for text analysis with deep learning. It focuses on fundamental sequence labeling and sequence classification tasks on text. Designed with a hierarchical structure, YATO supports free combinations of three types of features: 1) traditional neural networks (CNN, RNN, etc.); 2) pre-trained language models (BERT, RoBERTa, ELECTRA, etc.); and 3) user-customized neural features specified via a simple configuration file. Thanks to its flexibility and ease of use, YATO facilitates the reproduction and refinement of state-of-the-art NLP models and promotes cross-disciplinary applications of NLP techniques. Source code, examples, and documentation are publicly available at https://github.com/jiesutd/YATO.

METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets

Sep 28, 2022
Peilin Zhou, Zeqiang Wang, Dading Chong, Zhijiang Guo, Yining Hua, Zichang Su, Zhiyang Teng, Jiageng Wu, Jie Yang

The COVID-19 pandemic continues to raise various topics discussed or debated on social media. To explore the impact of the pandemic on people's lives, it is crucial to understand the public's concerns and attitudes towards pandemic-related entities (e.g., drugs, vaccines) on social media. However, models trained on existing named entity recognition (NER) or targeted sentiment analysis (TSA) datasets have limited ability to understand COVID-19-related social media texts because these datasets are not designed or annotated from a medical perspective. This paper releases METS-CoV, a dataset containing medical entities and targeted sentiments from COVID-19-related tweets. METS-CoV contains 10,000 tweets with 7 types of entities, including 4 medical entity types (Disease, Drug, Symptom, and Vaccine) and 3 general entity types (Person, Location, and Organization). To further investigate tweet users' attitudes toward specific entities, 4 entity types (Person, Organization, Drug, and Vaccine) are annotated with user sentiments, resulting in a targeted sentiment dataset with 9,101 entities (in 5,278 tweets). To the best of our knowledge, METS-CoV is the first dataset to collect medical entities and corresponding sentiments from COVID-19-related tweets. We benchmark classical machine learning models and state-of-the-art deep learning models on the NER and TSA tasks with extensive experiments. Results show that the dataset leaves ample room for improvement on both tasks. METS-CoV is an important resource for developing better medical social media tools and facilitating computational social science research, especially in epidemiology. Our data, annotation guidelines, benchmark models, and source code are publicly available (https://github.com/YLab-Open/METS-CoV) to ensure reproducibility.
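
For a quick look at such annotations, a small exploration script might compute entity-type and sentiment distributions; the record layout and field names below are assumptions, and the actual METS-CoV release format may differ (see the repository's documentation).

```python
# Exploration sketch over made-up records; the field names ("entities", "type",
# "sentiment") are assumptions, not necessarily the METS-CoV release format.
from collections import Counter

sample_records = [
    {"tweet": "Got my second vaccine dose, mild headache afterwards.",
     "entities": [{"text": "vaccine", "type": "Vaccine", "sentiment": "positive"},
                  {"text": "headache", "type": "Symptom"}]},
    {"tweet": "Ibuprofen did nothing for my fever.",
     "entities": [{"text": "Ibuprofen", "type": "Drug", "sentiment": "negative"},
                  {"text": "fever", "type": "Symptom"}]},
]

entity_counts, sentiment_counts = Counter(), Counter()
for record in sample_records:
    for ent in record["entities"]:
        entity_counts[ent["type"]] += 1
        if "sentiment" in ent:   # only Person, Organization, Drug, Vaccine carry sentiment
            sentiment_counts[ent["sentiment"]] += 1

print("Entity types:", entity_counts.most_common())
print("Targeted sentiment:", sentiment_counts.most_common())
```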

* 10 pages, 6 figures, 6 tables, accepted by NeurIPS 2022 Datasets and Benchmarks track 