We present a scholarly corpus from the ACL Anthology to assist Open scientific research in the Computational Linguistics domain, named as ACL OCL. Compared with previous ARC and AAN versions, ACL OCL includes structured full-texts with logical sections, references to figures, and links to a large knowledge resource (semantic scholar). ACL OCL contains 74k scientific papers, together with 210k figures extracted up to September 2022. To observe the development in the computational linguistics domain, we detect the topics of all OCL papers with a supervised neural model. We observe ''Syntax: Tagging, Chunking and Parsing'' topic is significantly shrinking and ''Natural Language Generation'' is resurging. Our dataset is open and available to download from HuggingFace in https://huggingface.co/datasets/ACL-OCL/ACL-OCL-Corpus.
Since previous studies on open-domain targeted sentiment analysis are limited in dataset domain variety and sentence level, we propose a novel dataset consisting of 6,013 human-labeled data to extend the data domains in topics of interest and document level. Furthermore, we offer a nested target annotation schema to extract the complete sentiment information in documents, boosting the practicality and effectiveness of open-domain targeted sentiment analysis. Moreover, we leverage the pre-trained model BART in a sequence-to-sequence generation method for the task. Benchmark results show that there exists large room for improvement of open-domain targeted sentiment analysis. Meanwhile, experiments have shown that challenges remain in the effective use of open-domain data, long documents, the complexity of target structure, and domain variances.