Jiayao Zhang

Estimating the Causal Effect of Early ArXiving on Paper Acceptance

Jun 24, 2023
Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, Noah A. Smith

What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this question. We use data from the ICLR conference (2018--2022) and apply methods from causal inference to estimate the effect of arXiving a paper before the reviewing period (early arXiving) on its acceptance to the conference. Adjusting for confounders such as topic, authors, and quality, we may estimate the causal effect. However, since quality is a challenging construct to estimate, we use the negative outcome control method, using paper citation count as a control variable to debias the quality confounding effect. Our results suggest that early arXiving may have a small effect on a paper's chances of acceptance. However, this effect (when it exists) does not differ significantly across different groups of authors, as grouped by author citation count and institute rank. This suggests that early arXiving does not provide an advantage to any particular group.
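The negative outcome control idea can be illustrated with a toy simulation (a minimal sketch under assumed linear effects, not the paper's estimator): the control outcome is driven by the unobserved confounder but not by the treatment, so its association with the treatment estimates the confounding bias, which is then subtracted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Unobserved confounder (e.g. paper "quality").
u = rng.normal(size=n)
# Treatment: early arXiving, more likely for high-quality papers.
t = (rng.normal(size=n) + u > 0).astype(float)
# Outcome: acceptance score, driven by quality plus a small true effect of t.
true_effect = 0.05
y = 0.3 * u + true_effect * t + rng.normal(size=n)
# Negative control outcome: citations, affected by quality but NOT by t.
c = 0.3 * u + rng.normal(size=n)

naive = y[t == 1].mean() - y[t == 0].mean()          # biased by u
control_bias = c[t == 1].mean() - c[t == 0].mean()   # pure confounding signal
debiased = naive - control_bias                      # bias-subtracted estimate

print(f"naive: {naive:.3f}, bias: {control_bias:.3f}, debiased: {debiased:.3f}")
```

The debiased estimate recovers a value close to the simulated true effect, while the naive difference in means is dominated by the quality confounding.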


COLA: Contextualized Commonsense Causal Reasoning from the Causal Inference Perspective

May 09, 2023
Zhaowei Wang, Quyet V. Do, Hongming Zhang, Jiayao Zhang, Weiqi Wang, Tianqing Fang, Yangqiu Song, Ginny Y. Wong, Simon See

Detecting commonsense causal relations (causation) between events has long been an essential yet challenging task. Given that events are complicated, an event may have different causes under various contexts. Thus, exploiting context plays an essential role in detecting causal relations. Meanwhile, previous works about commonsense causation only consider two events and ignore their context, simplifying the task formulation. This paper proposes a new task to detect commonsense causation between two events in an event sequence (i.e., context), called contextualized commonsense causal reasoning. We also design a zero-shot framework: COLA (Contextualized Commonsense Causality Reasoner) to solve the task from the causal inference perspective. This framework obtains rich incidental supervision from temporality and balances covariates from multiple timestamps to remove confounding effects. Our extensive experiments show that COLA can detect commonsense causality more accurately than baselines.

* Accepted to the main conference of ACL 2023 

Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach

Nov 07, 2022
Jiayao Zhang, Hongming Zhang, Zhun Deng, Dan Roth

The double-blind peer review mechanism has become the skeleton of academic research across multiple disciplines, including computer science, yet several studies have questioned the quality of peer reviews and raised concerns about potential biases in the process. In this paper, we conduct a thorough and rigorous study of fairness disparities in peer review with the help of large language models (LMs). We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date by aggregating data from OpenReview, Google Scholar, arXiv, and CSRanking, and by extracting high-level features using language models. We postulate and study fairness disparities on multiple protected attributes of interest, including author gender, geography, and author and institutional prestige. We observe that the level of disparity differs across attributes, and that textual features are essential in reducing biases in predictive modeling. We distill several insights from our analysis on studying the peer review process with the help of large LMs. Our database also provides avenues for studying new natural language processing (NLP) methods that facilitate the understanding of the peer review mechanism. We study a concrete example of automatic machine review systems and provide baseline models for the review generation and scoring tasks so that the database can be used as a benchmark.


High-Resolution Depth Estimation for 360-degree Panoramas through Perspective and Panoramic Depth Images Registration

Oct 20, 2022
Chi-Han Peng, Jiayao Zhang

We propose a novel approach to compute high-resolution (2048x1024 and higher) depths for panoramas that is significantly faster and both quantitatively and qualitatively more accurate than the current state-of-the-art method (360MonoDepth). As traditional neural network-based methods are limited in output image size (up to 1024x512) by GPU memory constraints, both 360MonoDepth and our method rely on stitching multiple perspective disparity or depth images into a unified panoramic depth map. However, to achieve globally consistent stitching, 360MonoDepth relies on solving extensive disparity map alignment and Poisson-based blending problems, leading to high computation time. Instead, we propose to use an existing panoramic depth map (computed in real time by any panorama-based method) as the common target for the individual perspective depth maps to register to. This key idea makes producing globally consistent stitching results a straightforward task. Our experiments show that our method generates qualitatively better results than existing panorama-based methods, and further outperforms them quantitatively on datasets unseen by these methods.

* IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023, to appear 
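The registration step can be sketched as a least-squares affine alignment (a deliberately simplified toy, not the paper's full registration): each perspective depth map, defined only up to an unknown scale and offset, is fitted to the common panoramic target.

```python
import numpy as np

def register_depth(perspective_depth, target_depth):
    """Align a perspective depth map to the panoramic target by solving
    min_{a,b} || a * d + b - target ||^2, resolving the scale/offset
    ambiguity of monocular depth predictions."""
    d = perspective_depth.ravel()
    t = target_depth.ravel()
    A = np.stack([d, np.ones_like(d)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, t, rcond=None)
    return a * perspective_depth + b

# Toy check: a map that is off by scale 2 and offset 1 is recovered.
rng = np.random.default_rng(0)
target = rng.uniform(1.0, 10.0, size=(64, 128))
pred = (target - 1.0) / 2.0
aligned = register_depth(pred, target)
print(np.allclose(aligned, target))  # True
```

Because every perspective map registers to the same target, global consistency comes for free, which is the key idea the abstract describes.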

FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data

Jun 06, 2022
Zhun Deng, Jiayao Zhang, Linjun Zhang, Ting Ye, Yates Coley, Weijie J. Su, James Zou

Algorithmic fairness plays an important role in machine learning and imposing fairness constraints during learning is a common approach. However, many datasets are imbalanced in certain label classes (e.g. "healthy") and sensitive subgroups (e.g. "older patients"). Empirically, this imbalance leads to a lack of generalizability not only of classification, but also of fairness properties, especially in over-parameterized models. For example, fairness-aware training may ensure equalized odds (EO) on the training data, but EO is far from being satisfied on new users. In this paper, we propose a theoretically-principled, yet Flexible approach that is Imbalance-Fairness-Aware (FIFA). Specifically, FIFA encourages both classification and fairness generalization and can be flexibly combined with many existing fair learning methods with logits-based losses. While our main focus is on EO, FIFA can be directly applied to achieve equalized opportunity (EqOpt); and under certain conditions, it can also be applied to other fairness notions. We demonstrate the power of FIFA by combining it with a popular fair classification algorithm, and the resulting algorithm achieves significantly better fairness generalization on several real-world datasets.
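As background for the equalized odds (EO) notion discussed above, here is a minimal sketch of how an EO gap can be measured on a labeled dataset (an illustration of the fairness metric, not the FIFA method itself):

```python
import numpy as np

def eo_gap(y_true, y_pred, group):
    """Equalized-odds gap: the largest difference in group-conditional
    positive-prediction rates, taken over true labels y in {0, 1}."""
    gaps = []
    for y in (0, 1):
        rates = []
        for g in (0, 1):
            mask = (y_true == y) & (group == g)
            rates.append(y_pred[mask].mean())
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)

# Example: a predictor that flags group 1 far more often among true negatives.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
group  = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 1, 1, 1])
print(eo_gap(y_true, y_pred, group))  # 1.0
```

The generalization issue the abstract raises is that this gap can be near zero on training data yet large when recomputed on held-out users, especially for imbalanced subgroups.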


Some Reflections on Drawing Causal Inference using Textual Data: Parallels Between Human Subjects and Organized Texts

Feb 02, 2022
Bo Zhang, Jiayao Zhang

We examine the role of textual data as study units when conducting causal inference by drawing parallels between human subjects and organized texts. We elaborate on key causal concepts and principles, and expose some ambiguities and occasional fallacies. To facilitate better framing of a causal query, we discuss two strategies: (i) shifting from immutable traits to perceptions of them, and (ii) shifting from an abstract concept/property to its constituent parts, i.e., adopting a constructivist perspective of an abstract concept. We hope this article will raise awareness of the importance of articulating and clarifying fundamental concepts before delving into developing methodologies when drawing causal inference using textual data.

* Accepted to CLeaR 2022 

Causal Inference Principles for Reasoning about Commonsense Causality

Jan 31, 2022
Jiayao Zhang, Hongming Zhang, Dan Roth, Weijie J. Su

Commonsense causality reasoning (CCR) aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person. Although of great academic and practical interest, this problem is still shadowed by the lack of a well-posed theoretical framework; existing work usually relies wholeheartedly on deep language models and is potentially susceptible to confounding co-occurrences. Motivated by classical causal principles, we articulate the central question of CCR and draw parallels between human subjects in observational studies and natural languages to adapt CCR to the potential-outcomes framework, which is the first such attempt for commonsense tasks. We propose a novel framework, ROCK, to Reason O(A)bout Commonsense K(C)ausality, which utilizes temporal signals as incidental supervision and balances confounding effects using temporal propensities that are analogous to propensity scores. The ROCK implementation is modular and zero-shot, and demonstrates good CCR capabilities on various datasets.
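A heavily simplified sketch of the kind of estimate this setup suggests (all numbers and the weighting scheme here are hypothetical; the actual ROCK pipeline uses language models to score temporal precedence): average the indicator that the candidate effect temporally follows the candidate cause, reweighted by inverse temporal propensities so covariate events are balanced.

```python
import numpy as np

def ccr_score(follows, propensity):
    """Toy propensity-weighted causality score: 'follows' indicates whether
    the effect event temporally followed the cause in each sampled context;
    'propensity' is the (hypothetical) temporal propensity of the covariate
    events, used as an inverse weight to balance confounding."""
    w = 1.0 / np.clip(propensity, 1e-3, None)
    return float(np.average(follows, weights=w))

# Three sampled contexts with made-up scores.
follows = np.array([1.0, 0.0, 1.0])
propensity = np.array([0.5, 0.25, 1.0])
print(round(ccr_score(follows, propensity), 3))  # 0.429
```

A higher score suggests the temporal (and, under the framework's assumptions, causal) association survives after balancing, mirroring how propensity scores are used in observational studies.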


Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations

Oct 11, 2021
Jiayao Zhang, Hua Wang, Weijie J. Su

Understanding the training dynamics of deep learning models is perhaps a necessary step toward demystifying the effectiveness of these models. In particular, how do data from different classes gradually become separable in their feature spaces when training neural networks using stochastic gradient descent? In this study, we model the evolution of features during deep learning training using a set of stochastic differential equations (SDEs), each corresponding to a training sample. As a crucial ingredient in our modeling strategy, each SDE contains a drift term that reflects the impact of backpropagation at an input on the features of all samples. Our main finding uncovers a sharp phase transition phenomenon regarding the intra-class impact: if the SDEs are locally elastic in the sense that the impact is more significant on samples from the same class as the input, the features of the training data become linearly separable, meaning vanishing training loss; otherwise, the features are not separable, no matter how long the training time is. Moreover, in the presence of local elasticity, an analysis of our SDEs shows the emergence of a simple geometric structure called neural collapse of the features. Taken together, our results shed light on the decisive role of local elasticity in the training dynamics of neural networks. We corroborate our theoretical analysis with experiments on a synthesized dataset of geometric shapes and CIFAR-10.

* Accepted to NeurIPS 2021 
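The locally elastic picture can be illustrated with a one-dimensional toy simulation (an illustrative sketch, not the paper's model): each sample's feature receives drift from every sample, weighted more heavily within the same class; when the intra-class weight exceeds the inter-class weight, the two classes drift apart and become separable.

```python
import numpy as np

rng = np.random.default_rng(0)
n, steps, dt, sigma = 50, 200, 0.01, 0.1
labels = np.repeat([1.0, -1.0], n // 2)   # two balanced classes

def simulate(e_intra, e_inter):
    """Euler-Maruyama on a toy 1-D version of the feature SDEs: the drift
    on sample i aggregates the impact of every sample j, with weight
    e_intra for same-class pairs and e_inter for cross-class pairs."""
    x = rng.normal(scale=0.1, size=n)
    same = labels[:, None] == labels[None, :]
    impact = np.where(same, e_intra, e_inter)
    for _ in range(steps):
        drift = impact @ labels / n        # aggregated backprop impact
        x += drift * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
    return x

sep = simulate(e_intra=1.0, e_inter=0.2)   # locally elastic: classes separate
flat = simulate(e_intra=0.6, e_inter=0.6)  # no elasticity: drift cancels out
print((np.sign(sep) == labels).mean())
```

With elasticity the net drift on each sample is proportional to its own label, pushing the classes apart; with equal weights the drifts cancel and the features stay mixed, mirroring the phase transition described above.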

Synthetic 3D Data Generation Pipeline for Geometric Deep Learning in Architecture

Apr 26, 2021
Stanislava Fedorova, Alberto Tono, Meher Shashwat Nigam, Jiayao Zhang, Amirhossein Ahmadnia, Cecilia Bolognesi, Dominik L. Michels

With the growing interest in deep learning algorithms and computational design in the architectural field, the need for large, accessible, and diverse architectural datasets increases. We tackle this problem by constructing a field-specific synthetic data generation pipeline that generates an arbitrary amount of 3D data along with the associated 2D and 3D annotations. The variety of annotations and the flexibility to customize the generated building and dataset parameters make this framework suitable for multiple deep learning tasks, including geometric deep learning that requires direct 3D supervision. In creating our building data generation pipeline, we leveraged architectural knowledge from experts in order to construct a framework that is modular and extendable and provides a sufficient amount of class-balanced data samples. Moreover, we purposefully involve the researcher in dataset customization, allowing the introduction of additional building components, material textures, building classes, and the number and type of annotations, as well as the number of views per 3D model sample. In this way, the framework can satisfy different research requirements and is adaptable to a large variety of tasks. All code and data are made publicly available.

* Project Page: https://cdinstitute.github.io/Building-Dataset-Generator/ 

Grassmannian Learning: Embedding Geometry Awareness in Shallow and Deep Learning

Aug 13, 2018
Jiayao Zhang, Guangxu Zhu, Robert W. Heath Jr., Kaibin Huang

Modern machine learning algorithms have been adopted in a range of signal-processing applications spanning computer vision, natural language processing, and artificial intelligence. Many relevant problems involve subspace-structured features, orthogonality-constrained or low-rank-constrained objective functions, or subspace distances. These mathematical characteristics are expressed naturally using the Grassmann manifold. Unfortunately, this fact has not yet been explored in many traditional learning algorithms. In the last few years, there has been growing interest in studying the Grassmann manifold to tackle new learning problems. Such attempts have been encouraged by substantial performance improvements in both classic learning and learning using deep neural networks. We term the former shallow and the latter deep Grassmannian learning. The aim of this paper is to introduce the emerging area of Grassmannian learning by surveying common mathematical problems and primary solution approaches, and by overviewing various applications. We hope to inspire practitioners in different fields to adopt the powerful tool of Grassmannian learning in their research.

* Submitted to IEEE Signal Processing Magazine 
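A core computation behind the subspace distances mentioned above is the geodesic distance on the Grassmann manifold, obtained from the principal angles between two subspaces. A standard construction, sketched with only NumPy:

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance on the Grassmann manifold between the column
    spaces of A and B: the 2-norm of the principal angles, which are the
    arccosines of the singular values of Qa^T Qb."""
    Qa, _ = np.linalg.qr(A)                     # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)                     # orthonormal basis of span(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))    # principal angles
    return float(np.linalg.norm(theta))

# Identical subspaces are at distance 0; orthogonal 1-D subspaces at pi/2.
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
print(grassmann_distance(e1, e1))            # 0.0
print(round(grassmann_distance(e1, e2), 4))  # 1.5708
```

Distances like this let both shallow methods (e.g. nearest-subspace classifiers) and deep networks respect the geometry of subspace-structured features rather than treating basis matrices as flat vectors.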