Qing Wang

Automatic Comment Generation via Multi-Pass Deliberation

Sep 14, 2022
Fangwen Mu, Xiao Chen, Lin Shi, Song Wang, Qing Wang

Deliberation is a common and natural behavior in human daily life. For example, when writing papers or articles, we usually first write drafts, and then iteratively polish them until satisfied. In light of such a human cognitive process, we propose DECOM, which is a multi-pass deliberation framework for automatic comment generation. DECOM consists of multiple Deliberation Models and one Evaluation Model. Given a code snippet, we first extract keywords from the code and retrieve a similar code fragment from a pre-defined corpus. Then, we treat the comment of the retrieved code as the initial draft and input it with the code and keywords into DECOM to start the iterative deliberation process. At each deliberation, the deliberation model polishes the draft and generates a new comment. The evaluation model measures the quality of the newly generated comment to determine whether to end the iterative process or not. When the iterative process is terminated, the best-generated comment will be selected as the target comment. Our approach is evaluated on two real-world datasets in Java (87K) and Python (108K), and experiment results show that our approach outperforms the state-of-the-art baselines. A human evaluation study also confirms the comments generated by DECOM tend to be more readable, informative, and useful.
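The iterative process described above can be sketched as a short loop. This is a minimal illustration, not the DECOM implementation: the function names, the toy polisher and scorer, and the stopping rule (stop as soon as a pass does not improve the score) are all assumptions made for the example.

```python
from typing import Callable, Sequence

# Hypothetical signatures: a polisher maps (code, keywords, draft) to a new
# comment, and a scorer maps (code, comment) to a quality score.
Polisher = Callable[[str, Sequence[str], str], str]
Scorer = Callable[[str, str], float]


def multi_pass_deliberation(code: str, keywords: Sequence[str],
                            initial_draft: str,
                            polishers: Sequence[Polisher],
                            score: Scorer) -> str:
    """Polish a draft pass by pass and return the best-scoring comment."""
    best, best_score = initial_draft, score(code, initial_draft)
    draft = initial_draft
    for polish in polishers:
        draft = polish(code, keywords, draft)
        draft_score = score(code, draft)
        if draft_score <= best_score:
            break  # assumed stopping rule: this pass brought no improvement
        best, best_score = draft, draft_score
    return best


def toy_polish(code: str, keywords: Sequence[str], draft: str) -> str:
    """Toy stand-in for a deliberation model: append one missing keyword."""
    for kw in keywords:
        if kw not in draft.split():
            return draft + " " + kw
    return draft


def make_toy_score(keywords: Sequence[str]) -> Scorer:
    """Toy stand-in for the evaluation model: keyword coverage in [0, 1]."""
    def score(code: str, comment: str) -> float:
        words = comment.split()
        return sum(kw in words for kw in keywords) / len(keywords)
    return score


keywords = ["sum", "list"]
result = multi_pass_deliberation(
    code="def total(xs): return sum(xs)",
    keywords=keywords,
    initial_draft="returns a value",  # stands in for the retrieved comment
    polishers=[toy_polish] * 3,
    score=make_toy_score(keywords),
)
print(result)
```

In a real system the polishers and the scorer would be learned models; the loop only shows how the draft, the per-pass evaluation, and the final selection fit together.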

Tensor Decomposition based Personalized Federated Learning

Aug 27, 2022
Qing Wang, Jing Jin, Xiaofeng Liu, Huixuan Zong, Yunfeng Shao, Yinchuan Li

Federated learning (FL) is a new distributed machine learning framework that enables reliable collaborative training without collecting users' private data. However, because of its frequent communication and averaging-based aggregation strategy, FL has difficulty scaling to statistically diverse data and large-scale models. In this paper, we propose a personalized FL framework, named Tensor Decomposition based Personalized Federated Learning (TDPFed), in which we design a novel tensorized local model with tensorized linear layers and convolutional layers to reduce the communication cost. TDPFed uses a bi-level loss function to decouple personalized model optimization from the global model learning by controlling the gap between the personalized model and the tensorized local model. Moreover, an effective distributed learning strategy and two different model aggregation strategies are designed for the proposed TDPFed framework. Theoretical convergence analysis and thorough experiments demonstrate that our proposed TDPFed framework achieves state-of-the-art performance while reducing the communication cost.
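To see why a factorized layer can lower communication cost, the sketch below compares the number of parameters a client would send for a dense linear layer with the number for a factorized one. The abstract does not say which tensor decomposition TDPFed uses, so this example swaps in a plain rank-r matrix factorization (W ≈ U V) purely for illustration; the layer size and rank are made-up values.

```python
def dense_params(rows: int, cols: int) -> int:
    """Parameters in a dense weight matrix of shape (rows, cols)."""
    return rows * cols


def low_rank_params(rows: int, cols: int, rank: int) -> int:
    """Parameters in factors U (rows x rank) and V (rank x cols)."""
    return rank * (rows + cols)


# Hypothetical layer size and rank, chosen only for the illustration.
rows, cols, rank = 1024, 1024, 16
dense = dense_params(rows, cols)
factored = low_rank_params(rows, cols, rank)
print(dense, factored, dense / factored)
```

Under these assumed sizes, each round transmits the factors instead of the full matrix, so the payload for this layer shrinks by the ratio printed at the end.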

DBE-KT22: A Knowledge Tracing Dataset Based on Online Student Evaluation

Aug 19, 2022
Ghodai Abdelrahman, Sherif Abdelfattah, Qing Wang, Yu Lin

Online education has gained increasing importance over the last decade for providing affordable high-quality education to students worldwide. This has been further magnified during the global pandemic as more students switched to studying online. The majority of online education tasks, e.g., course recommendation, exercise recommendation, or automated evaluation, depend on tracking students' knowledge progress. This is known as the Knowledge Tracing problem in the literature. Addressing this problem requires collecting student evaluation data that can reflect their knowledge evolution over time. In this paper, we propose a new knowledge tracing dataset named Database Exercises for Knowledge Tracing (DBE-KT22) that is collected from an online student exercise system in a course taught at the Australian National University in Australia. We discuss the characteristics of the DBE-KT22 dataset and contrast it with the existing datasets in the knowledge tracing literature. Our dataset is available for public access through the Australian Data Archive platform.

Backend Ensemble for Speaker Verification and Spoofing Countermeasure

Jul 05, 2022
Li Zhang, Yue Li, Huan Zhao, Qing Wang, Lei Xie

This paper describes the NPU system submitted to the Spoofing Aware Speaker Verification Challenge 2022. We particularly focus on the backend ensemble for speaker verification and spoofing countermeasure from three aspects. Firstly, besides simple concatenation, we propose circulant matrix transformation and stacking for speaker embeddings and countermeasure embeddings. With the stacking operation of newly-defined circulant embeddings, we explore almost all the possible interactions between speaker embeddings and countermeasure embeddings. Secondly, we try different convolutional neural networks to selectively fuse the embeddings' salient regions into channels with convolution kernels. Finally, we design parallel attention in 1D convolutional neural networks to learn the global correlation in channel dimensions as well as to learn the important parts in feature dimensions. Meanwhile, we embed squeeze-and-excitation attention in 2D convolutional neural networks to learn the global dependence among speaker embeddings and countermeasure embeddings. Experimental results demonstrate that all the above methods are effective. After fusion of four well-trained models enhanced by the mentioned methods, the best SASV-EER, SPF-EER and SV-EER we achieve are 0.559%, 0.354% and 0.857% on the evaluation set, respectively. Together with the above contributions, our submission system achieved fifth place in this challenge.
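The circulant transformation and stacking mentioned above can be illustrated with a small sketch. It uses the standard definition of a circulant matrix (each row is the previous row cyclically shifted by one position); how the paper defines its circulant embeddings and how it stacks them is not given in the abstract, so the function names and the two-channel layout below are assumptions.

```python
from typing import List, Sequence


def circulant(vec: Sequence[float]) -> List[List[float]]:
    """Return the n x n circulant matrix whose i-th row is vec shifted right by i."""
    values = list(vec)
    n = len(values)
    return [values[n - i:] + values[:n - i] for i in range(n)]


def stack_circulant(speaker: Sequence[float],
                    countermeasure: Sequence[float]) -> List[List[List[float]]]:
    """Stack the two circulant matrices as a 2 x n x n, two-channel input.

    Assumed layout: channel 0 is the speaker embedding, channel 1 the
    countermeasure embedding, ready for a 2D convolutional backend.
    """
    if len(speaker) != len(countermeasure):
        raise ValueError("embeddings must have the same length to be stacked")
    return [circulant(speaker), circulant(countermeasure)]


print(circulant([1, 2, 3]))
print(stack_circulant([1, 2], [3, 4]))
```

Because every cyclic shift of each embedding appears as a row, a small 2D kernel sliding over the stacked matrices sees many different pairings of speaker and countermeasure dimensions.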

Restructuring Graph for Higher Homophily via Learnable Spectral Clustering

Jun 06, 2022
Shouheng Li, Dongwoo Kim, Qing Wang

While a growing body of literature has been studying new Graph Neural Networks (GNNs) that work on both homophilic and heterophilic graphs, little work has been done on adapting classical GNNs to less-homophilic graphs. Although lacking the ability to work with less-homophilic graphs, classical GNNs still stand out in some properties such as efficiency, simplicity and explainability. We propose a novel graph restructuring method to maximize the benefit of prevalent GNNs with the homophilic assumption. Our contribution is threefold: a) learning the weight of pseudo-eigenvectors for an adaptive spectral clustering that aligns well with known node labels, b) proposing a new homophilic metric that measures how likely two nodes with the same label are to be connected, and c) reconstructing the adjacency matrix based on the result of adaptive spectral clustering to maximize the homophilic scores. The experimental results show that our graph restructuring method can significantly boost the performance of six classical GNNs by an average of 25% on less-homophilic graphs. The boosted performance is comparable to state-of-the-art methods.

* 17 pages, 8 figures
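For readers unfamiliar with homophily, the sketch below computes the commonly used edge homophily ratio: the fraction of edges whose two endpoints share a label. This is the standard baseline measure, not the new metric proposed in the paper, whose definition is not given in the abstract; the tiny graph is made up for illustration.

```python
from typing import Sequence, Tuple


def edge_homophily(edges: Sequence[Tuple[int, int]],
                   labels: Sequence[int]) -> float:
    """Fraction of edges that connect two nodes with the same label."""
    if not edges:
        raise ValueError("the graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical path graph 0-1-2-3 with two classes.
edges = [(0, 1), (1, 2), (2, 3)]
labels = [0, 0, 1, 1]
print(edge_homophily(edges, labels))
```

A value near 1 indicates a homophilic graph, the setting classical GNNs assume; a restructuring method of the kind described above aims to rewire a low-scoring graph toward a higher score.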

NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video: Dataset, Methods and Results

Apr 25, 2022
Ren Yang, Radu Timofte, Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen, Youcheng Ben, Xiao Zhou, Chen Fu, Pei Cheng, Gang Yu, Junyi Li, Renlong Wu, Zhilu Zhang, Wei Shang, Zhengyao Lv, Yunjin Chen, Mingcai Zhou, Dongwei Ren, Kai Zhang, Wangmeng Zuo, Pavel Ostyakov, Vyal Dmitry, Shakarim Soltanayev, Chervontsev Sergey, Zhussip Magauiya, Xueyi Zou, Youliang Yan, Pablo Navarrete Michelini, Yunhua Lu, Diankai Zhang, Shaoli Liu, Si Gao, Biao Wu, Chengjian Zheng, Xiaofeng Zhang, Kaidi Lu, Ning Wang, Thuong Nguyen Canh, Thong Bach, Qing Wang, Xiaopeng Sun, Haoyu Ma, Shijie Zhao, Junlin Li, Liangbin Xie, Shuwei Shi, Yujiu Yang, Xintao Wang, Jinjin Gu, Chao Dong, Xiaodi Shi, Chunmei Nian, Dong Jiang, Jucai Lin, Zhihuai Xie, Mao Ye, Dengyan Luo, Liuhan Peng, Shengjie Chen, Xin Liu, Qian Wang, Xin Liu, Boyang Liang, Hang Dong, Yuhao Huang, Kai Chen, Xingbei Guo, Yujing Sun, Huilei Wu, Pengxu Wei, Yulin Huang, Junying Chen, Ik Hyun Lee, Sunder Ali Khowaja, Jiseok Yoon

This paper reviews the NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video. In this challenge, we proposed the LDV 2.0 dataset, which includes the LDV dataset (240 videos) and 95 additional videos. This challenge includes three tracks. Track 1 aims at enhancing the videos compressed by HEVC at a fixed QP. Track 2 and Track 3 target both the super-resolution and quality enhancement of HEVC compressed video. They require x2 and x4 super-resolution, respectively. In total, the three tracks attracted more than 600 registrations. In the test phase, 8 teams, 8 teams and 12 teams submitted the final results to Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state of the art in super-resolution and quality enhancement of compressed video. The proposed LDV 2.0 dataset is available at https://github.com/RenYang-home/LDV_dataset. The homepage of this challenge (including open-sourced codes) is at https://github.com/RenYang-home/NTIRE22_VEnh_SR.

Epipolar Focus Spectrum: A Novel Light Field Representation and Application in Dense-view Reconstruction

Apr 01, 2022
Yaning Li, Xue Wang, Hao Zhu, Guoqing Zhou, Qing Wang

Existing light field representations, such as epipolar plane image (EPI) and sub-aperture images, do not consider the structural characteristics across the views, so they usually require additional disparity and spatial structure cues for follow-up tasks. Besides, they have difficulties dealing with occlusions or larger disparity scenes. To this end, this paper proposes a novel Epipolar Focus Spectrum (EFS) representation by rearranging the EPI spectrum. Different from the classical EPI representation where an EPI line corresponds to a specific depth, there is a one-to-one mapping from the EFS line to the view. Accordingly, compared to a sparsely-sampled light field, a densely-sampled one with the same field of view (FoV) leads to a more compact distribution of such linear structures in the double-cone-shaped region with the identical opening angle in its corresponding EFS. Hence the EFS representation is invariant to the scene depth. To demonstrate its effectiveness, we develop a trainable EFS-based pipeline for light field reconstruction, where a dense light field can be reconstructed by compensating the "missing EFS lines" given a sparse light field, yielding promising results with cross-view consistency, especially in the presence of severe occlusion and large disparity. Experimental results on both synthetic and real-world datasets demonstrate the validity and superiority of the proposed method over SOTA methods.

* Light field representation, Epipolar Focus Spectrum (EFS), Dense light field reconstruction, Depth independent, Frequency domain

Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis

Apr 01, 2022
Qi Zhang, Xin Huang, Ying Feng, Xue Wang, Hongdong Li, Qing Wang

This paper studies the problem of view synthesis with a certain amount of rotation from a pair of images, which we call stereo unstructured magnification. While the multi-plane image representation is well suited for view synthesis with depth invariance, how to generalize it to unstructured views remains a significant challenge. This is primarily due to the depth-dependency caused by camera frontal parallel representation. Here we propose a novel multiple homography image (MHI) representation, comprising a set of scene planes with fixed normals and distances. A two-stage network is developed for novel view synthesis. Stage-1 is an MHI reconstruction module that predicts the MHIs and composites layered multi-normal images along the normal direction. Stage-2 is a normal-blending module to find blending weights. We also derive an angle-based cost to guide the blending of multi-normal images by exploiting per-normal geometry. Compared with the state-of-the-art methods, our method achieves superior performance for view synthesis qualitatively and quantitatively, especially for cases when the cameras undergo rotations.

A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification

Mar 31, 2022
Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee

In this paper, we propose two techniques, namely joint modeling and data augmentation, to improve system performance for audio-visual scene classification (AVSC). We employ pre-trained networks trained only on image data sets to extract video embeddings, whereas we train the audio embedding models from scratch. We explore different neural network architectures for joint modeling to effectively combine the video and audio modalities. Moreover, data augmentation strategies are investigated to increase audio-visual training set size. For the video modality, the effectiveness of several operations in RandAugment is verified. An audio-video joint mixup scheme is proposed to further improve AVSC performance. Evaluated on the development set of TAU Urban Audio Visual Scenes 2021, our final system can achieve the best accuracy of 94.2% among all single AVSC systems submitted to DCASE 2021 Task 1b.

* 5 pages, 1 figure, submitted to INTERSPEECH 2022
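The joint mixup idea in the abstract above can be sketched as follows. Standard mixup blends two training examples and their labels with a weight drawn from a Beta distribution; the assumption made here is that "joint" means the same weight is shared by the audio input, the video input, and the label. The dictionary layout, the `alpha` default, and the optional fixed `lam` argument are illustrative choices, not details from the paper.

```python
import random
from typing import Dict, List, Optional

Sample = Dict[str, List[float]]  # hypothetical layout: audio, video, label


def mix(a: List[float], b: List[float], lam: float) -> List[float]:
    """Element-wise convex combination lam * a + (1 - lam) * b."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def joint_mixup(sample_a: Sample, sample_b: Sample,
                alpha: float = 0.2, lam: Optional[float] = None) -> Sample:
    """Mix two audio-visual samples using one shared mixing weight."""
    if lam is None:
        lam = random.betavariate(alpha, alpha)  # standard mixup sampling
    return {key: mix(sample_a[key], sample_b[key], lam)
            for key in ("audio", "video", "label")}


a = {"audio": [0.0, 4.0], "video": [1.0, 1.0], "label": [1.0, 0.0]}
b = {"audio": [4.0, 0.0], "video": [3.0, 5.0], "label": [0.0, 1.0]}
mixed = joint_mixup(a, b, lam=0.25)  # fixed weight for a reproducible demo
print(mixed)
```

Sharing one weight keeps the two modalities and the soft label consistent with each other, which is the property a joint scheme needs regardless of how the weight is sampled.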
