Han Wang

FedADMM: A Federated Primal-Dual Algorithm Allowing Partial Participation

Mar 28, 2022
Han Wang, Siddartha Marella, James Anderson

Federated learning is a framework for distributed optimization that places emphasis on communication efficiency. It follows a client-server broadcast model and is particularly appealing because of its ability to accommodate heterogeneity in client compute and storage resources, non-i.i.d. data assumptions, and data privacy. Our contribution is to offer a new federated learning algorithm, FedADMM, for solving non-convex composite optimization problems with non-smooth regularizers. We prove convergence of FedADMM for the case when not all clients are able to participate in a given communication round under a very general sampling model.

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

Feb 02, 2022
Stephen H. Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Mike Tian-Jian Jiang, Alexander M. Rush

PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a dataset to a natural language input and target output. Using prompts to train and query language models is an emerging area in NLP that requires new tools that let users develop and refine these prompts collaboratively. PromptSource addresses the emergent challenges in this new setting with (1) a templating language for defining data-linked prompts, (2) an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and (3) a community-driven set of guidelines for contributing new prompts to a common pool. Over 2,000 prompts for roughly 170 datasets are already available in PromptSource. PromptSource is available at https://github.com/bigscience-workshop/promptsource.
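
The idea that a prompt is a function from a dataset example to an input and a target can be illustrated with a minimal sketch. It uses only plain Python format strings and is not PromptSource's own templating language or API; the `Prompt` class, its `apply` method, and the example fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Prompt:
    """A hypothetical prompt: two format strings filled from one dataset example."""

    input_template: str
    target_template: str

    def apply(self, example: Dict[str, str]) -> Tuple[str, str]:
        # Map one dataset example to a (natural language input, target output) pair.
        return (
            self.input_template.format(**example),
            self.target_template.format(**example),
        )


# Hypothetical inference-style example and prompt (not taken from PromptSource).
nli_prompt = Prompt(
    input_template='{premise} Question: does this imply that "{hypothesis}"? Yes or no?',
    target_template="{answer}",
)
example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outside.",
    "answer": "Yes",
}
prompt_input, prompt_target = nli_prompt.apply(example)
print(prompt_input)
print(prompt_target)
```

Because the prompt is data-linked, the same `Prompt` object can be applied to every example of a dataset, which is what makes it quick to inspect its outputs on many examples.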

Learning Linear Models Using Distributed Iterative Hessian Sketching

Dec 08, 2021
Han Wang, James Anderson

This work considers the problem of learning the Markov parameters of a linear system from observed data. Recent non-asymptotic system identification results have characterized the sample complexity of this problem in the single and multi-rollout setting. In both instances, the number of samples required in order to obtain acceptable estimates can produce optimization problems with an intractably large number of decision variables for a second-order algorithm. We show that a randomized and distributed Newton algorithm based on Hessian-sketching can produce $\epsilon$-optimal solutions and converges geometrically. Moreover, the algorithm is trivially parallelizable. Our results hold for a variety of sketching matrices and we illustrate the theory with numerical examples.
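
To give a feel for Hessian sketching, here is a minimal single-process sketch of an iterative Hessian sketch for a small least-squares problem. It is not the authors' distributed algorithm: the two-parameter problem, the Gaussian sketching matrix, the problem sizes, and all function names are assumptions made for illustration. Each iteration uses the exact gradient but replaces the Hessian A^T A with (SA)^T (SA) for a fresh random sketch S.

```python
import random
from typing import List, Tuple

Row = Tuple[float, float]


def solve_2x2(h: List[List[float]], g: List[float]) -> List[float]:
    """Solve the 2x2 linear system h @ step = g by Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [
        (g[0] * h[1][1] - h[0][1] * g[1]) / det,
        (h[0][0] * g[1] - h[1][0] * g[0]) / det,
    ]


def neg_gradient(a: List[Row], b: List[float], x: List[float]) -> List[float]:
    """Return A^T (b - A x), the negative gradient of 0.5 * ||A x - b||^2."""
    g = [0.0, 0.0]
    for (a0, a1), bi in zip(a, b):
        r = bi - (a0 * x[0] + a1 * x[1])
        g[0] += a0 * r
        g[1] += a1 * r
    return g


def sketched_hessian(a: List[Row], m: int, rng: random.Random) -> List[List[float]]:
    """Return (S A)^T (S A) for a fresh m x n Gaussian sketch S with N(0, 1/m) entries."""
    sa = [[0.0, 0.0] for _ in range(m)]
    sigma = m ** -0.5
    for a0, a1 in a:
        for row in sa:
            s = rng.gauss(0.0, sigma)
            row[0] += s * a0
            row[1] += s * a1
    h00 = sum(r[0] * r[0] for r in sa)
    h01 = sum(r[0] * r[1] for r in sa)
    h11 = sum(r[1] * r[1] for r in sa)
    return [[h00, h01], [h01, h11]]


def iterative_hessian_sketch(
    a: List[Row], b: List[float], m: int, iterations: int, seed: int = 0
) -> List[float]:
    """Newton-type iterations with an exact gradient and a sketched Hessian."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(iterations):
        step = solve_2x2(sketched_hessian(a, m, rng), neg_gradient(a, b, x))
        x = [x[0] + step[0], x[1] + step[1]]
    return x


# Synthetic noiseless data (assumed for illustration), so the minimizer is x_true.
data_rng = random.Random(1)
x_true = [2.0, -1.0]
A = [(data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)) for _ in range(400)]
b = [a0 * x_true[0] + a1 * x_true[1] for a0, a1 in A]

x_hat = iterative_hessian_sketch(A, b, m=100, iterations=15)
print(x_hat)
```

In this toy setting the sketch has 100 rows against 400 data rows, so the saving is modest; the approach is aimed at problems where the number of samples is far larger than the number of decision variables.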

Decomposing Complex Questions Makes Multi-Hop QA Easier and More Interpretable

Oct 26, 2021
Ruiliu Fu, Han Wang, Xuejun Zhang, Jun Zhou, Yonghong Yan

Multi-hop QA requires a machine to answer complex questions by finding multiple clues and reasoning over them, and to provide explanatory evidence that demonstrates its reasoning process. We propose Relation Extractor-Reader and Comparator (RERC), a three-stage framework based on complex question decomposition. This is the first work to propose the RERC model and apply it to multi-hop QA. The Relation Extractor decomposes the complex question, the Reader then answers the sub-questions in turn, and finally the Comparator performs numerical comparison and summarizes the results to produce the final answer; the entire process itself constitutes a complete reasoning evidence path. On the 2WikiMultiHopQA dataset, our RERC model achieves state-of-the-art performance, with a leaderboard-topping joint F1 score of 53.58. All metrics of RERC are close to human performance; its supporting-fact F1 score is only 1.95 below the human level. At the same time, the evidence path provided by the RERC framework has excellent readability and faithfulness.

* Accepted to EMNLP2021 Findings Long Paper

Reminding the Incremental Language Model via Data-Free Self-Distillation

Oct 17, 2021
Han Wang, Ruiliu Fu, Chengzhang Li, Xuejun Zhang, Jun Zhou, Yonghong Yan

Incremental language learning with pseudo-data can alleviate catastrophic forgetting in neural networks. However, to obtain better performance, previous methods place high demands on the pseudo-data of previous tasks, and performance decreases dramatically when less pseudo-data is employed. In addition, the distribution of the pseudo-data gradually deviates from the real data as different tasks are learned sequentially. The deviation grows as more tasks are learned, which results in more serious catastrophic forgetting. To address these issues, we propose reminding the incremental language model via data-free self-distillation (DFSD), which includes self-distillation based on the Earth Mover's Distance and hidden data augmentation. By estimating the knowledge distribution in all layers of GPT-2 and transforming it from the teacher model to the student model, self-distillation based on the Earth Mover's Distance can significantly reduce the demand for pseudo-data. Hidden data augmentation can greatly alleviate the catastrophic forgetting caused by these deviations by modeling the generation of pseudo-data as a hidden data augmentation process, where each sample is a mixture of all trained task data. The experimental results demonstrate that DFSD can exceed the previous state-of-the-art methods even when the amount of pseudo-data is reduced by as much as 90%.

* 8 pages, 5 figures

Multitask Prompted Training Enables Zero-Shot Task Generalization

Oct 15, 2021
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, Alexander M. Rush

Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks. It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts using varying natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models 6x its size. All prompts and trained models are available at github.com/bigscience-workshop/promptsource/.

* https://github.com/bigscience-workshop/promptsource/

Robust Glare Detection: Review, Analysis, and Dataset Release

Oct 13, 2021
Mahdi Abolfazli Esfahani, Han Wang

Sun glare is common in images captured by unmanned ground and aerial vehicles operating in outdoor environments. Such artifacts lead to incorrect feature extraction and failure of autonomous systems. Humans try to adapt their view once they observe glare (especially when driving), and this behavior is an essential requirement for the next generation of autonomous vehicles. The source of glare is not limited to the sun: glare also appears in images captured at night and in indoor environments because of the presence of different light sources, and reflective surfaces also contribute to such artifacts. The visual characteristics of glare differ across images captured by various cameras and depend on several factors, such as the camera's shutter speed and exposure level. Hence, it is challenging to introduce a general (robust and accurate) algorithm for glare detection that performs well across varied captured images. This research aims to introduce the first dataset for glare detection, which includes images captured by different cameras. In addition, the effect of multiple image representations and their combination on glare detection is examined using the proposed deep network architecture. The released dataset is available at https://github.com/maesfahani/glaredetection

Large-Scale System Identification Using a Randomized SVD

Sep 06, 2021
Han Wang, James Anderson

Learning a dynamical system from input/output data is a fundamental task in the control design pipeline. In the partially observed setting there are two components to identification: parameter estimation to learn the Markov parameters, and system realization to obtain a state space model. In both sub-problems it is implicitly assumed that standard numerical algorithms such as the singular value decomposition (SVD) can be easily and reliably computed. When trying to fit a high-dimensional model to data, for example in the cyber-physical system setting, even computing an SVD is intractable. In this work we show that an approximate matrix factorization obtained using randomized methods can replace the standard SVD in the realization algorithm while maintaining the non-asymptotic (in data-set size) performance and robustness guarantees of classical methods. Numerical examples illustrate that for large system models, this is the only method capable of producing a model.

Visual Enhanced 3D Point Cloud Reconstruction from A Single Image

Aug 17, 2021
Guiju Ping, Mahdi Abolfazli Esfahani, Han Wang

Properly solving the challenging problem of 3D object reconstruction from a single image lets existing technologies operate with a single monocular camera rather than requiring depth sensors. In recent years, thanks to the development of deep learning, 3D reconstruction from a single image has shown impressive progress. Existing work uses the Chamfer distance as a loss function to guide the training of the neural network. However, the Chamfer loss gives equal weight to all points inside the 3D point clouds. It tends to sacrifice fine-grained and thin structures to avoid incurring a high loss, which leads to visually unsatisfactory results. This paper proposes a framework that can recover a detailed three-dimensional point cloud from a single image by focusing more on boundaries (edge and corner points). Experimental results demonstrate that the proposed method outperforms existing techniques significantly, both qualitatively and quantitatively, and has fewer training parameters.
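
The equal weighting in the Chamfer loss can be seen in a minimal sketch of one common variant, the symmetric Chamfer distance with squared Euclidean nearest-neighbour terms averaged over each set. This variant, the function names, and the toy point clouds are assumptions for illustration; the paper's exact loss is not specified here.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(xs: List[Point], ys: List[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point sets."""
    # Every point contributes with the same weight, 1 / (size of its own set).
    x_to_y = sum(min(squared_distance(x, y) for y in ys) for x in xs) / len(xs)
    y_to_x = sum(min(squared_distance(y, x) for x in xs) for y in ys) / len(ys)
    return x_to_y + y_to_x


# Toy clouds: the prediction misses one target point and duplicates another.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
loss = chamfer_distance(predicted, target)
print(loss)
```

Only the missed target point contributes to the loss here, and it counts for one third of one directional term. A thin structure made of a few points is diluted in the same way when the cloud has thousands of points.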

* 8 pages

F-LOAM: Fast LiDAR Odometry And Mapping

Jul 02, 2021
Han Wang, Chen Wang, Chun-Lin Chen, Lihua Xie

Simultaneous Localization and Mapping (SLAM) has wide robotic applications such as autonomous driving and unmanned aerial vehicles. Both computational efficiency and localization accuracy are of great importance for a good SLAM system. Existing works on LiDAR-based SLAM often formulate the problem as two modules: scan-to-scan matching and scan-to-map refinement. Both modules are solved by iterative calculations, which are computationally expensive. In this paper, we propose a general solution that aims to provide a computationally efficient and accurate framework for LiDAR-based SLAM. Specifically, we adopt a non-iterative two-stage distortion compensation method to reduce the computational cost. For each scan input, the edge and planar features are extracted and matched to a local edge map and a local plane map separately, where the local smoothness is also considered for iterative pose optimization. Thorough experiments are performed to evaluate its performance in challenging scenarios, including localization for a warehouse Automated Guided Vehicle (AGV) and a public dataset on autonomous driving. The proposed method achieves competitive localization accuracy with a processing rate of more than 10 Hz in the public dataset evaluation, which provides a good trade-off between performance and computational cost for practical applications.
