Yanlin Zhou

Regularizing Differentiable Architecture Search with Smooth Activation

Apr 22, 2025

Yanlin Zhou, Mostafa El-Khamy, Kee-Bong Song

Abstract:Differentiable Architecture Search (DARTS) is an efficient Neural Architecture Search (NAS) method but suffers from robustness, generalization, and discrepancy issues. Many efforts have been made towards the performance collapse issue caused by skip dominance with various regularization techniques towards operation weights, path weights, noise injection, and super-network redesign. It had become questionable at a certain point if there could exist a better and more elegant way to retract the search to its intended goal -- NAS is a selection problem. In this paper, we undertake a simple but effective approach, named Smooth Activation DARTS (SA-DARTS), to overcome skip dominance and discretization discrepancy challenges. By leveraging a smooth activation function on architecture weights as an auxiliary loss, our SA-DARTS mitigates the unfair advantage of weight-free operations, converging to fanned-out architecture weight values, and can recover the search process from skip-dominance initialization. Through theoretical and empirical analysis, we demonstrate that the SA-DARTS can yield new state-of-the-art (SOTA) results on NAS-Bench-201, classification, and super-resolution. Further, we show that SA-DARTS can help improve the performance of SOTA models with fewer parameters, such as Information Multi-distillation Network on the super-resolution task.

Heterogeneous Team Coordination on Partially Observable Graphs with Realistic Communication

Oct 29, 2024

Yanlin Zhou, Manshi Limbu, Xuan Wang, Daigo Shishika, Xuesu Xiao

Abstract:Team Coordination on Graphs with Risky Edges (TCGRE) is a recently proposed problem in which robots find paths to their goals while considering possible coordination to reduce overall team cost. However, TCGRE assumes that the entire environment is available to a homogeneous robot team with ubiquitous communication. In this paper, we study an extended version of TCGRE, called HPR-TCGRE, with three relaxations: Heterogeneous robots, Partial observability, and Realistic communication. To this end, we form a new combinatorial optimization problem on top of TCGRE. After analysis, we divide it into two sub-problems, one for robots moving individually and another for robots in groups, depending on their communication availability. Then, we develop an algorithm that exploits real-time partial maps to solve local shortest path(s) problems, with an A*-like sub-goal(s) assignment mechanism that explores potential coordination opportunities for global interests. Extensive experiments indicate that our algorithm is able to produce team coordination behaviors that reduce overall cost even with our three relaxations.

* 7 pages, 4 figures

A Protein Structure Prediction Approach Leveraging Transformer and CNN Integration

Mar 08, 2024

Yanlin Zhou, Kai Tan, Xinyu Shen, Zheng He, Haotian Zheng

Abstract:Proteins are essential for life, and their structure determines their function. The protein secondary structure is formed by the folding of the protein primary structure, and the protein tertiary structure is formed by the bending and folding of the secondary structure. Therefore, the study of protein secondary structure is very helpful to the overall understanding of protein structure. Although the accuracy of protein secondary structure prediction has continuously improved with the development of machine learning and deep learning, progress in the field of protein structure prediction, unfortunately, remains insufficient to meet the large demand for protein information. Therefore, based on the advantages of deep learning-based methods in feature extraction and learning ability, this paper adopts a two-dimensional fusion deep neural network model, DstruCCN, which uses Convolutional Neural Networks (CNN) and a supervised Transformer protein language model for single-sequence protein structure prediction. The training features of the two are combined to predict the protein Transformer binding site matrix, and then the three-dimensional structure is reconstructed using energy minimization.

Construction and application of artificial intelligence crowdsourcing map based on multi-track GPS data

Feb 24, 2024

Yong Wang, Yanlin Zhou, Huan Ji, Zheng He, Xinyu Shen

Abstract:In recent years, the rapid development of high-precision map technology combined with artificial intelligence has opened a new development opportunity in the field of intelligent vehicles. High-precision map technology is an important guarantee for intelligent vehicles to achieve autonomous driving. However, due to the lack of research on high-precision map technology, it is difficult to use this technology rationally in the field of intelligent vehicles. Therefore, researchers have studied fast and effective algorithms that generate high-precision GPS data by fusing a large number of low-precision GPS trajectories and that extract several key data points to simplify the description of a GPS trajectory; as a result, the "crowdsourced update" model, in which a large number of social vehicles collect map data, came into being. Such algorithms are important for improving data accuracy, reducing measurement cost, and reducing data storage space. On this basis, this paper analyzes the implementation form of the crowdsourcing map, so as to improve the various information data in the high-precision map according to the actual situation and to promote the reasonable application of high-precision maps to intelligent cars.

Open Compound Domain Adaptation with Object Style Compensation for Semantic Segmentation

Sep 28, 2023

Tingliang Feng, Hao Shi, Xueyang Liu, Wei Feng, Liang Wan, Yanlin Zhou, Di Lin

Abstract:Many methods of semantic image segmentation have borrowed the success of open compound domain adaptation. They minimize the style gap between the images of source and target domains, more easily predicting the accurate pseudo annotations for target domain's images that train segmentation network. The existing methods globally adapt the scene style of the images, whereas the object styles of different categories or instances are adapted improperly. This paper proposes the Object Style Compensation, where we construct the Object-Level Discrepancy Memory with multiple sets of discrepancy features. The discrepancy features in a set capture the style changes of the same category's object instances adapted from target to source domains. We learn the discrepancy features from the images of source and target domains, storing the discrepancy features in memory. With this memory, we select appropriate discrepancy features for compensating the style information of the object instances of various categories, adapting the object styles to a unified style of source domain. Our method enables a more accurate computation of the pseudo annotations for target domain's images, thus yielding state-of-the-art results on different datasets.

* Accepted by NeurIPS 2023

A Data-Driven Model with Hysteresis Compensation for I2RIS Robot

Mar 10, 2023

Mojtaba Esfandiari, Yanlin Zhou, Shervin Dehghani, Muhammad Hadi, Adnan Munawar, Henry Phalen, Peter Gehlbach, Russell H. Taylor, Iulian Iordachita

Abstract:Retinal microsurgery is a high-precision surgery performed on an exceedingly delicate tissue. It now requires extensively trained and highly skilled surgeons. Given the restricted range of instrument motion in the confined intraocular space, and the need to limit instrument contact with the sclera, snake-like robots may prove to be a promising technology to provide surgeons with greater flexibility, dexterity, space access, and positioning accuracy during retinal procedures requiring high precision and advantageous tooltip approach angles, such as retinal vein cannulation and epiretinal membrane peeling. Kinematics modeling of these robots is an essential step toward accurate position control; however, as opposed to conventional manipulators, modeling of these robots does not follow a straightforward method due to their complex mechanical structure and actuation mechanisms. In particular, in wire-driven snake-like robots, the hysteresis problem due to the wire tension condition can have a significant impact on the positioning accuracy. In this paper, we propose an experimental kinematics model with a hysteresis compensation algorithm using the probabilistic Gaussian mixture model (GMM) and Gaussian mixture regression (GMR) approach. Experimental results on the two-degree-of-freedom (DOF) integrated robotic intraocular snake (I2RIS) show that the proposed model provides 0.4 deg accuracy, which is an overall improvement of 60% and 70% for the yaw and pitch degrees of freedom, respectively, compared to a previous model of this robot.
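
As a rough illustration of the Gaussian mixture regression step named in this abstract, the sketch below computes the conditional mean of an output given an input from an already fitted joint GMM. It is not the authors' model: the one-dimensional input and output, the `Component` fields, and the example parameters are all assumptions made for illustration, and fitting the mixture (for example with EM) and any hysteresis-specific features are left out.

```python
import math
from typing import List, NamedTuple


class Component(NamedTuple):
    """One 2-D Gaussian over (x, y); all field names are hypothetical."""
    weight: float   # mixing coefficient of this component
    mean_x: float   # mean of the input dimension
    mean_y: float   # mean of the output dimension
    var_x: float    # variance of the input dimension
    cov_xy: float   # covariance between input and output


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_predict(x: float, components: List[Component]) -> float:
    """Return E[y | x] under the joint mixture (standard GMR conditional mean)."""
    # Unnormalized responsibility of each component for the query input.
    likelihoods = [c.weight * gaussian_pdf(x, c.mean_x, c.var_x) for c in components]
    total = sum(likelihoods)
    prediction = 0.0
    for c, lik in zip(components, likelihoods):
        # Conditional mean of y given x within a single Gaussian component.
        cond_mean = c.mean_y + c.cov_xy / c.var_x * (x - c.mean_x)
        prediction += (lik / total) * cond_mean
    return prediction


# Illustrative parameters only, not values from the paper.
single = [Component(weight=1.0, mean_x=0.0, mean_y=1.0, var_x=2.0, cov_xy=1.0)]
symmetric = [
    Component(weight=0.5, mean_x=-1.0, mean_y=-2.0, var_x=1.0, cov_xy=0.0),
    Component(weight=0.5, mean_x=1.0, mean_y=2.0, var_x=1.0, cov_xy=0.0),
]
```

With one component the prediction reduces to ordinary linear regression; with several, each component contributes its local linear fit weighted by how well it explains the query input. For queries far from every component the responsibilities can underflow to zero, which a practical implementation would need to guard against.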

Server Averaging for Federated Learning

Mar 22, 2021

George Pu, Yanlin Zhou, Dapeng Wu, Xiaolin Li

Abstract:Federated learning allows distributed devices to collectively train a model without sharing or disclosing the local dataset with a central server. The global model is optimized by training and averaging the model parameters of all local participants. However, the improved privacy of federated learning also introduces challenges including higher computation and communication costs. In particular, federated learning converges slower than centralized training. We propose the server averaging algorithm to accelerate convergence. Server averaging constructs the shared global model by periodically averaging a set of previous global models. Our experiments indicate that server averaging not only converges faster to a target accuracy than federated averaging (FedAvg), but also reduces the computation costs at the client level through epoch decay.
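
The idea of periodically averaging previous global models can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the class name `ServerAverager`, the `history_size` and `period` parameters, the unweighted client average, and the choice to keep raw (pre-averaging) global models in the history are all hypothetical, and epoch decay is omitted.

```python
from collections import deque
from typing import Deque, List

Weights = List[float]  # a model flattened to a list of parameters


def average(models: List[Weights]) -> Weights:
    """Element-wise mean of equally sized weight vectors."""
    count = len(models)
    return [sum(values) / count for values in zip(*models)]


class ServerAverager:
    """Keeps recent global models and periodically averages them (hypothetical API)."""

    def __init__(self, history_size: int, period: int) -> None:
        self.history: Deque[Weights] = deque(maxlen=history_size)
        self.period = period
        self.round = 0

    def step(self, client_models: List[Weights]) -> Weights:
        # FedAvg-style aggregation; unweighted here for simplicity.
        global_model = average(client_models)
        self.round += 1
        self.history.append(global_model)
        # Every `period` rounds, share the mean of the stored global models.
        if self.round % self.period == 0:
            global_model = average(list(self.history))
        return global_model


server = ServerAverager(history_size=2, period=2)
first = server.step([[0.0, 2.0], [2.0, 4.0]])
second = server.step([[3.0, 5.0], [5.0, 7.0]])
```

In this toy run the first round returns the plain client average, while the second round returns the mean of the two stored global models, which is the smoothing effect the abstract attributes to server averaging.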

Distilled One-Shot Federated Learning

Sep 17, 2020

Yanlin Zhou, George Pu, Xiyao Ma, Xiaolin Li, Dapeng Wu

Abstract:Current federated learning algorithms take tens of communication rounds transmitting unwieldy model weights under ideal circumstances and hundreds when data is poorly distributed. Inspired by recent work on dataset distillation and distributed one-shot learning, we propose Distilled One-Shot Federated Learning, which reduces the number of communication rounds required to train a performant model to only one. Each client distills their private dataset and sends the synthetic data (e.g. images or sentences) to the server. The distilled data look like noise and become useless after model fitting. We empirically show that, in only one round of communication, our method can achieve 96% test accuracy on federated MNIST with LeNet (centralized 99%), 81% on federated IMDB with a customized CNN (centralized 86%), and 84% on federated TREC-6 with a Bi-LSTM (centralized 89%). Using only a few rounds, DOSFL can match the centralized baseline on all three tasks. By evading the need for model-wise updates (i.e., weights, gradients, loss, etc.), the total communication cost of DOSFL is reduced by over an order of magnitude. We believe that DOSFL represents a new direction orthogonal to previous work, towards weight-less and gradient-less federated learning.

Asking Complex Questions with Multi-hop Answer-focused Reasoning

Sep 16, 2020

Xiyao Ma, Qile Zhu, Yanlin Zhou, Xiaolin Li, Dapeng Wu

Abstract:Asking questions from natural language text has attracted increasing attention recently, and several schemes have been proposed with promising results by asking the right question words and copying relevant words from the input to the question. However, most state-of-the-art methods focus on asking simple questions involving single-hop relations. In this paper, we propose a new task called multi-hop question generation that asks complex and semantically relevant questions by additionally discovering and modeling the multiple entities and their semantic relations given a collection of documents and the corresponding answer. To solve the problem, we propose multi-hop answer-focused reasoning on the grounded answer-centric entity graph to include different granularity levels of semantic information, including the word-level and document-level semantics of the entities and their semantic relations. Through extensive experiments on the HOTPOTQA dataset, we demonstrate the superiority and effectiveness of our proposed model, which serves as a baseline to motivate future work.

Improving Question Generation with Sentence-level Semantic Matching and Answer Position Inferring

Feb 03, 2020

Xiyao Ma, Qile Zhu, Yanlin Zhou, Xiaolin Li, Dapeng Wu

Abstract:Taking an answer and its context as input, sequence-to-sequence models have made considerable progress on question generation. However, we observe that these approaches often generate wrong question words or keywords and copy answer-irrelevant words from the input. We believe that a lack of global question semantics and poor exploitation of answer position-awareness are the key root causes. In this paper, we propose a neural question generation model with two concrete modules: sentence-level semantic matching and answer position inferring. Further, we enhance the initial state of the decoder by leveraging the answer-aware gated fusion mechanism. Experimental results demonstrate that our model outperforms the state-of-the-art (SOTA) models on SQuAD and MARCO datasets. Owing to its generality, our work also improves the existing models significantly.

* Revised version of paper accepted to Thirty-fourth AAAI Conference on Artificial Intelligence
