Ang Li

Knowledge Graph and Accurate Portrait Construction of Scientific and Technological Academic Conferences

Apr 11, 2022

Runyu Yu, Zhe Xue, Ang Li

Abstract: In recent years, with the continuous progress of science and technology, the number of scientific research achievements has grown steadily, and scientific and technological academic conferences, which serve as the platform and medium for exchanging those achievements, have become increasingly numerous. Each conference brings a large number of academic papers, researchers, research institutions, and other data, and this massive volume of data makes it difficult for researchers to obtain valuable information. It is therefore of great significance to use deep learning technology to mine the core information in the data of scientific and technological academic conferences, and to build a knowledge graph and accurate portrait system for these conferences, so that researchers can obtain scientific research information faster.

* 14 pages

Research on Cross-media Science and Technology Information Data Retrieval

Apr 11, 2022

Yang Jiang, Zhe Xue, Ang Li

Abstract: Since the start of the big data era, the Internet has been flooded with all kinds of information, and browsing information online has become an integral part of people's daily life. Unlike the news data and social data on the Internet, cross-media science and technology information data has its own characteristics. This data has become an important basis for researchers and scholars to track current hot spots and explore the future direction of technology development. As the volume of science and technology information data grows, the traditional science and technology information retrieval system, which supports only unimodal data retrieval and relies on an outdated keyword-matching model, can no longer meet the daily retrieval needs of science and technology scholars. Against this background, it is of practical significance to study a cross-media science and technology information data retrieval system based on deep semantic features, which is in line with the development trend of domestic and international technologies.

* 7 pages

Accurate Portraits of Scientific Resources and Knowledge Service Components

Apr 11, 2022

Yue Wang, Zhe Xue, Ang Li

Abstract: With the advent of the cloud computing era, the cost of creating, capturing, and managing information has gradually decreased. The amount of data on the Internet is growing explosively, and more and more scientific and technological resources are being uploaded to the network. Unlike the news and social media data that are ubiquitous on the Internet, scientific and technological resources consist mainly of academic-style resources or entities such as papers, patents, authors, and research institutions. A rich network of relationships connects these resources, from which a large amount of cutting-edge scientific and technological information can be mined. Many management and classification standards exist for scientific and technological resources, but they struggle to cover all entities and associations of these resources and cannot accurately extract the important information the resources contain. How to construct a complete and accurate representation of scientific and technological resources from structured and unstructured reports and texts in the network, and how to tap the potential value of these resources, is an urgent problem. The solution is to construct accurate portraits of scientific and technological resources using knowledge-graph-related technologies.

* 9 pages

Information-theoretic Online Memory Selection for Continual Learning

Apr 10, 2022

Shengyang Sun, Daniele Calandriello, Huiyi Hu, Ang Li, Michalis Titsias

Abstract: A challenging problem in task-free continual learning is the online selection of a representative replay memory from data streams. In this work, we investigate the online memory selection problem from an information-theoretic perspective. To gather the most information, we propose the surprise and the learnability criteria to pick informative points and to avoid outliers. We present a Bayesian model to compute the criteria efficiently by exploiting rank-one matrix structures. We demonstrate that these criteria encourage selecting informative points in a greedy algorithm for online memory selection. Furthermore, by identifying the importance of the timing to update the memory, we introduce a stochastic information-theoretic reservoir sampler (InfoRS), which conducts sampling among selective points with high information. Compared to reservoir sampling, InfoRS demonstrates improved robustness against data imbalance. Finally, empirical results on continual learning benchmarks demonstrate its efficiency and efficacy.

* ICLR 2022
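
For orientation, the plain reservoir sampling baseline that the abstract compares against can be sketched as follows. This is a minimal sketch of the classic uniform reservoir sampler only, not of InfoRS; the function name `reservoir_sample` and its parameters are illustrative choices, not taken from the paper.

```python
import random


def reservoir_sample(stream, capacity, rng):
    """Keep a uniform random sample of `capacity` items from a stream.

    Classic reservoir sampling: the first `capacity` items fill the
    memory; afterwards the item at position t (0-based) replaces a random
    slot with probability capacity / (t + 1).
    """
    memory = []
    for t, item in enumerate(stream):
        if t < capacity:
            memory.append(item)
        else:
            j = rng.randint(0, t)  # inclusive on both ends
            if j < capacity:
                memory[j] = item
    return memory


if __name__ == "__main__":
    rng = random.Random(0)
    print(reservoir_sample(range(1000), 10, rng))
```

Because every stream item is retained with the same probability, a class that is rare in the stream is also rare in the memory, which is the data-imbalance sensitivity the abstract refers to.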

FastMapSVM: Classifying Complex Objects Using the FastMap Algorithm and Support-Vector Machines

Apr 07, 2022

Malcolm C. A. White, Kushal Sharma, Ang Li, T. K. Satish Kumar, Nori Nakata

Abstract: Neural Networks and related Deep Learning methods are currently at the leading edge of technologies used for classifying objects. However, they generally demand large amounts of time and data for model training, and their learned models can sometimes be difficult to interpret. In this paper, we present FastMapSVM, a novel interpretable Machine Learning framework for classifying complex objects. FastMapSVM combines the strengths of FastMap and Support-Vector Machines. FastMap is an efficient linear-time algorithm that maps complex objects to points in a Euclidean space, while preserving pairwise non-Euclidean distances between them. We demonstrate the efficiency and effectiveness of FastMapSVM in the context of classifying seismograms. We show that its performance, in terms of precision, recall, and accuracy, is comparable to that of other state-of-the-art methods. However, compared to other methods, FastMapSVM uses significantly smaller amounts of time and data for model training. It also provides a perspicuous visualization of the objects and the classification boundaries between them. We expect FastMapSVM to be viable for classification tasks in many other real-world domains.

* 27 pages, 12 figures
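
The embedding step described in the abstract can be illustrated with the classic FastMap recurrence: pick two far-apart pivot objects, project every object onto the line through them using only pairwise distances, then repeat on the residual distances. The sketch below assumes this textbook formulation; whether the paper uses exactly this variant or pivot heuristic is an assumption, and the names `fastmap`, `dist`, and `k` are illustrative.

```python
import math


def fastmap(objects, dist, k):
    """Embed `objects` into k-dimensional Euclidean space.

    `dist(a, b)` is any symmetric, non-negative distance function.
    Returns one list of k coordinates per object.
    """
    n = len(objects)
    coords = [[0.0] * k for _ in range(n)]

    def residual_sq(i, j, dim):
        # Squared distance left over after the first `dim` coordinates
        # have been accounted for (clamped at zero for non-Euclidean input).
        d2 = dist(objects[i], objects[j]) ** 2
        for c in range(dim):
            d2 -= (coords[i][c] - coords[j][c]) ** 2
        return max(d2, 0.0)

    for dim in range(k):
        # Pivot heuristic: farthest object from index 0, then the
        # farthest object from that one.
        b = max(range(n), key=lambda j: residual_sq(0, j, dim))
        a = max(range(n), key=lambda j: residual_sq(b, j, dim))
        d_ab_sq = residual_sq(a, b, dim)
        if d_ab_sq == 0.0:
            break  # all residual distances are zero; nothing left to embed
        d_ab = math.sqrt(d_ab_sq)
        for i in range(n):
            # Cosine-law projection of object i onto the pivot line.
            coords[i][dim] = (
                residual_sq(a, i, dim) + d_ab_sq - residual_sq(b, i, dim)
            ) / (2.0 * d_ab)
    return coords


if __name__ == "__main__":
    points = [(0, 0), (3, 0), (0, 4), (3, 4)]
    for row in fastmap(points, math.dist, 2):
        print(row)
```

The resulting coordinate vectors could then be passed to any standard SVM implementation; that second stage is not shown here.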

Scientific and Technological Text Knowledge Extraction Method of based on Word Mixing and GRU

Mar 31, 2022

Suyu Ouyang, Yingxia Shao, Junping Du, Ang Li

Abstract: The knowledge extraction task is to extract triple relations (head entity-relation-tail entity) from unstructured text data. Existing knowledge extraction methods fall into two groups: the "pipeline" method and the joint extraction method. The "pipeline" method separates named entity recognition from entity relationship extraction and handles each with its own module. Although this method is more flexible, its training speed is slow. Joint extraction uses an end-to-end neural network model that performs entity recognition and relationship extraction at the same time, which preserves the association between entities and relationships well and converts the joint extraction of entities and relationships into a sequence annotation problem. In this paper, we propose a knowledge extraction method for scientific and technological resources based on word mixture and GRU, combined with a word mixture vector mapping method and a self-attention mechanism, to effectively improve text relationship extraction for Chinese scientific and technological resources.

* 8 pages, 2 figures

Towards Collaborative Intelligence: Routability Estimation based on Decentralized Private Data

Mar 30, 2022

Jingyu Pan, Chen-Chia Chang, Zhiyao Xie, Ang Li, Minxue Tang, Tunhou Zhang, Jiang Hu, Yiran Chen

Abstract: Applying machine learning (ML) in the design flow is a popular trend in EDA, with applications ranging from design quality prediction to optimization. Despite its promise, which has been demonstrated in both academic research and industrial tools, its effectiveness largely hinges on the availability of a large amount of high-quality training data. In reality, EDA developers have very limited access to the latest design data, which is owned by design companies and mostly confidential. Although one can commission ML model training to a design company, the data of a single company might still be inadequate or biased, especially for small companies. This data availability problem is becoming the limiting constraint on the future growth of ML for chip design. In this work, we propose a federated-learning-based approach for well-studied ML applications in EDA. Our approach allows an ML model to be collaboratively trained with data from multiple clients without explicit access to that data, thereby respecting the clients' data privacy. To further strengthen the results, we co-design a customized ML model, FLNet, and its personalization under the decentralized training scenario. Experiments on a comprehensive dataset show that collaborative training improves accuracy by 11% compared with individual local models, and our customized model FLNet significantly outperforms the best of previous routability estimators in this collaborative training flow.

* 6 pages, 2 figures, accepted by DAC'22
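
The collaborative training idea can be illustrated with the aggregation step of federated averaging (FedAvg), in which a server combines client model parameters weighted by local dataset size while raw data never leaves the clients. The abstract does not say which aggregation rule the paper uses, so FedAvg here is an assumption chosen for illustration, and `federated_average` is a hypothetical helper.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of training samples held by each client.
    Only parameters are shared; the clients' raw data stays local.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 samples respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a full training loop this aggregation would alternate with rounds of local training on each client.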

BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling

Mar 26, 2022

Cheng Wan, Youjie Li, Ang Li, Nam Sung Kim, Yingyan Lin

Abstract: Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art method for graph-based learning tasks. However, training GCNs at scale is still challenging, hindering both the exploration of more sophisticated GCN architectures and their applications to real-world large graphs. While it might be natural to consider graph partition and distributed training for tackling this challenge, previous works have only scratched the surface of this direction due to the limitations of existing designs. In this work, we first analyze why distributed GCN training is ineffective and identify the underlying cause to be the excessive number of boundary nodes of each partitioned subgraph, which easily explodes the memory and communication costs for GCN training. Furthermore, we propose a simple yet effective method dubbed BNS-GCN that adopts random Boundary-Node-Sampling to enable efficient and scalable distributed GCN training. Experiments and ablation studies consistently validate the effectiveness of BNS-GCN, e.g., boosting the throughput by up to 16.2x and reducing the memory usage by up to 58%, while maintaining full-graph accuracy. Furthermore, both theoretical and empirical analyses show that BNS-GCN enjoys better convergence than existing sampling-based methods. We believe that our BNS-GCN has opened up a new paradigm for enabling GCN training at scale. The code is available at https://github.com/RICE-EIC/BNS-GCN.

* MLSys 2022
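
The notion of boundary nodes and their random sampling can be sketched on a toy partitioned graph. The sketch assumes that a partition's boundary nodes are the nodes owned by other partitions that are adjacent to one of its own nodes, and that each is kept independently with probability `p`; the function names and the edge-list representation are hypothetical and do not come from the linked repository.

```python
import random


def boundary_nodes(edges, part_of, part):
    """Return nodes outside `part` that are adjacent to a node inside it."""
    boundary = set()
    for u, v in edges:
        if part_of[u] == part and part_of[v] != part:
            boundary.add(v)
        elif part_of[v] == part and part_of[u] != part:
            boundary.add(u)
    return boundary


def sample_boundary(boundary, p, rng):
    """Keep each boundary node independently with probability p."""
    return {node for node in sorted(boundary) if rng.random() < p}


if __name__ == "__main__":
    # A 4-cycle split into two partitions of two nodes each.
    edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
    part_of = {0: 0, 1: 0, 2: 1, 3: 1}
    b = boundary_nodes(edges, part_of, 0)
    print(b, sample_boundary(b, 0.5, random.Random(0)))
```

Only the sampled boundary nodes would have their features communicated to the partition in a given step, which is where the memory and communication savings mentioned in the abstract would come from under this reading.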

Block-Level Interference Exploitation Precoding without Symbol-by-Symbol Optimization

Mar 23, 2022

Ang Li, Chao Shen, Xuewen Liao, Christos Masouros, A. Lee Swindlehurst

Abstract: Symbol-level precoding (SLP) based on the concept of constructive interference (CI) has been shown to be superior to traditional block-level precoding (BLP), but at the cost of a symbol-by-symbol optimization during the precoding design. In this paper, we propose a CI-based block-level precoding (CI-BLP) scheme for the downlink transmission of a multi-user multiple-input single-output (MU-MISO) communication system, where we design a constant precoding matrix for a block of symbol slots to exploit CI for each symbol slot simultaneously. A single optimization problem is formulated to maximize the minimum CI effect over the entire block, thus reducing the computational cost of traditional SLP, as the optimization problem only needs to be solved once per block. By leveraging the Karush-Kuhn-Tucker (KKT) conditions and the dual problem formulation, the original optimization problem is finally shown to be equivalent to a quadratic programming (QP) problem over a simplex. Numerical results validate our derivations and exhibit superior performance for the proposed CI-BLP scheme over traditional BLP and SLP methods, thanks to the relaxed block-level power constraint.

* arXiv admin note: substantial text overlap with arXiv:2202.09830

Academic Resource Text Level Multi-label Classification based on Attention

Mar 21, 2022

Yue Wang, Yawen Li, Ang Li

Abstract: Hierarchical multi-label academic text classification (HMTC) assigns academic texts to a hierarchically structured labeling system. We propose an attention-based hierarchical multi-label classification algorithm of academic texts (AHMCA) that integrates features such as text, keywords, and hierarchical structure to classify academic documents into the most relevant categories. We utilize word2vec and BiLSTM to obtain embedding and latent vector representations of text, keywords, and hierarchies. We use a hierarchical attention mechanism to capture the associations between keywords, label hierarchies, and text word vectors, generating hierarchy-specific document embedding vectors that replace the original text embeddings in HMCN-F. The experimental results on the academic text dataset demonstrate the effectiveness of the AHMCA algorithm.
