Alert button
Picture for Shuhao Zhang

Shuhao Zhang

Alert button

Harnessing Scalable Transactional Stream Processing for Managing Large Language Models [Vision]

Jul 17, 2023
Shuhao Zhang, Xianzhi Zeng, Yuhao Wu, Zhonghao Yang

Large Language Models (LLMs) have demonstrated extraordinary performance across a broad array of applications, from traditional language processing tasks to interpreting structured sequences like time-series data. Yet, their effectiveness in fast-paced, online decision-making environments requiring swift, accurate, and concurrent responses poses a significant challenge. This paper introduces TStreamLLM, a revolutionary framework integrating Transactional Stream Processing (TSP) with LLM management to achieve remarkable scalability and low latency. By harnessing the scalability, consistency, and fault tolerance inherent in TSP, TStreamLLM aims to manage continuous & concurrent LLM updates and usages efficiently. We showcase its potential through practical use cases like real-time patient monitoring and intelligent traffic management. The exploration of synergies between TSP and LLM management can stimulate groundbreaking developments in AI and database research. This paper provides a comprehensive overview of challenges and opportunities in this emerging field, setting forth a roadmap for future exploration and development.

Viaarxiv icon

SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning

Mar 16, 2023
Shuhan Qi, Shuhao Zhang, Qiang Wang, Jiajia Zhang, Jing Xiao, Xuan Wang

Figure 1 for SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning
Figure 2 for SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning
Figure 3 for SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning
Figure 4 for SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning

Value-decomposition methods, which reduce the difficulty of a multi-agent system by decomposing the joint state-action space into local observation-action spaces, have become popular in cooperative multi-agent reinforcement learning (MARL). However, value-decomposition methods still have the problems of tremendous sample consumption for training and lack of active exploration. In this paper, we propose a scalable value-decomposition exploration (SVDE) method, which includes a scalable training mechanism, intrinsic reward design, and explorative experience replay. The scalable training mechanism asynchronously decouples strategy learning with environmental interaction, so as to accelerate sample generation in a MapReduce manner. For the problem of lack of exploration, an intrinsic reward design and explorative experience replay are proposed, so as to enhance exploration to produce diverse samples and filter non-novel samples, respectively. Empirically, our method achieves the best performance on almost all maps compared to other popular algorithms in a set of StarCraft II micromanagement games. A data-efficiency experiment also shows the acceleration of SVDE for sample collection and policy convergence, and we demonstrate the effectiveness of factors in SVDE through a set of ablation experiments.

* 13 pages, 9 figures 
Viaarxiv icon

MolMiner: You only look once for chemical structure recognition

May 23, 2022
Youjun Xu, Jinchuan Xiao, Chia-Han Chou, Jianhang Zhang, Jintao Zhu, Qiwan Hu, Hemin Li, Ningsheng Han, Bingyu Liu, Shuaipeng Zhang, Jinyu Han, Zhen Zhang, Shuhao Zhang, Weilin Zhang, Luhua Lai, Jianfeng Pei

Figure 1 for MolMiner: You only look once for chemical structure recognition
Figure 2 for MolMiner: You only look once for chemical structure recognition
Figure 3 for MolMiner: You only look once for chemical structure recognition
Figure 4 for MolMiner: You only look once for chemical structure recognition

Molecular structures are always depicted as 2D printed form in scientific documents like journal papers and patents. However, these 2D depictions are not machine-readable. Due to a backlog of decades and an increasing amount of these printed literature, there is a high demand for the translation of printed depictions into machine-readable formats, which is known as Optical Chemical Structure Recognition (OCSR). Most OCSR systems developed over the last three decades follow a rule-based approach where the key step of vectorization of the depiction is based on the interpretation of vectors and nodes as bonds and atoms. Here, we present a practical software MolMiner, which is primarily built up using deep neural networks originally developed for semantic segmentation and object detection to recognize atom and bond elements from documents. These recognized elements can be easily connected as a molecular graph with distance-based construction algorithm. We carefully evaluate our software on four benchmark datasets with the state-of-the-art performance. Various real application scenarios are also tested, yielding satisfactory outcomes. The free download links of Mac and Windows versions are available: Mac: https://molminer-cdn.iipharma.cn/pharma-mind/artifact/latest/mac/PharmaMind-mac-latest-setup.dmg and Windows: https://molminer-cdn.iipharma.cn/pharma-mind/artifact/latest/win/PharmaMind-win-latest-setup.exe

* 19 pages, 4 figures 
Viaarxiv icon

Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning

May 11, 2022
Shuhan Qi, Shuhao Zhang, Xiaohan Hou, Jiajia Zhang, Xuan Wang, Jing Xiao

Figure 1 for Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning
Figure 2 for Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning
Figure 3 for Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning
Figure 4 for Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers. However, due to the slow sample collection and poor sample exploration, there are still some problems in multi-agent reinforcement learning, such as unstable model iteration and low training efficiency. Moreover, most of the existing distributed framework are proposed for single-agent reinforcement learning and not suitable for multi-agent. In this paper, we design an distributed MARL framework based on the actor-work-learner architecture. In this framework, multiple asynchronous environment interaction modules can be deployed simultaneously, which greatly improves the sample collection speed and sample diversity. Meanwhile, to make full use of computing resources, we decouple the model iteration from environment interaction, and thus accelerate the policy iteration. Finally, we verified the effectiveness of propose framework in MaCA military simulation environment and the SMAC 3D realtime strategy gaming environment with imcomplete information characteristics.

* 9 pages, 20 figures 
Viaarxiv icon

Safety-guaranteed trajectory planning and control based on GP estimation for unmanned surface vessels

May 10, 2022
Shuhao Zhang, Yujia Yang, Seth Siriya, Ye Pu

Figure 1 for Safety-guaranteed trajectory planning and control based on GP estimation for unmanned surface vessels
Figure 2 for Safety-guaranteed trajectory planning and control based on GP estimation for unmanned surface vessels
Figure 3 for Safety-guaranteed trajectory planning and control based on GP estimation for unmanned surface vessels
Figure 4 for Safety-guaranteed trajectory planning and control based on GP estimation for unmanned surface vessels

We propose a safety-guaranteed planning and control framework for unmanned surface vessels (USVs), using Gaussian processes (GPs) to learn uncertainties. The uncertainties encountered by USVs, including external disturbances and model mismatches, are potentially state-dependent, time-varying, and hard to capture with constant models. GP is a powerful learning-based tool that can be integrated with a model-based planning and control framework, which employs a Hamilton-Jacobi differential game formulation. Such a combination yields less conservative trajectories and safety-guaranteeing control strategies. We demonstrate the proposed framework in simulations and experiments on a CLEARPATH Heron USV.

Viaarxiv icon

A Framework for Fast Polarity Labelling of Massive Data Streams

Mar 23, 2022
Huilin Wu, Mian Lu, Zhao Zheng, Shuhao Zhang

Figure 1 for A Framework for Fast Polarity Labelling of Massive Data Streams
Figure 2 for A Framework for Fast Polarity Labelling of Massive Data Streams
Figure 3 for A Framework for Fast Polarity Labelling of Massive Data Streams
Figure 4 for A Framework for Fast Polarity Labelling of Massive Data Streams

Many of the existing sentiment analysis techniques are based on supervised learning, and they demand the availability of valuable training datasets to train their models. When dataset freshness is critical, the annotating of high speed unlabelled data streams becomes critical but remains an open problem. In this paper, we propose PLStream, a novel Apache Flink-based framework for fast polarity labelling of massive data streams, like Twitter tweets or online product reviews. We address the associated implementation challenges and propose a list of techniques including both algorithmic improvements and system optimizations. A thorough empirical validation with two real-world workloads demonstrates that PLStream is able to generate high quality labels (almost 80% accuracy) in the presence of high-speed continuous unlabelled data streams (almost 16,000 tuples/sec) without any manual efforts.

Viaarxiv icon