Abstract:Generating accurate SQL for user queries (text-to-SQL) is a long-standing problem since the generation of the SQL requires comprehending the query and database and retrivale the accurate data from the database accordingly. Existing models rely on the comprehensive ability of Large Language Models (LLMs) to generate the SQL according to the database schema. However, there is some necessary knowledge that is not explicitly included in the database schema or has been learned by LLMs. Thus, the generated SQL of the knowledge-insufficient queries may be inaccurate, which negatively impacts the robustness of the text-to-SQL models. To deal with this situation, we propose the Knowledge-to-SQL framework, which employs tailored Data Expert LLM (DELLM) to provide helpful knowledge for all types of text-to-SQL models. Specifically, we provide the detailed design of DELLM, in terms of table reading, and the basic fine-tuning process. We further provide a Reinforcement Learning via Database Feedback (RLDBF) training strategy to guide the DELLM to generate more helpful knowledge for LLMs. Extensive experiments verify DELLM can enhance the state-of-the-art LLMs on text-to-SQL tasks. The model structure and the parameter weight of DELLM are released for further research.
Abstract:Representing the information of multiple behaviors in the single graph collaborative filtering (CF) vector has been a long-standing challenge. This is because different behaviors naturally form separate behavior graphs and learn separate CF embeddings. Existing models merge the separate embeddings by appointing the CF embeddings for some behaviors as the primary embedding and utilizing other auxiliaries to enhance the primary embedding. However, this approach often results in the joint embedding performing well on the main tasks but poorly on the auxiliary ones. To address the problem arising from the separate behavior graphs, we propose the concept of Partial Order Graphs (POG). POG defines the partial order relation of multiple behaviors and models behavior combinations as weighted edges to merge separate behavior graphs into a joint POG. Theoretical proof verifies that POG can be generalized to any given set of multiple behaviors. Based on POG, we propose the tailored Partial Order Graph Convolutional Networks (POGCN) that convolute neighbors' information while considering the behavior relations between users and items. POGCN also introduces a partial-order BPR sampling strategy for efficient and effective multiple-behavior CF training. POGCN has been successfully deployed on the homepage of Alibaba for two months, providing recommendation services for over one billion users. Extensive offline experiments conducted on three public benchmark datasets demonstrate that POGCN outperforms state-of-the-art multi-behavior baselines across all types of behaviors. Furthermore, online A/B tests confirm the superiority of POGCN in billion-scale recommender systems.
Abstract:Predicting Click-Through Rate (CTR) in billion-scale recommender systems poses a long-standing challenge for Graph Neural Networks (GNNs) due to the overwhelming computational complexity involved in aggregating billions of neighbors. To tackle this, GNN-based CTR models usually sample hundreds of neighbors out of the billions to facilitate efficient online recommendations. However, sampling only a small portion of neighbors results in a severe sampling bias and the failure to encompass the full spectrum of user or item behavioral patterns. To address this challenge, we name the conventional user-item recommendation graph as "micro recommendation graph" and introduce a more suitable MAcro Recommendation Graph (MAG) for billion-scale recommendations. MAG resolves the computational complexity problems in the infrastructure by reducing the node count from billions to hundreds. Specifically, MAG groups micro nodes (users and items) with similar behavior patterns to form macro nodes. Subsequently, we introduce tailored Macro Graph Neural Networks (MacGNN) to aggregate information on a macro level and revise the embeddings of macro nodes. MacGNN has already served Taobao's homepage feed for two months, providing recommendations for over one billion users. Extensive offline experiments on three public benchmark datasets and an industrial dataset present that MacGNN significantly outperforms twelve CTR baselines while remaining computationally efficient. Besides, online A/B tests confirm MacGNN's superiority in billion-scale recommender systems.
Abstract:This work presents an innovative solution for robotic odometry, path planning and exploration in wild unknown environments, focusing on digital modelling. The approach uses a minimum cost formulation with pseudo-randomly generated objectives, integrating multi-path planning and evaluation, with emphasis on full coverage of unknown maps based on feasible boundaries of interest. The evaluation carried out on a robotic platform with a lightweight 3D LiDAR sensor model, assesses the consistency and efficiency in exploring completely unknown subterranean-like areas. The algorithm allows for dynamic changes to the desired target and behaviour. At the same time, the paper details the design of AREX, highlighting its robust localisation, mapping and efficient exploration target selection capabilities, with a focus on continuity in exploration direction for increased efficiency and reduced odometry errors. The real-time, high-precision environmental perception module is identified as critical for accurate obstacle avoidance and exploration boundary identification.
Abstract:Generative Large Language Models (LLMs), such as ChatGPT, offer interactive APIs that can answer common questions at a human-expert level. However, these models often give inaccurate or incorrect responses when faced with questions requiring domain-specific or professional-specific knowledge not covered in their training corpus. Furthermore, many state-of-the-art LLMs are not open-source, making it challenging to inject knowledge with model APIs only. In this work, we introduce KnowGPT, a black-box knowledge injection framework for LLMs in question answering. KnowGPT leverages deep reinforcement learning (RL) to extract relevant knowledge from Knowledge Graphs (KGs) and use Multi-Armed Bandit (MAB) to construct the most suitable prompt for each question. Our extensive experiments on three benchmark datasets showcase that KnowGPT significantly enhances the existing methods. Notably, KnowGPT achieves an average improvement of 23.7% over ChatGPT and an average improvement of 2.9% over GPT-4. Additionally, KnowGPT attains a 91.6% accuracy on the OpenbookQA official leaderboard, which is comparable to human-level performance.
Abstract:In real Mobility-on-Demand (MoD) systems, demand is subject to high and dynamic volatility, which is difficult to predict by conventional time-series forecasting approaches. Most existing forecasting approaches yield the point value as the prediction result, which ignores the uncertainty that exists in the forecasting result. This will lead to the forecasting result severely deviating from the true demand value due to the high volatility existing in demand. To fill the gap, we propose an extended recurrent mixture density network (XRMDN), which extends the weight and mean neural networks to recurrent neural networks. The recurrent neurons for mean and variance can capture the trend of the historical data-series data, which enables a better forecasting result in dynamic and high volatility. We conduct comprehensive experiments on one taxi trip record and one bike-sharing real MoD data set to validate the performance of XRMDN. Specifically, we compare our model to three types of benchmark models, including statistical, machine learning, and deep learning models on three evaluation metrics. The validation results show that XRMDN outperforms the three groups of benchmark models in terms of the evaluation metrics. Most importantly, XRMDN substantially improves the forecasting accuracy with the demands in strong volatility. Last but not least, this probabilistic demand forecasting model contributes not only to the demand prediction in MoD systems but also to other optimization application problems, especially optimization under uncertainty, in MoD applications.
Abstract:Graph neural networks (GNNs) have shown prominent performance on attributed network embedding. However, existing efforts mainly focus on exploiting network structures, while the exploitation of node attributes is rather limited as they only serve as node features at the initial layer. This simple strategy impedes the potential of node attributes in augmenting node connections, leading to limited receptive field for inactive nodes with few or even no neighbors. Furthermore, the training objectives (i.e., reconstructing network structures) of most GNNs also do not include node attributes, although studies have shown that reconstructing node attributes is beneficial. Thus, it is encouraging to deeply involve node attributes in the key components of GNNs, including graph convolution operations and training objectives. However, this is a nontrivial task since an appropriate way of integration is required to maintain the merits of GNNs. To bridge the gap, in this paper, we propose COllaborative graph Neural Networks--CONN, a tailored GNN architecture for attribute network embedding. It improves model capacity by 1) selectively diffusing messages from neighboring nodes and involved attribute categories, and 2) jointly reconstructing node-to-node and node-to-attribute-category interactions via cross-correlation. Experiments on real-world networks demonstrate that CONN excels state-of-the-art embedding algorithms with a great margin.
Abstract:Recommender systems play a crucial role in helping users discover information that aligns with their interests based on their past behaviors. However, developing personalized recommendation systems becomes challenging when historical records of user-item interactions are unavailable, leading to what is known as the system cold-start recommendation problem. This issue is particularly prominent in start-up businesses or platforms with insufficient user engagement history. Previous studies focus on user or item cold-start scenarios, where systems could make recommendations for new users or items but are still trained with historical user-item interactions in the same domain, which cannot solve our problem. To bridge the gap, our research introduces an innovative and effective approach, capitalizing on the capabilities of pre-trained language models. We transform the recommendation process into sentiment analysis of natural languages containing information of user profiles and item attributes, where the sentiment polarity is predicted with prompt learning. By harnessing the extensive knowledge housed within language models, the prediction can be made without historical user-item interaction records. A benchmark is also introduced to evaluate the proposed method under the cold-start setting, and the results demonstrate the effectiveness of our method. To the best of our knowledge, this is the first study to tackle the system cold-start recommendation problem. The benchmark and implementation of the method are available at https://github.com/JacksonWuxs/PromptRec.
Abstract:Graph anomaly detection (GAD) is a vital task since even a few anomalies can pose huge threats to benign users. Recent semi-supervised GAD methods, which can effectively leverage the available labels as prior knowledge, have achieved superior performances than unsupervised methods. In practice, people usually need to identify anomalies on new (sub)graphs to secure their business, but they may lack labels to train an effective detection model. One natural idea is to directly adopt a trained GAD model to the new (sub)graph for testing. However, we find that existing semi-supervised GAD methods suffer from poor generalization issue, i.e., well-trained models could not perform well on an unseen area (i.e., not accessible in training) of the same graph. It may cause great troubles. In this paper, we base on the phenomenon and propose a general and novel research problem of generalized graph anomaly detection that aims to effectively identify anomalies on both the training-domain graph and unseen testing graph to eliminate potential dangers. Nevertheless, it is a challenging task since only limited labels are available, and the normal background may differ between training and testing data. Accordingly, we propose a data augmentation method named \textit{AugAN} (\uline{Aug}mentation for \uline{A}nomaly and \uline{N}ormal distributions) to enrich training data and boost the generalizability of GAD models. Experiments verify the effectiveness of our method in improving model generalizability.
Abstract:Accurate segmentation of large areas from very high spatial-resolution (VHR) remote sensing imagery remains a challenging issue in image analysis. Existing supervised and unsupervised methods both suffer from the large variance of object sizes and the difficulty in scale selection, which often result in poor segmentation accuracies. To address the above challenges, we propose a deep learning-based region-merging method (DeepMerge) to handle the segmentation in large VHR images by integrating a Transformer with a multi-level embedding module, a segment-based feature embedding module and a region-adjacency graph model. In addition, we propose a modified binary tree sampling method to generate multi-level inputs from initial segmentation results, serving as inputs for the DeepMerge model. To our best knowledge, the proposed method is the first to use deep learning to learn the similarity between adjacent segments for region-merging. The proposed DeepMerge method is validated using a remote sensing image of 0.55m resolution covering an area of 5,660 km^2 acquired from Google Earth. The experimental results show that the proposed DeepMerge with the highest F value (0.9446) and the lowest TE (0.0962) and ED2 (0.8989) is able to correctly segment objects of different sizes and outperforms all selected competing segmentation methods from both quantitative and qualitative assessments.