Sequential recommenders that are trained on implicit feedback are usually learned as a multi-class classification task through softmax-based loss functions on one-hot class labels. However, one-hot training labels are sparse and may lead to biased training and sub-optimal performance. Dense, soft labels have been shown to help improve recommendation performance. But how to generate high-quality and confident soft labels from noisy sequential interactions between users and items is still an open question. We propose a new learning framework for sequential recommenders, CSRec, which introduces confident soft labels to provide robust guidance when learning from user-item interactions. CSRec contains a teacher module that generates high-quality and confident soft labels and a student module that acts as the target recommender and is trained on the combination of dense, soft labels and sparse, one-hot labels. We propose and compare three approaches to constructing the teacher module: (i) model-level, (ii) data-level, and (iii) training-level. To evaluate the effectiveness and generalization ability of CSRec, we conduct experiments using various state-of-the-art sequential recommendation models as the target student module on four benchmark datasets. Our experimental results demonstrate that CSRec is effective in training better performing sequential recommenders.
This work proposes a safety-critical local reactive controller that enables the robot to navigate in unknown and cluttered environments. In particular, the trajectory tracking task is formulated as a constrained polynomial optimization problem. Then, safety constraints are imposed on the control variables invoking the notion of polynomial positivity certificates in conjunction with their Sum-of-Squares (SOS) approximation, thereby confining the robot motion inside the locally extracted convex free region. It is noteworthy that, in the process of devising the proposed safety constraints, the geometry of the robot can be approximated using any shape that can be characterized with a set of polynomial functions. The optimization problem is further convexified into a semidefinite program (SDP) leveraging truncated multi-sequences (tms) and moment relaxation, which favorably facilitates the effective use of off-the-shelf conic programming solvers, such that real-time performance is attainable. Various robot navigation tasks are investigated to demonstrate the effectiveness of the proposed approach in terms of safety and tracking performance.
We present MedShapeNet, a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D surgical instrument models. Prior to the deep learning era, the broad application of statistical shape models (SSMs) in medical image analysis is evidence that shapes have been commonly used to describe medical data. Nowadays, however, state-of-the-art (SOTA) deep learning algorithms in medical imaging are predominantly voxel-based. In computer vision, on the contrary, shapes (including, voxel occupancy grids, meshes, point clouds and implicit surface models) are preferred data representations in 3D, as seen from the numerous shape-related publications in premier vision conferences, such as the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), as well as the increasing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models) in computer vision research. MedShapeNet is created as an alternative to these commonly used shape benchmarks to facilitate the translation of data-driven vision algorithms to medical applications, and it extends the opportunities to adapt SOTA vision algorithms to solve critical medical problems. Besides, the majority of the medical shapes in MedShapeNet are modeled directly on the imaging data of real patients, and therefore it complements well existing shape benchmarks comprising of computer-aided design (CAD) models. MedShapeNet currently includes more than 100,000 medical shapes, and provides annotations in the form of paired data. It is therefore also a freely available repository of 3D models for extended reality (virtual reality - VR, augmented reality - AR, mixed reality - MR) and medical 3D printing. This white paper describes in detail the motivations behind MedShapeNet, the shape acquisition procedures, the use cases, as well as the usage of the online shape search portal: https://medshapenet.ikim.nrw/
Multi-modal behaviors exhibited by surrounding vehicles (SVs) can typically lead to traffic congestion and reduce the travel efficiency of autonomous vehicles (AVs) in dense traffic. This paper proposes a real-time parallel trajectory optimization method for the AV to achieve high travel efficiency in dynamic and congested environments. A spatiotemporal safety module is developed to facilitate the safe interaction between the AV and SVs in the presence of trajectory prediction errors resulting from the multi-modal behaviors of the SVs. By leveraging multiple shooting and constraint transcription, we transform the trajectory optimization problem into a nonlinear programming problem, which allows for the use of optimization solvers and parallel computing techniques to generate multiple feasible trajectories in parallel. Subsequently, these spatiotemporal trajectories are fed into a multi-objective evaluation module considering both safety and efficiency objectives, such that the optimal feasible trajectory corresponding to the optimal target lane can be selected. The proposed framework is validated through simulations in a dense and congested driving scenario with multiple uncertain SVs. The results demonstrate that our method enables the AV to safely navigate through a dense and congested traffic scenario while achieving high travel efficiency and task accuracy in real time.
Unsignalized intersections are typically considered as one of the most representative and challenging scenarios for self-driving vehicles. To tackle autonomous driving problems in such scenarios, this paper proposes a curriculum proximal policy optimization (CPPO) framework with stage-decaying clipping. By adjusting the clipping parameter during different stages of training through proximal policy optimization (PPO), the vehicle can first rapidly search for an approximate optimal policy or its neighborhood with a large parameter, and then converges to the optimal policy with a small one. Particularly, the stage-based curriculum learning technology is incorporated into the proposed framework to improve the generalization performance and further accelerate the training process. Moreover, the reward function is specially designed in view of different curriculum settings. A series of comparative experiments are conducted in intersection-crossing scenarios with bi-lane carriageways to verify the effectiveness of the proposed CPPO method. The results show that the proposed approach demonstrates better adaptiveness to different dynamic and complex environments, as well as faster training speed over baseline methods.
In this work, we propose a novel learning-based online model predictive control (MPC) framework for motion synthesis of self-driving vehicles. In this framework, the decision variables are generated as instantaneous references to modulate the cost functions of online MPC, where the constraints of collision avoidance and drivable surface boundaries are latently represented in the soft form. Hence, the embodied maneuvers of the ego vehicle are empowered to adapt to complex and dynamic traffic environments, even with unmodeled uncertainties of other traffic participants. Furthermore, we implement a deep reinforcement learning (DRL) framework for policy search to cast the step actions as the decision variables, where the practical and lightweight observations are considered as the input features of the policy network. The proposed approach is implemented in the high-fidelity simulator involving compound-complex urban driving scenarios, and the results demonstrate that the proposed development manifests remarkable adaptiveness to complex and dynamic traffic environments with a success rate of 85%. Also, its advantages in terms of safety, maneuverability, and robustness are illustrated.
Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extending the capability of LLMs. Although some works employ open-source LLMs for the tool learning task, most of them are trained in a controlled environment in which LLMs only learn to execute the human-provided tools. However, selecting proper tools from the large toolset is also a crucial ability for the tool learning model to be applied in real-world applications. Existing methods usually directly employ self-instruction methods to train the model, which ignores differences in tool complexity. In this paper, we propose the Confucius, a novel tool learning framework to train LLM to use complicated tools in real-world scenarios, which contains two main phases: (1) We first propose a multi-stage learning method to teach the LLM to use various tools from an easy-to-difficult curriculum; (2) thenceforth, we propose the Iterative Self-instruct from Introspective Feedback (ISIF) to dynamically construct the dataset to improve the ability to use the complicated tool. Extensive experiments conducted on both controlled and real-world settings demonstrate the superiority of our tool learning framework in the real-world application scenarios compared to both tuning-free (e.g. ChatGPT, Claude) and tuning-based baselines (e.g. GPT4Tools).
Model editing techniques modify a minor proportion of knowledge in Large Language Models (LLMs) at a relatively low cost, which have demonstrated notable success. Existing methods assume Transformer Layer (TL) hidden states are values of key-value memories of the Feed-Forward Network (FFN). They usually optimize the TL hidden states to memorize target knowledge and use it to update the weights of the FFN in LLMs. However, the information flow of TL hidden states comes from three parts: Multi-Head Self-Attention (MHSA), FFN, and residual connections. Existing methods neglect the fact that the TL hidden states contains information not specifically required for FFN. Consequently, the performance of model editing decreases. To achieve more precise model editing, we analyze hidden states of MHSA and FFN, finding that MHSA encodes certain general knowledge extraction patterns. This implies that MHSA weights do not require updating when new knowledge is introduced. Based on above findings, we introduce PMET, which simultaneously optimizes Transformer Component (TC, namely MHSA and FFN) hidden states, while only using the optimized TC hidden states of FFN to precisely update FFN weights. Our experiments demonstrate that PMET exhibits state-of-the-art performance on both the COUNTERFACT and zsRE datasets. Our ablation experiments substantiate the effectiveness of our enhancements, further reinforcing the finding that the MHSA encodes certain general knowledge extraction patterns and indicating its storage of a small amount of factual knowledge. Our code is available at https://github.com/xpq-tech/PMET.git.