In this paper, we introduce HICH-IT, a new dataset for hypertensive intracerebral hemorrhage (HICH) that includes both electronic medical records (EMRs) and head CT images. The dataset is designed to improve the accuracy of artificial intelligence in the diagnosis and treatment of HICH. Beyond the standard text and image data, it provides annotations within the EMRs that extract key content from the text, and it categorizes the imaging annotations into four types: brain midline, hematoma, left cerebral ventricle, and right cerebral ventricle. HICH-IT aims to be a foundational dataset for feature learning in image segmentation and named entity recognition tasks. To better understand the dataset, we trained deep learning algorithms on it and observed their performance. The pretrained models have been released at both www.daip.club and github.com/Deep-AI-Application-DAIP. The dataset has been uploaded to https://github.com/CYBUS123456/HICH-IT-Datasets.
Index Terms: HICH, deep learning, intraparenchymal hemorrhage, named entity recognition, novel dataset
The fabric-based pneumatic exosuit is now a hot research topic because it is lighter and softer than traditional exoskeletons. Existing research has focused more on the mechanical properties of the exosuit (e.g., torque and speed) than on its wearability (e.g., appearance and comfort). This work presents a new design concept for fabric-based pneumatic exosuits, Volume Transfer, which means transferring the volume of pneumatic actuators that would extend beyond the garment's profile to the inside. This allows for a concealed appearance and a larger stress area while maintaining adequate torque. To verify this concept, we develop a fabric-based pneumatic exosuit for knee extension assistance. Its profile is only 26 mm, and its stress area wraps around almost half of the leg. We use a mathematical model and simulation to determine the parameters of the exosuit, avoiding multiple prototype iterations. Experimental results show that the exosuit can generate a torque of 7.6 Nm at a pressure of 90 kPa and produce a significant reduction in the electromyography activity of the knee extensor muscles. We believe that Volume Transfer could be widely adopted in future fabric-based pneumatic exosuit designs to achieve a significant improvement in wearability.
Image forensics has become increasingly important in our daily lives. Copy-move forgery is a fundamental type of image forgery, and Copy-Move Forgery Detection (CMFD) has received significant attention in the academic community. Keypoint-based algorithms, particularly those based on SIFT, have achieved good results in CMFD. However, most keypoint detection algorithms fail to generate sufficient matches when tampered patches lie in smooth areas. To tackle this problem, we introduce entropy images to determine the coordinates and scales of keypoints, which significantly increases the number of keypoints. Furthermore, we develop an entropy level clustering algorithm to avoid the increased matching complexity caused by the non-ideal distribution of grayscale values in keypoints. Experimental results demonstrate that our algorithm achieves a good balance between performance and time efficiency.
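The abstract does not say how the entropy image is constructed. As a rough illustration only, one common construction is the local Shannon entropy of grayscale values in a sliding window, sketched below. The function name `local_entropy`, the window size, and the edge handling (clamping to the border) are assumptions made for this sketch, not details taken from the paper.

```python
import math
from collections import Counter


def local_entropy(gray, win=5):
    """Return a per-pixel Shannon entropy map (in bits) of a grayscale image.

    gray: list of rows, each a list of integer intensities (e.g. 0..255).
    win:  odd window side length (assumed value, not from the paper).
    Pixels outside the image are replaced by the nearest border pixel.
    """
    h, w = len(gray), len(gray[0])
    r = win // 2
    n = win * win
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Histogram of intensities inside the window centred at (y, x).
            counts = Counter(
                gray[min(max(yy, 0), h - 1)][min(max(xx, 0), w - 1)]
                for yy in range(y - r, y + r + 1)
                for xx in range(x - r, x + r + 1)
            )
            # Shannon entropy of the empirical intensity distribution.
            out[y][x] = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return out


# A vertical step edge: the left half is 0, the right half is 255.
step = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(8)]
entropy_map = local_entropy(step, win=5)
```

In this toy image the entropy is zero inside the flat regions and rises near the edge, which shows how an entropy map separates locally uniform windows from windows with mixed intensities. How the paper derives keypoint coordinates and scales from such a map is not stated in the abstract and is not modeled here.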
This paper is not motivated to seek innovation within the attention mechanism. Instead, it focuses on overcoming the existing trade-offs between accuracy and efficiency within the context of point cloud processing, leveraging the power of scale. Drawing inspiration from recent advances in 3D large-scale representation learning, we recognize that model performance is more influenced by scale than by intricate design. Therefore, we present Point Transformer V3 (PTv3), which prioritizes simplicity and efficiency over the accuracy of certain mechanisms that are minor to the overall performance after scaling, such as replacing the precise neighbor search by KNN with an efficient serialized neighbor mapping of point clouds organized with specific patterns. This principle enables significant scaling, expanding the receptive field from 16 to 1024 points while remaining efficient (a 3x increase in processing speed and a 10x improvement in memory efficiency compared with its predecessor, PTv2). PTv3 attains state-of-the-art results on over 20 downstream tasks that span both indoor and outdoor scenarios. Further enhanced with multi-dataset joint training, PTv3 pushes these results to a higher level.
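To make the idea of a serialized neighbor mapping concrete, the sketch below orders points along a space-filling curve and then forms neighborhoods by slicing the sorted order into fixed-size patches, so no KNN search is needed. It assumes a Z-order (Morton) curve as the "specific pattern"; the grid size, bit depth, and the helper names `morton_code`, `serialize`, and `patches` are hypothetical choices for illustration, not the paper's implementation.

```python
def morton_code(ix, iy, iz, bits=10):
    """Interleave the bits of three non-negative grid indices (Z-order)."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def serialize(points, grid_size=0.5, bits=10):
    """Return point indices sorted by the Morton code of their grid cell."""
    mins = [min(p[d] for p in points) for d in range(3)]
    max_index = (1 << bits) - 1

    def key(i):
        # Quantize each coordinate to a grid cell, clamped to the bit range.
        cell = [
            min(int((points[i][d] - mins[d]) / grid_size), max_index)
            for d in range(3)
        ]
        return morton_code(cell[0], cell[1], cell[2], bits=bits)

    return sorted(range(len(points)), key=key)


def patches(order, patch_size):
    """Group the serialized order into consecutive fixed-size patches."""
    return [order[i:i + patch_size] for i in range(0, len(order), patch_size)]


points = [(0.0, 0.0, 0.0), (4.0, 4.0, 4.0), (0.6, 0.0, 0.0), (4.0, 4.0, 4.6)]
order = serialize(points, grid_size=0.5)
groups = patches(order, patch_size=2)
```

Here the two points near the origin land in one patch and the two points near (4, 4, 4) in the other. The neighborhoods are approximate: points adjacent in space can occasionally fall far apart on the curve, which is the precision that the abstract describes trading away for efficiency.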
This technical report presents AutoGen, a new framework that enables development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools. AutoGen's design offers multiple advantages: a) it gracefully navigates the strong but imperfect generation and reasoning abilities of these LLMs; b) it leverages human understanding and intelligence, while providing valuable automation through conversations between agents; c) it simplifies and unifies the implementation of complex LLM workflows as automated agent chats. We provide many diverse examples of how developers can easily use AutoGen to effectively solve tasks or build applications, in domains including coding, mathematics, operations research, entertainment, online decision-making, and question answering.
Visual feedback plays a crucial role when amputees perform grasping with a prosthesis. However, for blind and visually impaired (BVI) amputees, the loss of both visual and grasping abilities turns the "easy" reach-and-grasp task into a considerable challenge. In this paper, we propose a novel multi-sensory prosthesis system that helps BVI amputees with sensing, navigation, and grasp operations. It combines modules for voice interaction, environmental perception, grasp guidance, collaborative control, and auditory/tactile feedback. In particular, the voice interaction module receives user instructions and invokes the other functional modules accordingly. The environmental perception and grasp guidance module obtains environmental information through computer vision and feeds it back to the user through the auditory feedback modules (voice prompts and spatial sound sources) and the tactile feedback modules (vibration stimulation). The prosthesis collaborative control module obtains the context information of the grasp guidance process and, in conjunction with the user's control intention, coordinates the grasp gestures and wrist angles of the prosthesis to achieve a stable grasp of various objects. This paper details a prototype design (named viia-hand) and presents its preliminary experimental verification on healthy subjects completing specific reach-and-grasp tasks. Our results show that, with the help of our new design, the subjects were able to achieve a precise reach and a reliable grasp of the target objects in a relatively cluttered environment. Additionally, the system is extremely user-friendly, as users can quickly adapt to it with minimal training.
The prevalent use of benchmarks in current offline reinforcement learning (RL) research has led to a neglect of the imbalance of real-world dataset distributions in the development of models. Real-world offline RL datasets are often imbalanced over the state space due to the challenge of exploration or safety considerations. In this paper, we specify properties of imbalanced datasets in offline RL, where the state coverage follows a power law distribution characterized by skewed policies. Theoretically and empirically, we show that typical offline RL methods based on distributional constraints, such as conservative Q-learning (CQL), are ineffective in extracting policies from imbalanced datasets. Inspired by natural intelligence, we propose a novel offline RL method that augments CQL with a retrieval process to recall past related experiences, effectively alleviating the challenges posed by imbalanced datasets. We evaluate our method on several tasks with varying levels of dataset imbalance, using a variant of D4RL. Empirical results demonstrate the superiority of our method over other baselines.
* ICML 2023, workshop on Data-centric Machine Learning Research
Motion prediction is crucial for autonomous driving systems to understand complex driving scenarios and make informed decisions. However, this task is challenging due to the diverse behaviors of traffic participants and complex environmental contexts. In this paper, we propose Motion TRansformer (MTR) frameworks to address these challenges. The initial MTR framework utilizes a transformer encoder-decoder structure with learnable intention queries, enabling efficient and accurate prediction of future trajectories. By customizing intention queries for distinct motion modalities, MTR improves multimodal motion prediction while reducing reliance on dense goal candidates. The framework comprises two essential processes: global intention localization, which identifies the agent's intent to enhance overall efficiency, and local movement refinement, which adaptively refines predicted trajectories for improved accuracy. Moreover, we introduce an advanced MTR++ framework, extending the capability of MTR to simultaneously predict multimodal motion for multiple agents. MTR++ incorporates symmetric context modeling and mutually-guided intention querying modules to facilitate future behavior interaction among multiple agents, resulting in scene-compliant future trajectories. Extensive experimental results demonstrate that the MTR framework achieves state-of-the-art performance on highly competitive motion prediction benchmarks, while the MTR++ framework surpasses its precursor, exhibiting enhanced performance and efficiency in predicting accurate multimodal future trajectories for multiple agents.
* The winning approaches for the Waymo Motion Prediction Challenge in 2022 and 2023