The utilization of long contexts poses a big challenge for large language models due to their limited context window length. Although the context window can be extended through fine-tuning, it will result in a considerable cost at both training and inference time, and exert an unfavorable impact to the LLM's original capabilities. In this work, we propose Activation Beacon, which condenses LLM's raw activations into more compact forms such that it can perceive a much longer context with a limited context window. Activation Beacon is introduced as a plug-and-play module for the LLM. It fully preserves the LLM's original capability on short contexts while extending the new capability on processing longer contexts. Besides, it works with short sliding windows to process the long context, which achieves a competitive memory and time efficiency in both training and inference. Activation Beacon is learned by the auto-regression task conditioned on a mixture of beacons with diversified condensing ratios. Thanks to such a treatment, it can be efficiently trained purely with short-sequence data in just 10K steps, which consumes less than 9 hours on a single 8xA800 GPU machine. The experimental studies show that Activation Beacon is able to extend Llama-2-7B's context length by $\times100$ times (from 4K to 400K), meanwhile achieving a superior result on both long-context generation and understanding tasks. Our model and code will be available at the BGE repository.
Fixed-wing aerial vehicles provide an efficient way to navigate long distances or cover large areas for environmental monitoring applications. By design, they also require large open spaces due to limited maneuverability. However, strict regulatory and safety altitude limits constrain the available space. Especially in complex, confined, or steep terrain, ensuring the vehicle does not enter an inevitable collision state(ICS) can be challenging. In this work, we propose a strategy to find safe paths that do not enter an ICS while navigating within tight altitude constraints. The method uses periodic paths to efficiently classify ICSs. A sampling-based planner creates collision-free and kinematically feasible paths that begin and end in safe periodic (circular) paths. We show that, in realistic terrain, using circular periodic paths can simplify the goal selection process by making it yaw agnostic and constraining yaw. We demonstrate our approach by dynamically planning safe paths in real-time while navigating steep terrain on a flight test in complex alpine terrain.
This paper describes an architecture for predicting the price of cryptocurrencies for the next seven days using the Adaptive Network Based Fuzzy Inference System (ANFIS). Historical data of cryptocurrencies and indexes that are considered are Bitcoin (BTC), Ethereum (ETH), Bitcoin Dominance (BTC.D), and Ethereum Dominance (ETH.D) in a daily timeframe. The methods used to teach the data are hybrid and backpropagation algorithms, as well as grid partition, subtractive clustering, and Fuzzy C-means clustering (FCM) algorithms, which are used in data clustering. The architectural performance designed in this paper has been compared with different inputs and neural network models in terms of statistical evaluation criteria. Finally, the proposed method can predict the price of digital currencies in a short time.
This paper focuses on the classification task of breast ultrasound images and researches on the reliability measurement of classification results. We proposed a dual-channel evaluation framework based on the proposed inference reliability and predictive reliability scores. For the inference reliability evaluation, human-aligned and doctor-agreed inference rationales based on the improved feature attribution algorithm SP-RISA are gracefully applied. Uncertainty quantification is used to evaluate the predictive reliability via the Test Time Enhancement. The effectiveness of this reliability evaluation framework has been verified on our breast ultrasound clinical dataset YBUS, and its robustness is verified on the public dataset BUSI. The expected calibration errors on both datasets are significantly lower than traditional evaluation methods, which proves the effectiveness of our proposed reliability measurement.
Understanding the multifaceted effects of climate change across diverse geographic locations is crucial for timely adaptation and the development of effective mitigation strategies. As the volume of scientific literature on this topic continues to grow exponentially, manually reviewing these documents has become an immensely challenging task. Utilizing Natural Language Processing (NLP) techniques to analyze this wealth of information presents an efficient and scalable solution. By gathering extensive amounts of peer-reviewed articles and studies, we can extract and process critical information about the effects of climate change in specific regions. We employ BERT (Bidirectional Encoder Representations from Transformers) for Named Entity Recognition (NER), which enables us to efficiently identify specific geographies within the climate literature. This, in turn, facilitates location-specific analyses. We conduct region-specific climate trend analyses to pinpoint the predominant themes or concerns related to climate change within a particular area, trace the temporal progression of these identified issues, and evaluate their frequency, severity, and potential development over time. These in-depth examinations of location-specific climate data enable the creation of more customized policy-making, adaptation, and mitigation strategies, addressing each region's unique challenges and providing more effective solutions rooted in data-driven insights. This approach, founded on a thorough exploration of scientific texts, offers actionable insights to a wide range of stakeholders, from policymakers to engineers to environmentalists. By proactively understanding these impacts, societies are better positioned to prepare, allocate resources wisely, and design tailored strategies to cope with future climate conditions, ensuring a more resilient future for all.
Real-time and accurate information on fine-grained changes in crop cultivation is of great significance for crop growth monitoring, yield prediction and agricultural structure adjustment. Aiming at the problems of serious spectral confusion in visible high-resolution unmanned aerial vehicle (UAV) images of different phases, interference of large complex background and salt-and-pepper noise by existing semantic change detection (SCD) algorithms, in order to effectively extract deep image features of crops and meet the demand of agricultural practical engineering applications, this paper designs and proposes an agricultural geographic scene and parcel-scale constrained SCD framework for crops (AGSPNet). AGSPNet framework contains three parts: agricultural geographic scene (AGS) division module, parcel edge extraction module and crop SCD module. Meanwhile, we produce and introduce an UAV image SCD dataset (CSCD) dedicated to agricultural monitoring, encompassing multiple semantic variation types of crops in complex geographical scene. We conduct comparative experiments and accuracy evaluations in two test areas of this dataset, and the results show that the crop SCD results of AGSPNet consistently outperform other deep learning SCD models in terms of quantity and quality, with the evaluation metrics F1-score, kappa, OA, and mIoU obtaining improvements of 0.038, 0.021, 0.011 and 0.062, respectively, on average over the sub-optimal method. The method proposed in this paper can clearly detect the fine-grained change information of crop types in complex scenes, which can provide scientific and technical support for smart agriculture monitoring and management, food policy formulation and food security assurance.
Accurate and efficient extraction of microstructures in microscopic images of materials plays a critical role in the exploration of structure-property relationships and the optimization of process parameters. Deep learning-based image segmentation techniques that rely on manual annotation are time-consuming and labor-intensive and hardly meet the demand for model transferability and generalization. Segment Anything Model (SAM), a large visual model with powerful deep feature representation and zero-shot generalization capabilities, has provided new solutions for image segmentation. However, directly applying SAM to segmenting microstructures in microscopic images of materials without human annotation cannot achieve the expected results, as the difficulty of adapting its native prompt engineering to the dense and dispersed characteristics of key microstructures in materials microscopy images. In this paper, we propose MatSAM, a general and efficient microstructure extraction solution based on SAM. A new point-based prompts generation strategy is designed, grounded on the distribution and shape of materials microstructures. It generates prompts for different microscopic images, fuses the prompts of the region of interest (ROI) key points and grid key points, and integrates post-processing methods for quantitative characterization of materials microstructures. For common microstructures including grain boundary and phase, MatSAM achieves superior segmentation performance to conventional methods and is even preferable to supervised learning methods evaluated on 18 materials microstructures imaged by the optical microscope (OM) and scanning electron microscope (SEM). We believe that MatSAM can significantly reduce the cost of quantitative characterization of materials microstructures and accelerate the design of new materials.
Diversity schemes play a vital role in improving the performance of ultra-reliable communication systems by transmitting over two or more communication channels to combat fading and co-channel interference. Determining an appropriate transmission strategy that satisfies ultra-reliability constraint necessitates derivation of statistics of channel in ultra-reliable region and, subsequently, integration of these statistics into rate selection while incorporating a confidence interval to account for potential uncertainties that may arise during estimation. In this paper, we propose a novel framework for ultra-reliable real-time transmission considering both spatial diversities and ultra-reliable channel statistics based on multivariate extreme value theory. First, tail distribution of joint received power sequences obtained from different receivers is modeled while incorporating inter-relations of extreme events occurring rarely based on Poisson point process approach in MEVT. The optimum transmission strategies are then developed by determining optimum transmission rate based on estimated joint tail distribution and incorporating confidence intervals into estimations to cope with the availability of limited data. Finally, system reliability is assessed by utilizing outage probability metric. Through analysis of data obtained from the engine compartment of Fiat Linea, our study showcases the effectiveness of proposed methodology in surpassing traditional extrapolation-based approaches. This innovative method not only achieves a higher transmission rate, but also effectively addresses stringent requirements of ultra-reliability. The findings indicate that proposed rate selection framework offers a viable solution for achieving a desired target error probability by employing a higher transmission rate and reducing the amount of training data compared to conventional rate selection methods.
The emergence of on-demand ride pooling services allows each vehicle to serve multiple passengers at a time, thus increasing drivers' income and enabling passengers to travel at lower prices than taxi/car on-demand services (only one passenger can be assigned to a car at a time like UberX and Lyft). Although on-demand ride pooling services can bring so many benefits, ride pooling services need a well-defined matching strategy to maximize the benefits for all parties (passengers, drivers, aggregation companies and environment), in which the regional dispatching of vehicles has a significant impact on the matching and revenue. Existing algorithms often only consider revenue maximization, which makes it difficult for requests with unusual distribution to get a ride. How to increase revenue while ensuring a reasonable assignment of requests brings a challenge to ride pooling service companies (aggregation companies). In this paper, we propose a framework for vehicle dispatching for ride pooling tasks, which splits the city into discrete dispatching regions and uses the reinforcement learning (RL) algorithm to dispatch vehicles in these regions. We also consider the mutual information (MI) between vehicle and order distribution as the intrinsic reward of the RL algorithm to improve the correlation between their distributions, thus ensuring the possibility of getting a ride for unusually distributed requests. In experimental results on a real-world taxi dataset, we demonstrate that our framework can significantly increase revenue up to an average of 3\% over the existing best on-demand ride pooling method.
We present CineMPC, a complete cinematographic system that autonomously controls a drone to film multiple targets recording user-specified aesthetic objectives. Existing solutions in autonomous cinematography control only the camera extrinsics, namely its position, and orientation. In contrast, CineMPC is the first solution that includes the camera intrinsic parameters in the control loop, which are essential tools for controlling cinematographic effects like focus, depth-of-field, and zoom. The system estimates the relative poses between the targets and the camera from an RGB-D image and optimizes a trajectory for the extrinsic and intrinsic camera parameters to film the artistic and technical requirements specified by the user. The drone and the camera are controlled in a nonlinear Model Predicted Control (MPC) loop by re-optimizing the trajectory at each time step in response to current conditions in the scene. The perception system of CineMPC can track the targets' position and orientation despite the camera effects. Experiments in a photorealistic simulation and with a real platform demonstrate the capabilities of the system to achieve a full array of cinematographic effects that are not possible without the control of the intrinsics of the camera. Code for CineMPC is implemented following a modular architecture in ROS and released to the community.