Abstract:Automated Audio Captioning (AAC) generates captions for audio clips but faces challenges due to limited datasets compared to image captioning. To overcome this, we propose the zero-shot AAC system that leverages pre-trained models, eliminating the need for extensive training. Our approach uses a pre-trained audio CLIP model to extract auditory features and generate a structured prompt, which guides a Large Language Model (LLM) in caption generation. Unlike traditional greedy decoding, our method refines token selection through the audio CLIP model, ensuring alignment with the audio content. Experimental results demonstrate a 35% improvement in NLG mean score (from 4.7 to 7.3) using MAGIC search with the WavCaps model. The performance is heavily influenced by the audio-text matching model and keyword selection, with optimal results achieved using a single keyword prompt, and a 50% performance drop when no keyword list is used.
Abstract:Battery degradation is a major challenge in electric vehicles (EV) and energy storage systems (ESS). However, most degradation investigations focus mainly on estimating the state of charge (SOC), which fails to accurately interpret the cells' internal degradation mechanisms. Differential capacity analysis (DCA) focuses on the rate of change of cell voltage about the change in cell capacity, under various charge/discharge rates. This paper developed a battery cell degradation testing model that used two types of lithium-ions (Li-ion) battery cells, namely lithium nickel cobalt aluminium oxides (LiNiCoAlO2) and lithium iron phosphate (LiFePO4), to evaluate internal degradation during loading conditions. The proposed battery degradation model contains distinct charge rates (DCR) of 0.2C, 0.5C, 1C, and 1.5C, as well as discharge rates (DDR) of 0.5C, 0.9C, 1.3C, and 1.6C to analyze the internal health and performance of battery cells during slow, moderate, and fast loading conditions. Besides, this research proposed a model that incorporates the Extended Kalman Filter (EKF), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) networks to validate experimental data. The proposed model yields excellent modelling results based on mean squared error (MSE), and root mean squared error (RMSE), with errors of less than 0.001% at DCR and DDR. The peak identification technique (PIM) has been utilized to investigate battery health based on the number of peaks, peak position, peak height, peak area, and peak width. At last, the PIM method has discovered that the cell aged gradually under normal loading rates but deteriorated rapidly under fast loading conditions. Overall, LiFePO4 batteries perform more robustly and consistently than (LiNiCoAlO2) cells under varying loading conditions.