Abstract: This paper proposes an Improved Noisy Deep Q-Network (Noisy DQN) to enhance the exploration and stability of Unmanned Aerial Vehicles (UAVs) trained with deep reinforcement learning in simulated environments. The method strengthens exploration by combining a residual NoisyLinear layer with an adaptive noise scheduling mechanism, while improving training stability through a smooth loss and soft target network updates. Experiments show that the proposed model converges faster than standard DQN, achieves rewards up to $+40$ higher, and quickly reaches the minimum number of steps required for the task (28) in the $15 \times 15$ grid navigation environment. The results show that the combined improvements to the NoisyNet network structure, exploration control, and training stability enhance the efficiency and reliability of deep Q-learning.
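To make the named components concrete, the sketch below is one plausible PyTorch reading of a factorized-Gaussian NoisyLinear layer wrapped in a skip connection ("residual NoisyLinear") together with a Polyak soft target update; it is not the authors' implementation, and the layer sizes, noise initialization, residual placement, and the reading of "smooth loss" as Huber/SmoothL1 are assumptions.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Factorized-Gaussian noisy linear layer (NoisyNet-style); an assumed variant."""
    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        self.register_buffer("eps_in", torch.zeros(in_features))
        self.register_buffer("eps_out", torch.zeros(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 * bound)
        nn.init.constant_(self.bias_sigma, sigma0 * bound)

    @staticmethod
    def _scaled_noise(size):
        x = torch.randn(size)
        return x.sign() * x.abs().sqrt()

    def reset_noise(self):
        # Resample factorized noise; an adaptive schedule could rescale sigma here.
        self.eps_in.copy_(self._scaled_noise(self.in_features))
        self.eps_out.copy_(self._scaled_noise(self.out_features))

    def forward(self, x):
        if self.training:
            weight = self.weight_mu + self.weight_sigma * torch.outer(self.eps_out, self.eps_in)
            bias = self.bias_mu + self.bias_sigma * self.eps_out
        else:
            weight, bias = self.weight_mu, self.bias_mu
        return F.linear(x, weight, bias)

class ResidualNoisyBlock(nn.Module):
    """Noisy layer with a skip connection (one reading of 'residual NoisyLinear')."""
    def __init__(self, dim):
        super().__init__()
        self.noisy = NoisyLinear(dim, dim)

    def forward(self, x):
        return x + F.relu(self.noisy(x))

def soft_update(target_net, online_net, tau=0.005):
    """Polyak-average target parameters toward the online network (soft target update)."""
    with torch.no_grad():
        for t, o in zip(target_net.parameters(), online_net.parameters()):
            t.mul_(1.0 - tau).add_(tau * o)
```

Under this reading, the TD error would be trained with a smooth (Huber) loss such as `nn.SmoothL1Loss()`, and `soft_update` would be called after each learning step instead of a hard periodic target copy.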
Abstract: Recommendation system services have become crucial for users to access personalized goods or services as the global e-commerce market expands. They can increase business sales growth and lower the cost of user information exploration. In recent years, researchers have increasingly used user reviews to address standard recommender system research issues. Reviews may, however, contain information that does not help consumers decide what to buy, such as advertising or fictitious or fake reviews, and using such reviews to offer suggestion services may reduce the effectiveness of those recommendations. In this research, an e-commerce recommendation model is developed using Passer Learning (PL) optimization applied to a Bi-LSTM (PL-optimized Bi-LSTM) to address this issue. Data is first obtained from the product recommendation dataset and pre-processed to remove missing or inconsistent values. Feature extraction is then performed using TF-IDF features and graph embedding features. These features, brought to the same dimensionality, are integrated through feature concatenation before being submitted to the Bi-LSTM classifier for analysis. The collaborative Bi-LSTM method uses these features to determine whether a product should be recommended. The main contribution of this research is the PL optimization approach, which efficiently tunes the classifier's parameters; performance is reported in terms of F1-score, MSE, precision, and recall. Compared with earlier methods, the proposed PL-optimized Bi-LSTM achieved values of 88.58%, 1.24%, 92.69%, and 92.69% for dataset 1; 88.46%, 0.48%, 92.43%, and 93.47% for dataset 2; and 92.51%, 1.58%, 91.90%, and 90.76% for dataset 3.
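As a rough illustration of the described pipeline, the PyTorch sketch below concatenates TF-IDF and graph-embedding feature views and scores them with a bidirectional LSTM; the feature dimensions, the projection to a common size, and the binary recommend/not-recommend head are assumptions, and the Passer Learning (PL) parameter optimization itself is not reproduced here.

```python
import torch
import torch.nn as nn

class BiLSTMRecommender(nn.Module):
    """Concatenates same-dimension feature views and classifies with a Bi-LSTM (assumed shapes)."""
    def __init__(self, tfidf_dim=1000, graph_dim=128, common_dim=128, hidden=64):
        super().__init__()
        # Project both feature views to a common dimensionality before concatenation.
        self.tfidf_proj = nn.Linear(tfidf_dim, common_dim)
        self.graph_proj = nn.Linear(graph_dim, common_dim)
        self.bilstm = nn.LSTM(input_size=common_dim, hidden_size=hidden,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # recommend / do not recommend

    def forward(self, tfidf_feats, graph_feats):
        # Each projected view becomes one "time step" of a length-2 sequence.
        seq = torch.stack([self.tfidf_proj(tfidf_feats),
                           self.graph_proj(graph_feats)], dim=1)
        out, _ = self.bilstm(seq)
        return torch.sigmoid(self.head(out[:, -1]))  # probability of recommending

# Usage with random stand-in features (real inputs would come from TF-IDF and
# graph-embedding extraction on the review data).
model = BiLSTMRecommender()
scores = model(torch.rand(4, 1000), torch.rand(4, 128))
print(scores.shape)  # torch.Size([4, 1])
```

In the paper's setting, the PL optimizer would tune the classifier's parameters (or hyperparameters) against the reported F1-score, MSE, precision, and recall metrics; any standard optimizer is used here only as a placeholder for that step.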