Abstract:Since Internet is so popular and prevailing in human life, countering cyber threats, especially attack detection, is a challenging area of research in the field of cyber security. Intrusion detection systems (IDSs) are essential entities in a network topology aiming to safeguard the integrity and availability of sensitive assets in the protected systems. Although many supervised and unsupervised learning approaches from the field of machine learning and pattern recognition have been used to increase the efficacy of IDSs, it is still a problem to deal with lots of redundant and irrelevant features in high-dimension datasets for network anomaly detection. To this end, we propose a novel methodology combining the benefits of correlation-based feature selection(CFS) and bat algorithm(BA) with an ensemble classifier based on C4.5, Random Forest(RF), and Forest by Penalizing Attributes(Forest PA), which can be able to classify both common and rare types of attacks with high accuracy and efficiency. The experimental results, using a novel intrusion detection dataset, namely CIC-IDS2017, reveal that our CFS-BA-Ensemble method is able to contribute more critical features and significantly outperforms individual approaches, achieving high accuracy and low false alarm rate. Moreover, compared with the majority of the existing state-of-the-art and legacy techniques, our approach exhibits better performance under several classification metrics in the context of classification accuracy, f-measure, attack detection rate, and false alarm rate.