The proliferation of fake reviews on various online platforms has created a major concern for both consumers and businesses. Such reviews can deceive customers and cause damage to the reputation of products or services, making it crucial to identify them. Although the detection of fake reviews has been extensively studied in English language, detecting fake reviews in non-English languages such as Bengali is still a relatively unexplored research area. This paper introduces the Bengali Fake Review Detection (BFRD) dataset, the first publicly available dataset for identifying fake reviews in Bengali. The dataset consists of 7710 non-fake and 1339 fake food-related reviews collected from social media posts. To convert non-Bengali words in a review, a unique pipeline has been proposed that translates English words to their corresponding Bengali meaning and also back transliterates Romanized Bengali to Bengali. We have conducted rigorous experimentation using multiple deep learning and pre-trained transformer language models to develop a reliable detection system. Finally, we propose a weighted ensemble model that combines four pre-trained transformers: BanglaBERT, BanglaBERT Base, BanglaBERT Large, and BanglaBERT Generator . According to the experiment results, the proposed ensemble model obtained a weighted F1-score of 0.9843 on 13390 reviews, including 1339 actual fake reviews and 5356 augmented fake reviews generated with the nlpaug library. The remaining 6695 reviews were randomly selected from the 7710 non-fake instances. The model achieved a 0.9558 weighted F1-score when the fake reviews were augmented using the bnaug library.
This paper investigates the potential of semi-supervised Generative Adversarial Networks (GANs) to fine-tune pretrained language models in order to classify Bengali fake reviews from real reviews with a few annotated data. With the rise of social media and e-commerce, the ability to detect fake or deceptive reviews is becoming increasingly important in order to protect consumers from being misled by false information. Any machine learning model will have trouble identifying a fake review, especially for a low resource language like Bengali. We have demonstrated that the proposed semi-supervised GAN-LM architecture (generative adversarial network on top of a pretrained language model) is a viable solution in classifying Bengali fake reviews as the experimental results suggest that even with only 1024 annotated samples, BanglaBERT with semi-supervised GAN (SSGAN) achieved an accuracy of 83.59% and a f1-score of 84.89% outperforming other pretrained language models - BanglaBERT generator, Bangla BERT Base and Bangla-Electra by almost 3%, 4% and 10% respectively in terms of accuracy. The experiments were conducted on a manually labeled food review dataset consisting of total 6014 real and fake reviews collected from various social media groups. Researchers that are experiencing difficulty recognizing not just fake reviews but other classification issues owing to a lack of labeled data may find a solution in our proposed methodology.
Crime is an unlawful act that carries legal repercussions. Bangladesh has a high crime rate due to poverty, population growth, and many other socio-economic issues. For law enforcement agencies, understanding crime patterns is essential for preventing future criminal activity. For this purpose, these agencies need structured crime database. This paper introduces a novel crime dataset that contains temporal, geographic, weather, and demographic data about 6574 crime incidents of Bangladesh. We manually gather crime news articles of a seven year time span from a daily newspaper archive. We extract basic features from these raw text. Using these basic features, we then consult standard service-providers of geo-location and weather data in order to garner these information related to the collected crime incidents. Furthermore, we collect demographic information from Bangladesh National Census data. All these information are combined that results in a standard machine learning dataset. Together, 36 features are engineered for the crime prediction task. Five supervised machine learning classification algorithms are then evaluated on this newly built dataset and satisfactory results are achieved. We also conduct exploratory analysis on various aspects the dataset. This dataset is expected to serve as the foundation for crime incidence prediction systems for Bangladesh and other countries. The findings of this study will help law enforcement agencies to forecast and contain crime as well as to ensure optimal resource allocation for crime patrol and prevention.
Numerous meta-heuristic algorithms have been influenced by nature. Over the past couple of decades, their quantity has been significantly escalating. The majority of these algorithms attempt to emulate natural biological and physical phenomena. This research concentrates on the Flower Pollination algorithm, which is one of several bio-inspired algorithms. The original approach was suggested for pollen grain exploration and exploitation in confined space using a specific global pollination and local pollination strategy. As a "swarm intelligence" meta-heuristic algorithm, its strength lies in locating the vicinity of the optimum solution rather than identifying the minimum. A modification to the original method is detailed in this work. This research found that by changing the specific value of "switch probability" with dynamic values of different dimension sizes and functions, the outcome was mainly improved over the original flower pollination method.
In modern capital market the price of a stock is often considered to be highly volatile and unpredictable because of various social, financial, political and other dynamic factors. With calculated and thoughtful investment, stock market can ensure a handsome profit with minimal capital investment, while incorrect prediction can easily bring catastrophic financial loss to the investors. This paper introduces the application of a recently introduced machine learning model - the Transformer model, to predict the future price of stocks of Dhaka Stock Exchange (DSE), the leading stock exchange in Bangladesh. The transformer model has been widely leveraged for natural language processing and computer vision tasks, but, to the best of our knowledge, has never been used for stock price prediction task at DSE. Recently the introduction of time2vec encoding to represent the time series features has made it possible to employ the transformer model for the stock price prediction. This paper concentrates on the application of transformer-based model to predict the price movement of eight specific stocks listed in DSE based on their historical daily and weekly data. Our experiments demonstrate promising results and acceptable root mean squared error on most of the stocks.