Abstract: The rapid expansion of digital commerce in Indonesia has shifted consumer interaction toward video-centric social networks, particularly YouTube. The resulting volume of unstructured, multi-contextual comments makes manual sentiment tracking impractical. This study builds a predictive model of customer satisfaction using Extreme Gradient Boosting (XGBoost) with Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. Using a secondary dataset of YouTube comments retrieved from e-commerce review videos, the raw text was preprocessed to generate normalized numerical features. The experimental results indicate that the PyCaret-optimized machine learning framework delivers robust classification performance. Beyond standard performance metrics, lexical evaluation and feature-importance mapping reveal a notable phenomenon: e-commerce discourse is heavily permeated by socio-political terminology, which in turn influences the polarity of audience satisfaction.
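As a rough illustration of the TF-IDF step named above, the following minimal sketch computes weights in pure Python. It assumes the plain variant tf = count / document length and idf = ln(N / df); the abstract does not say which TF-IDF variant or library the study used, and the function name `tfidf` and the toy comments are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = ln(n_docs / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy comments, already lowercased and tokenized.
comments = [
    ["barang", "bagus", "pengiriman", "cepat"],
    ["barang", "rusak", "pengiriman", "lambat"],
    ["barang", "bagus", "sekali"],
]
vectors = tfidf(comments)
```

Under this variant, a term that occurs in every comment ("barang" here) receives a weight of zero, while terms confined to a single comment receive the largest weights. Such vectors are the kind of numerical features a gradient-boosted classifier could consume.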
Abstract: This paper compares a PyCaret AutoML branch and a CNN-BiLSTM branch for binary hate speech detection on Indonesian Twitter using the HS label from the corpus of Ibrohim and Budi. Both branches share the same preprocessing pipeline so that the comparison reflects modelling differences rather than inconsistent data preparation. The conventional branch uses TF-IDF with a lexicon-based abusive-word count, whereas the neural branch learns dense token representations and captures both local phrase patterns and bidirectional context. The benchmark is built from the released 13,130-row annotation table, whose HS label yields a 58:42 class ratio. On the held-out split, CNN-BiLSTM achieves the best result with 83.8% accuracy, 79.8% precision, 82.7% recall, and 81.2% F1-score. Within the PyCaret branch, Random Forest is the strongest conventional model with 77.2% accuracy and 77.0% F1-score. The neural branch therefore improves accuracy by 6.6 percentage points and F1-score by 4.2 percentage points. Exploratory corpus analysis, learning curves, and confusion matrices show that the dataset consists of short texts, is moderately imbalanced, and remains difficult because many decisions depend on local lexical cues combined with short contextual composition. The study concludes that PyCaret AutoML is an effective conventional benchmarking framework, whereas CNN-BiLSTM is the stronger end model for the reported benchmark setting.
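The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the helper name `f1_score` and the dictionary layout are illustrative choices, not part of the study's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
cnn_bilstm = {"accuracy": 83.8, "precision": 79.8, "recall": 82.7, "f1": 81.2}
random_forest = {"accuracy": 77.2, "f1": 77.0}

# F1 recomputed from the reported precision and recall.
f1_check = round(f1_score(cnn_bilstm["precision"], cnn_bilstm["recall"]), 1)

# Gains of the neural branch over Random Forest, in percentage points.
accuracy_gain = round(cnn_bilstm["accuracy"] - random_forest["accuracy"], 1)
f1_gain = round(cnn_bilstm["f1"] - random_forest["f1"], 1)
```

The recomputed F1 agrees with the reported 81.2%, and the two gains match the stated 6.6 and 4.2 points.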




Abstract: Respiratory rate is a vital sign indicating various health conditions. Traditional contact-based measurement methods are often uncomfortable, and alternatives like respiratory belts and smartwatches have limitations in cost and operability. Therefore, a non-contact method based on Pixel Intensity Changes (PIC) with RGB camera images is proposed. Experiments involved 3 bounding box sizes, 3 filter options (Laplacian, Sobel, and no filter), and 2 corner detection algorithms (Shi-Tomasi and Harris), with tracking using the Lucas-Kanade algorithm. Eighteen configurations were tested on 67 subjects in static and dynamic conditions. The best results in static conditions were achieved with the medium bounding box, Sobel filter, and Harris method (MAE: 0.85, RMSE: 1.49). In dynamic conditions, the large bounding box with no filter and Shi-Tomasi, and the medium bounding box with no filter and Harris, produced the lowest MAE (0.81) and RMSE (1.35).
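The experimental grid and the two error metrics can be sketched as follows. This is a minimal illustration under stated assumptions: the label "small" for the third bounding box size is assumed (the abstract names only medium and large), the sample readings are hypothetical, and the unit (breaths per minute) is an assumption because the abstract does not state one.

```python
import math
from itertools import product

# 3 box sizes x 3 filter options x 2 corner detectors.
box_sizes = ["small", "medium", "large"]  # "small" is an assumed label
filters = ["laplacian", "sobel", "none"]
corner_detectors = ["shi_tomasi", "harris"]
configurations = list(product(box_sizes, filters, corner_detectors))


def mae(estimates, references):
    """Mean absolute error between paired readings."""
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


def rmse(estimates, references):
    """Root mean squared error between paired readings."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical respiratory rates (assumed unit: breaths per minute).
estimated = [16, 18, 15, 20]
reference = [15, 18, 17, 20]
sample_mae = mae(estimated, reference)
sample_rmse = rmse(estimated, reference)
```

The grid enumerates the eighteen configurations mentioned in the abstract. Because RMSE squares each error before averaging, it penalizes the single 2-unit miss more heavily than MAE does, which is why the reported RMSE values exceed the corresponding MAE values.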