Armstrong Aboah

Demographics-Informed Neural Network for Multi-Modal Spatiotemporal forecasting of Urban Growth and Travel Patterns Using Satellite Imagery

Jun 14, 2025

Eugene Kofi Okrah Denteh, Andrews Danyo, Joshua Kofi Asamoah, Blessing Agyei Kyem, Armstrong Aboah

Abstract:This study presents a novel demographics-informed deep learning framework designed to forecast urban spatial transformations by jointly modeling geographic satellite imagery, socio-demographics, and travel behavior dynamics. The proposed model employs an encoder-decoder architecture with temporal gated residual connections, integrating satellite imagery and demographic data to accurately forecast future spatial transformations. The study also introduces a demographics prediction component which ensures that predicted satellite imagery is consistent with demographic features, significantly enhancing physiological realism and socioeconomic accuracy. The framework is enhanced by a proposed multi-objective loss function complemented by a semantic loss function that balances visual realism with temporal coherence. The experimental results demonstrate the superior performance of the proposed model compared to state-of-the-art models, achieving higher structural similarity (SSIM: 0.8342) and significantly improved demographic consistency (Demo-loss: 0.14 versus 0.95 and 0.96 for baseline models). Additionally, the study validates co-evolutionary theories of urban development, demonstrating quantifiable bidirectional influences between built environment characteristics and population patterns. The study also contributes a comprehensive multimodal dataset pairing satellite imagery sequences (2012-2023) with corresponding demographic and travel behavior attributes, addressing existing gaps in urban and transportation planning resources by explicitly connecting physical landscape evolution with socio-demographic patterns.

Visual Dominance and Emerging Multimodal Approaches in Distracted Driving Detection: A Review of Machine Learning Techniques

May 04, 2025

Anthony Dontoh, Stephanie Ivey, Logan Sirbaugh, Andrews Danyo, Armstrong Aboah

Abstract:Distracted driving continues to be a significant cause of road traffic injuries and fatalities worldwide, even with advancements in driver monitoring technologies. Recent developments in machine learning (ML) and deep learning (DL) have primarily focused on visual data to detect distraction, often neglecting the complex, multimodal nature of driver behavior. This systematic review assesses 74 peer-reviewed studies from 2019 to 2024 that utilize ML/DL techniques for distracted driving detection across visual, sensor-based, multimodal, and emerging modalities. The review highlights a significant prevalence of visual-only models, particularly convolutional neural networks (CNNs) and temporal architectures, which achieve high accuracy but show limited generalizability in real-world scenarios. Sensor-based and physiological models provide complementary strengths by capturing internal states and vehicle dynamics, while emerging techniques, such as auditory sensing and radio frequency (RF) methods, offer privacy-aware alternatives. Multimodal architectures consistently surpass unimodal baselines, demonstrating enhanced robustness, context awareness, and scalability by integrating diverse data streams. These findings emphasize the need to move beyond visual-only approaches and adopt multimodal systems that combine visual, physiological, and vehicular cues while keeping computational requirements in check. Future research should focus on developing lightweight, deployable multimodal frameworks, incorporating personalized baselines, and establishing cross-modality benchmarks to ensure real-world reliability in advanced driver assistance systems (ADAS) and road safety interventions.

An Improved ResNet50 Model for Predicting Pavement Condition Index (PCI) Directly from Pavement Images

Apr 25, 2025

Andrews Danyo, Anthony Dontoh, Armstrong Aboah

Abstract:Accurately predicting the Pavement Condition Index (PCI), a measure of roadway conditions, from pavement images is crucial for infrastructure maintenance. This study proposes an enhanced version of the Residual Network (ResNet50) architecture, integrated with a Convolutional Block Attention Module (CBAM), to predict PCI directly from pavement images without additional annotations. By incorporating CBAM, the model autonomously prioritizes critical features within the images, improving prediction accuracy. The enhanced ResNet50-CBAM model achieved a mean absolute percentage error (MAPE) of 58.16%, significantly lower than the 70.76% and 65.48% achieved by the baseline ResNet50 and DenseNet161 architectures, respectively. These results highlight the potential of using attention mechanisms to refine feature extraction, ultimately enabling more accurate and efficient assessments of pavement conditions. This study emphasizes the importance of targeted feature refinement in advancing automated pavement analysis through attention mechanisms.
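
For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how a mean absolute percentage error is conventionally computed. The function name `mape` and the sample PCI values are hypothetical and are not taken from the paper; the authors' evaluation code may differ.

```python
def mape(actual, predicted):
    """Return the mean absolute percentage error, in percent.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when a ground-truth value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up PCI values purely to exercise the function.
true_pci = [80.0, 50.0, 25.0]
pred_pci = [72.0, 55.0, 30.0]
print(f"MAPE: {mape(true_pci, pred_pci):.2f}%")
```

Because each error is divided by the true value, low ground-truth PCI values weigh heavily on the score, which is one reason MAPE figures can look large on this kind of task.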

Integrating Travel Behavior Forecasting and Generative Modeling for Predicting Future Urban Mobility and Spatial Transformations

Mar 27, 2025

Eugene Denteh, Andrews Danyo, Joshua Kofi Asamoah, Blessing Agyei Kyem, Twitchell Addai, Armstrong Aboah

Abstract:Transportation planning plays a critical role in shaping urban development, economic mobility, and infrastructure sustainability. However, traditional planning methods often struggle to accurately predict long-term urban growth and transportation demands. This may sometimes result in infrastructure demolition to make room for current transportation planning demands. This study integrates a Temporal Fusion Transformer to predict travel patterns from demographic data with a Generative Adversarial Network to predict future urban settings through satellite imagery. The framework achieved a 0.76 R-square score in travel behavior prediction and generated high-fidelity satellite images with a Structural Similarity Index of 0.81. The results demonstrate that integrating predictive analytics and spatial visualization can significantly improve the decision-making process, fostering more sustainable and efficient urban development. This research highlights the importance of data-driven methodologies in modern transportation planning and presents a step toward optimizing infrastructure placement, capacity, and long-term viability.

Context-CrackNet: A Context-Aware Framework for Precise Segmentation of Tiny Cracks in Pavement images

Jan 24, 2025

Blessing Agyei Kyem, Joshua Kofi Asamoah, Armstrong Aboah

Abstract:The accurate detection and segmentation of pavement distresses, particularly tiny and small cracks, are critical for early intervention and preventive maintenance in transportation infrastructure. Traditional manual inspection methods are labor-intensive and inconsistent, while existing deep learning models struggle with fine-grained segmentation and computational efficiency. To address these challenges, this study proposes Context-CrackNet, a novel encoder-decoder architecture featuring the Region-Focused Enhancement Module (RFEM) and Context-Aware Global Module (CAGM). These innovations enhance the model's ability to capture fine-grained local details and global contextual dependencies, respectively. Context-CrackNet was rigorously evaluated on ten publicly available crack segmentation datasets, covering diverse pavement distress scenarios. The model consistently outperformed 9 state-of-the-art segmentation frameworks, achieving superior performance metrics such as mIoU and Dice score, while maintaining competitive inference efficiency. Ablation studies confirmed the complementary roles of RFEM and CAGM, with notable improvements in mIoU and Dice score when both modules were integrated. Additionally, the model's balance of precision and computational efficiency highlights its potential for real-time deployment in large-scale pavement monitoring systems.
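
The mIoU and Dice scores named in this abstract are standard overlap metrics for segmentation. The sketch below shows how they are usually computed for a single binary crack mask; the function name `iou_and_dice` and the toy masks are hypothetical, and mIoU would additionally average the IoU over classes or images, a detail the abstract does not specify.

```python
def iou_and_dice(pred_mask, true_mask):
    """Compute IoU and Dice for two binary masks given as nested lists.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # crack pixel predicted correctly
            elif p and not t:
                fp += 1      # predicted crack, actually background
            elif t and not p:
                fn += 1      # missed crack pixel
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return iou, dice


pred = [[1, 1, 0],
        [0, 0, 0]]
true = [[1, 0, 0],
        [1, 0, 0]]
iou, dice = iou_and_dice(pred, true)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")
```

For thin structures such as tiny cracks, a one-pixel offset removes a large share of the overlap, so both scores are sensitive to small localization errors.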

PaveSAM Segment Anything for Pavement Distress

Sep 11, 2024

Neema Jakisa Owor, Yaw Adu-Gyamfi, Armstrong Aboah, Mark Amo-Boateng

Abstract:Automated pavement monitoring using computer vision can analyze pavement conditions more efficiently and accurately than manual methods. Accurate segmentation is essential for quantifying the severity and extent of pavement defects and, consequently, the overall condition index used for prioritizing rehabilitation and maintenance activities. Deep learning-based segmentation models are, however, often supervised and require pixel-level annotations, which can be costly and time-consuming. While the recent evolution of zero-shot segmentation models can generate pixel-wise labels for unseen classes without any training data, they struggle with irregularities of cracks and textured pavement backgrounds. This research proposes a zero-shot segmentation model, PaveSAM, that can segment pavement distresses using bounding box prompts. By retraining SAM's mask decoder with just 180 images, pavement distress segmentation is revolutionized, enabling efficient distress segmentation using bounding box prompts, a capability not found in current segmentation models. This not only drastically reduces labeling efforts and costs but also showcases our model's high performance with minimal input, establishing the pioneering use of SAM in pavement distress segmentation. Furthermore, researchers can use existing open-source pavement distress images annotated with bounding boxes to create segmentation masks, which increases the availability and diversity of segmentation pavement distress datasets.

* Road Materials and Pavement Design (2024) 1-25

Advancing Pavement Distress Detection in Developing Countries: A Novel Deep Learning Approach with Locally-Collected Datasets

Aug 10, 2024

Blessing Agyei Kyem, Eugene Kofi Okrah Denteh, Joshua Kofi Asamoah, Kenneth Adomako Tutu, Armstrong Aboah

Abstract:Road infrastructure maintenance in developing countries faces unique challenges due to resource constraints and diverse environmental factors. This study addresses the critical need for efficient, accurate, and locally-relevant pavement distress detection methods in these regions. We present a novel deep learning approach combining YOLO (You Only Look Once) object detection models with a Convolutional Block Attention Module (CBAM) to simultaneously detect and classify multiple pavement distress types. The model demonstrates robust performance in detecting and classifying potholes, longitudinal cracks, alligator cracks, and raveling, with confidence scores ranging from 0.46 to 0.93. While some misclassifications occur in complex scenarios, these provide insights into unique challenges of pavement assessment in developing countries. Additionally, we developed a web-based application for real-time distress detection from images and videos. This research advances automated pavement distress detection and provides a tailored solution for developing countries, potentially improving road safety, optimizing maintenance strategies, and contributing to sustainable transportation infrastructure development.

PaveCap: The First Multimodal Framework for Comprehensive Pavement Condition Assessment with Dense Captioning and PCI Estimation

Aug 07, 2024

Blessing Agyei Kyem, Eugene Kofi Okrah Denteh, Joshua Kofi Asamoah, Armstrong Aboah

Abstract:This research introduces the first multimodal approach for pavement condition assessment, providing both quantitative Pavement Condition Index (PCI) predictions and qualitative descriptions. We introduce PaveCap, a novel framework for automated pavement condition assessment. The framework consists of two main parts: a Single-Shot PCI Estimation Network and a Dense Captioning Network. The PCI Estimation Network uses YOLOv8 for object detection, the Segment Anything Model (SAM) for zero-shot segmentation, and a four-layer convolutional neural network to predict PCI. The Dense Captioning Network uses a YOLOv8 backbone, a Transformer encoder-decoder architecture, and a convolutional feed-forward module to generate detailed descriptions of pavement conditions. To train and evaluate these networks, we developed a pavement dataset with bounding box annotations, textual annotations, and PCI values. The results of our PCI Estimation Network showed a strong positive correlation (0.70) between predicted and actual PCIs, demonstrating its effectiveness in automating condition assessment. Also, the Dense Captioning Network produced accurate pavement condition descriptions, evidenced by high BLEU (0.7445), GLEU (0.5893), and METEOR (0.7252) scores. Additionally, the dense captioning model handled complex scenarios well, even correcting some errors in the ground truth data. The framework developed here can greatly improve infrastructure management and decision-making in pavement maintenance.
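
The abstract reports a correlation of 0.70 between predicted and actual PCI without naming the coefficient. Assuming it is a Pearson correlation (an assumption, not something the abstract confirms), a minimal sketch of the calculation looks like this; `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper; the paper may use a different coefficient.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant sequence.
        raise ValueError("inputs must not be constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up PCI values purely to exercise the function.
actual_pci = [90.0, 70.0, 55.0, 40.0, 20.0]
predicted_pci = [85.0, 60.0, 65.0, 45.0, 30.0]
print(f"r = {pearson_r(actual_pci, predicted_pci):.3f}")
```

A correlation measures how well predictions track the ranking and trend of the true values; it does not by itself show that the predicted PCI values are close in absolute terms.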

Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets

Apr 15, 2024

Dai Quoc Tran, Armstrong Aboah, Yuntae Jeon, Maged Shoman, Minsoo Park, Seunghee Park

Abstract:This study addresses the evolving challenges in urban traffic monitoring detection systems based on fisheye lens cameras by proposing a framework that improves the efficacy and accuracy of these systems. In the context of urban infrastructure and transportation management, advanced traffic monitoring systems have become critical for managing the complexities of urbanization and increasing vehicle density. Traditional monitoring methods, which rely on static cameras with narrow fields of view, are ineffective in dynamic urban environments, necessitating the installation of multiple cameras, which raises costs. Fisheye lenses, which were recently introduced, provide wide and omnidirectional coverage in a single frame, making them a transformative solution. However, issues such as distorted views and blurriness arise, preventing accurate object detection on these images. Motivated by these challenges, this study proposes a novel approach that combines a transformer-based image enhancement framework and an ensemble learning technique to improve traffic monitoring accuracy, making significant contributions to the future of intelligent traffic management systems. Our proposed methodological framework won 5th place in the 2024 AI City Challenge, Track 4, with an F1 score of 0.5965 on experimental validation data. The experimental results demonstrate the effectiveness, efficiency, and robustness of the proposed system. Our code is publicly available at https://github.com/daitranskku/AIC2024-TRACK4-TEAM15.

Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis

Apr 12, 2024

Maged Shoman, Dongdong Wang, Armstrong Aboah, Mohamed Abdel-Aty

Abstract:This paper introduces our solution for Track 2 in AI City Challenge 2024. The task aims to solve traffic safety description and analysis with the dataset of Woven Traffic Safety (WTS), a real-world Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding. Our solution mainly focuses on the following points: 1) To solve dense video captioning, we leverage the framework of dense video captioning with parallel decoding (PDVC) to model visual-language sequences and generate dense captions by chapters for video. 2) Our work leverages CLIP to extract visual features to more efficiently perform cross-modality training between visual and textual representations. 3) We conduct domain-specific model adaptation to mitigate the domain shift problem that poses recognition challenges in video understanding. 4) Moreover, we leverage BDD-5K captioned videos to conduct knowledge transfer to better understand WTS videos and caption them more accurately. Our solution has yielded on the test set, achieving 6th place in the competition. The open source code will be available at https://github.com/UCF-SST-Lab/AICity2024CVPRW
