Abstract:Bridge infrastructure inspection is a critical but labor-intensive task requiring expert assessment of structural damage such as rebar exposure, cracking, and corrosion. This paper presents a comprehensive study of quantized Vision-Language Models (VLMs) for automated bridge damage assessment, focusing on the trade-offs between description quality, inference speed, and resource requirements. We develop an end-to-end pipeline combining LLaVA-1.5-7B for visual damage analysis, structured JSON extraction, and rule-based priority scoring. To enable deployment on consumer-grade GPUs, we conduct a systematic comparison of three quantization levels (Q4_K_M, Q5_K_M, and Q8_0) across 254 rebar exposure images. We introduce a 5-point quality evaluation framework assessing damage type recognition and severity classification. Our results demonstrate that Q5_K_M achieves the best balance: quality score 3.18$\pm$1.35/5.0, inference time 5.67s/image, and 0.56 quality/sec efficiency -- 8.5% higher quality than Q4_K_M with only 4.5% speed reduction, while matching Q8_0's quality with 25% faster inference. Statistical analysis reveals that Q5_K_M exhibits the weakest correlation between text length and quality (-0.148), indicating consistent performance regardless of description length.
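The efficiency figure quoted in this abstract appears to be the mean quality score divided by the mean inference time per image. The minimal sketch below reproduces that arithmetic from the reported Q5_K_M numbers; the function name `quality_per_second` is a hypothetical helper, not something taken from the paper.

```python
def quality_per_second(quality_score: float, seconds_per_image: float) -> float:
    """Return quality points delivered per second of inference.

    Assumed definition: mean quality score (0-5 scale) divided by
    mean inference time per image in seconds.
    """
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return quality_score / seconds_per_image


# Values reported in the abstract for Q5_K_M.
q5_efficiency = quality_per_second(3.18, 5.67)
print(round(q5_efficiency, 2))  # 0.56
```

The same helper can be applied to the other quantization levels once their quality scores and timings are available.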
Abstract:This paper presents a systematic methodology for building domain-specific Japanese small language models using QLoRA fine-tuning. We address three core questions: optimal training scale, base-model selection, and architecture-aware quantization. Stage 1 (Training scale): Scale-learning experiments (1k--5k samples) identify n=4,000 as optimal, where test-set NLL reaches minimum (1.127) before overfitting at 5k samples. Stage 2 (Compare finetuned SLMs): Comparing four Japanese LLMs shows that Llama-3 models with Japanese continual pre-training (Swallow-8B, ELYZA-JP-8B) outperform multilingual models (Qwen2.5-7B). Stage 3 (Quantization): Llama-3 architectures improve under Q4_K_M quantization, while GQA architectures degrade severely (Qwen2.5: -0.280 points). Production recommendation: Swallow-8B Q4_K_M achieves 2.830/3 score, 8.9 s/question, 4.9 GB size. The methodology generalizes to low-resource technical domains and provides actionable guidance for compact Japanese specialist LMs on consumer hardware.
Abstract:Bridge periodic inspection records contain sensitive information about public infrastructure, making cross-organizational data sharing impractical under existing data governance constraints. We propose a federated framework for estimating a Continuous-Time Markov Chain (CTMC) hazard model of bridge deterioration, enabling municipalities to collaboratively train a shared benchmark model without transferring raw inspection records. Each User holds local inspection data and trains a log-linear hazard model over three deterioration-direction transitions -- Good$\to$Minor, Good$\to$Severe, and Minor$\to$Severe -- with covariates for bridge age, coastline distance, and deck area. Local optimization is performed via mini-batch stochastic gradient descent on the CTMC log-likelihood, and only a 12-dimensional pseudo-gradient vector is uploaded to a central server per communication round. The server aggregates User updates using sample-weighted Federated Averaging (FedAvg) with momentum and gradient clipping. All experiments in this paper are conducted on fully synthetic data generated from a known ground-truth parameter set with region-specific heterogeneity, enabling controlled evaluation of federated convergence behaviour. Simulation results across heterogeneous Users show consistent convergence of the average negative log-likelihood, with the aggregated gradient norm decreasing as User scale increases. Furthermore, the federated update mechanism provides a natural participation incentive: Users who register their local inspection datasets on a shared technical-standard platform receive in return the periodically updated global benchmark parameters -- information that cannot be obtained from local data alone -- thereby enabling evidence-based life-cycle planning without surrendering data sovereignty.
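The server-side step described in this abstract (sample-weighted FedAvg with momentum and gradient clipping over a 12-dimensional pseudo-gradient) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the pseudo-gradient is treated as a descent-direction gradient, clipping is applied to the aggregated vector rather than per User, and the names `clip_by_norm`, `fedavg_round`, `lr`, `momentum`, and `max_norm` are hypothetical. The 12 dimensions are consistent with three transitions times an intercept plus three covariates, which is also an assumption.

```python
import math
from typing import List, Sequence, Tuple


def clip_by_norm(vec: Sequence[float], max_norm: float) -> List[float]:
    """Rescale vec so that its Euclidean norm does not exceed max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0 or norm <= max_norm:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def fedavg_round(
    global_params: Sequence[float],
    velocity: Sequence[float],
    updates: Sequence[Tuple[int, Sequence[float]]],
    lr: float = 1.0,
    momentum: float = 0.9,
    max_norm: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """One communication round of sample-weighted FedAvg.

    updates holds one (n_samples, pseudo_gradient) pair per User.
    Returns the new global parameters and the new momentum buffer.
    """
    total = sum(n for n, _ in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(global_params)

    # Sample-weighted average of the uploaded pseudo-gradients.
    aggregated = [0.0] * dim
    for n, grad in updates:
        weight = n / total
        for i in range(dim):
            aggregated[i] += weight * grad[i]

    # Assumption: clipping is applied to the aggregated vector.
    aggregated = clip_by_norm(aggregated, max_norm)

    # Server momentum, followed by a descent step.
    new_velocity = [momentum * v + g for v, g in zip(velocity, aggregated)]
    new_params = [p - lr * v for p, v in zip(global_params, new_velocity)]
    return new_params, new_velocity


# Two hypothetical Users with 100 and 300 samples.
params, vel = fedavg_round(
    [0.0] * 12, [0.0] * 12,
    [(100, [1.0] * 12), (300, [0.0] * 12)],
    max_norm=10.0,
)
```

Only the pseudo-gradients and sample counts reach the server in this sketch, which matches the abstract's point that raw inspection records stay local.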
Abstract:In predictive maintenance of equipment, deep learning-based time series anomaly detection has garnered significant attention; however, pure deep learning approaches often fail to achieve sufficient accuracy on real-world data. This study proposes a hybrid approach that integrates 64-dimensional time series embeddings from Granite TinyTimeMixer with 28-dimensional statistical features based on domain knowledge for HVAC equipment anomaly prediction tasks. Specifically, we combine time series embeddings extracted from a Granite TinyTimeMixer encoder fine-tuned with LoRA (Low-Rank Adaptation) and 28 types of statistical features including trend, volatility, and drawdown indicators, which are then learned using a LightGBM gradient boosting classifier. In experiments using 64 equipment units and 51,564 samples, we achieved Precision of 91--95\% and ROC-AUC of 0.995 for anomaly prediction at 30-day, 60-day, and 90-day horizons. Furthermore, we achieved production-ready performance with a false positive rate of 1.1\% or less and a detection rate of 88--94\%, demonstrating the effectiveness of the system for predictive maintenance applications. This work demonstrates that practical anomaly detection systems can be realized by leveraging the complementary strengths between deep learning's representation learning capabilities and statistical feature engineering.
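This abstract names trend, volatility, and drawdown indicators among its 28 statistical features without defining them. The sketch below shows one plausible definition of each for a single sensor series; the formulas and the function names `trend_slope`, `volatility`, and `max_drawdown` are assumptions for illustration, not the paper's feature set.

```python
import statistics
from typing import List, Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of the series against its sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def volatility(values: Sequence[float]) -> float:
    """Population standard deviation of the first differences."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if not diffs:
        raise ValueError("need at least two samples")
    return statistics.pstdev(diffs)


def max_drawdown(values: Sequence[float]) -> float:
    """Largest drop from a running peak to any later value."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, peak - v)
    return worst


def statistical_features(values: Sequence[float]) -> List[float]:
    """Three hand-crafted features for one series."""
    return [trend_slope(values), volatility(values), max_drawdown(values)]
```

In the hybrid approach described above, a vector like this (extended to 28 entries) would be concatenated with the 64-dimensional embedding before being passed to the gradient boosting classifier.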
Abstract:In Japan, civil infrastructure condition monitoring is mandated through visual inspection every five years. Field-captured damage images frequently contain concrete cracks and rebar exposure, often accompanied by construction signs revealing regional information. To enable safe infrastructure use without causing public anxiety, it is essential to protect regional information while accurately extracting damage features and visualizing key indicators for repair decision-making. This paper presents an open-source bridge damage detection system with regional privacy protection capabilities. We employ Segment Anything Model (SAM) 3 for rebar corrosion detection and utilize DBSCAN for automatic completion of missed regions. Construction sign regions are detected and protected through Gaussian blur. Four preprocessing methods improve OCR accuracy, and GPU optimization enables 1.7-second processing per image. The technology stack includes SAM3, PyTorch, OpenCV, pytesseract, and scikit-learn, achieving efficient bridge inspection with regional information protection.




Abstract:In regenerative medicine research, we experimentally design the composition of the chemical medium. We add different components to 384-well plates and culture the biological cells. We monitor the condition of the cells and take time-lapse bioimages for morphological assays. In particular, precipitation can appear as artefacts in the image and introduce noise into the imaging assay. Inspecting precipitates is a tedious task for the observer, and differences in experience can lead to variations in judgement from person to person. A machine learning approach removes the burden of human inspection and provides consistent inspection. In addition, precipitation features are as small as 10-20 {\mu}m. Resizing a 1200-pixel-square well image with a resolution of 2.82 {\mu}m/pixel would reduce these already small precipitation features. Dividing the well images into 240-pixel squares and learning without resizing preserves the resolution of the original image. In this study, we developed an application to automatically detect precipitation on 384-well plates utilising optical microscope images. We apply MN-pair contrastive clustering to extract precipitation classes from approximately 20,000 patch images. To detect precipitation features, we compare deeper FCDD detectors with optional backbones and build a machine learning pipeline that detects precipitation from the maximum score of quadruplet well images using the Isolation Forest algorithm, where the anomaly score ranges from zero to one. Furthermore, using this application we can visualise an in situ precipitation heatmap on a 384-well plate.
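The patching step in this abstract can be made concrete with a small sketch: a 1200-pixel-square well image divided into non-overlapping 240-pixel squares gives a 5 x 5 grid of 25 patches, and a 10-20 {\mu}m precipitate at 2.82 {\mu}m/pixel spans only a few pixels. The function names `tile_boxes` and `feature_size_px` are hypothetical, and non-overlapping tiling is an assumption.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(image_size: int = 1200, patch_size: int = 240) -> List[Box]:
    """Return crop boxes that tile a square image without overlap."""
    if image_size % patch_size != 0:
        raise ValueError("patch_size must divide image_size evenly")
    steps = range(0, image_size, patch_size)
    return [(x, y, x + patch_size, y + patch_size) for y in steps for x in steps]


def feature_size_px(size_um: float, um_per_px: float = 2.82) -> float:
    """Convert a physical feature size in micrometres to pixels."""
    return size_um / um_per_px


boxes = tile_boxes()
print(len(boxes))                     # 25
print(round(feature_size_px(10), 1))  # 3.5
print(round(feature_size_px(20), 1))  # 7.1
```

Because features are only about 3.5 to 7 pixels wide at native resolution, any downscaling of the full well image risks erasing them, which is why the patches are cropped rather than resized.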
Abstract:Over the past decade, balanced datasets have been used to advance algorithms for classification, object detection, semantic segmentation, and anomaly detection in industrial applications. Specifically, for condition-based maintenance, automating visual inspection is crucial to ensure high quality. Deterioration prognostics attempt to optimize the fine decision process for predictive maintenance and proactive repair. In civil infrastructure and living environments, damage data mining cannot avoid the imbalanced data issue because of rare, unseen events and the high-quality status achieved through improved operations. For visual inspection, the deteriorated classes acquired from the surfaces of concrete and steel components are occasionally imbalanced. From numerous related surveys, we summarize that imbalanced data problems can be categorized into four types: 1) missing range of target and label variables, 2) majority-minority class imbalance, 3) spatial imbalance between foreground and background, and 4) pixel-wise imbalance of long-tailed classes. Since 2015, there have been many studies on imbalanced data using deep learning approaches, including regression, image classification, object detection, and semantic segmentation. However, anomaly detection for imbalanced data is not yet well known. In this study, we highlight a one-class anomaly detection application that determines whether an input belongs to the anomalous class, and demonstrate clear examples on imbalanced vision datasets: blood smear, lung infection, hazardous driving, wooden, concrete deterioration, river sludge, and disaster damage. As illustrated in Fig. 1, we provide key results on the advantage of damage vision mining, hypothesizing that the more effective the range of the positive ratio, the higher the accuracy gain of the anomaly detection application. In our imbalanced studies, compared with the balanced case of positive ratio 1/1, we find that there is an applicable positive ratio at which the accuracy is consistently high.
Abstract:Extreme natural disasters can have devastating effects on both urban and rural areas. In any disaster event, an initial response is the key to rescue within 72 hours and prompt recovery. During the initial stage of disaster response, it is important to quickly assess the damage over a wide area and identify priority areas. Among machine learning algorithms, deep anomaly detection is effective in detecting devastation features that are different from everyday features. In addition, explainable computer vision applications should justify the initial responses. In this paper, we propose an anomaly detection application utilizing deeper fully convolutional data descriptions (FCDDs) that enables the localization of devastation features and visualization of damage-marked heatmaps. More specifically, we show numerous training and test results for the AIDER dataset with four disaster categories: collapsed buildings, traffic incidents, fires, and flooded areas. We also implement ablation studies of anomalous class imbalance and the data scale competing against the normal class. Our experiments yield high accuracy, with F1 scores over 95%. Furthermore, we found that the deeper FCDD with a VGG16 backbone consistently outperformed the other baselines: CNN27, ResNet101, and Inceptionv3. This study presents a new solution that offers a disaster anomaly detection application for initial responses with higher accuracy and devastation explainability, providing a novel contribution to the prompt disaster recovery problem in the research area of anomaly scene understanding. Finally, we discuss future work toward more robust, explainable applications for effective initial responses.
Abstract:Maintaining high standards for user safety during daily railway operations is crucial for railway managers. To aid in this endeavor, top- or side-view cameras and GPS positioning systems have facilitated progress toward automating periodic inspections of defective features and assessing the deteriorating status of railway components. However, collecting data on deteriorated status can be time-consuming and requires repeated data acquisition because of the extreme temporal occurrence imbalance. In supervised learning, thousands of paired data sets containing defective raw images and annotated labels are required. In contrast, the one-class classification approach offers the advantage of requiring fewer images to optimize parameters for training normal and anomalous features. The deeper fully-convolutional data descriptions (FCDDs) have been applied to several damage data sets of concrete/steel components in structures, as well as fallen trees and wooden building collapse in disasters. However, their feasibility for railway components is not yet known. In this study, we devised a prognostic discriminator pipeline to automate one-class damage classification using the deeper FCDDs for defective railway components. We also performed sensitivity analysis of the deeper backbone and receptive field based on convolutional neural networks (CNNs). Furthermore, we visualized defective railway features by using transposed Gaussian upsampling. We demonstrated our application to railway inspection using a video acquisition dataset of railway track in forward view that contains wooden sleeper deterioration in rural railways. Finally, we examined the usability of our approach for prognostic monitoring and future work on railway component inspection.
Abstract:It is important for infrastructure managers to maintain a high standard to ensure user satisfaction during the lifecycle of infrastructure. Surveillance cameras and visual inspections have enabled progress toward automating the detection of anomalous features and assessing the occurrence of deterioration. Frequently, collecting damage data requires time-consuming and repeated inspections. The one-class damage detection approach has the merit that normal images alone enable us to optimize the parameters. At the same time, the visual explanation using a heat map enables us to understand the localized anomalous features. We propose a prototype to automate one-class damage detection using the fully-convolutional data description (FCDD). We also visualize the explanation of the damage features using an up-sampling-based activation map with Gaussian up-sampling from the receptive field of the fully convolutional network (FCN). We demonstrate it in experimental studies on concrete damage and steel corrosion, and discuss its usefulness and future work.