Abstract: Human-object interaction (HOI) recognition is critical for automatically analyzing student behavior in complex educational environments. Although state-of-the-art (SOTA) HOI detectors perform well on benchmark datasets, their performance often degrades when deployed in real-world training environments due to domain-specific objects, occlusions, and complex visual conditions. In this paper, we introduce a diagnosis-driven framework that integrates a triplet-level HOI error taxonomy with error-factor attribution analysis for real-world educational video data. We study this problem in the context of Critical Care Air Transport Team (CCATT) mixed-reality medical training. Based on an analysis of HOI failure modes and their causes, we develop a diagnosis-informed refinement strategy for adapting pretrained HOI models to the target domain. Experiments on the CCATT dataset show that this approach improves the macro-F1 score of a pretrained CDN model from 48.6 to 90.2 through targeted refinement guided by diagnosed error factors. These results highlight the value of detailed diagnostic analysis for informing targeted adaptation of HOI models in real-world educational environments.
Abstract: While advances in AI are driving rapid growth across many industries, the smart home industry has yet to reach its next generation. A substantial leap in innovation is still needed before a home can truly be called a smart home. A smart home should predict its residents' needs and fulfill them in a timely manner. One important task in maintaining a home is tracking groceries and replenishing supplies on time. Grocery tracking models are well established in the retail industry but nonexistent in the typical household. Detecting groceries in household refrigerators or storage closets is considerably more complicated than detecting them in retail shelving data. In this paper, we address the home grocery tracking problem by combining retail shelving data and a fruits dataset with real-time 360-degree view data collected from home grocery storage. By integrating this vision-based object detection system with supply chain and user food-interest prediction systems, complete automation of grocery ordering can be achieved.