This paper presents a comprehensive evaluation of GPT-4V's capabilities across diverse medical imaging tasks, including Radiology Report Generation, Medical Visual Question Answering (VQA), and Visual Grounding. While prior efforts have explored GPT-4V's performance in medical image analysis, to the best of our knowledge, our study represents the first quantitative evaluation on publicly available benchmarks. Our findings highlight GPT-4V's potential in generating descriptive reports for chest X-ray images, particularly when guided by well-structured prompts. Meanwhile, its performance on the MIMIC-CXR dataset benchmark reveals areas for improvement in certain evaluation metrics, such as CIDEr. In the domain of Medical VQA, GPT-4V demonstrates proficiency in distinguishing between question types but falls short of the VQA-RAD benchmark in terms of accuracy. Furthermore, our analysis finds the limitations of conventional evaluation metrics like the BLEU scores, advocating for the development of more semantically robust assessment methods. In the field of Visual Grounding, GPT-4V exhibits preliminary promise in recognizing bounding boxes, but its precision is lacking, especially in identifying specific medical organs and signs. Our evaluation underscores the significant potential of GPT-4V in the medical imaging domain, while also emphasizing the need for targeted refinements to fully unlock its capabilities.
Residential occupancy detection has become an enabling technology in today's urbanized world for various smart home applications, such as building automation, energy management, and improved security and comfort. Digitalization of the energy system provides smart meter data that can be used for occupancy detection in a non-intrusive manner without causing concerns regarding privacy and data security. In particular, deep learning techniques make it possible to infer occupancy from low-resolution smart meter data, such that the need for accurate occupancy detection with privacy preservation can be achieved. Our work is thus motivated to develop a privacy-aware and effective model for residential occupancy detection in contemporary living environments. Our model aims to leverage the advantages of both recurrent neural networks (RNNs), which are adept at capturing local temporal dependencies, and transformers, which are effective at handling global temporal dependencies. Our designed hybrid transformer-RNN model detects residential occupancy using hourly smart meter data, achieving an accuracy of nearly 92\% across households with diverse profiles. We validate the effectiveness of our method using a publicly accessible dataset and demonstrate its performance by comparing it with state-of-the-art models, including attention-based occupancy detection methods.