Abstract:Regulatory affairs, which sits at the intersection of medicine and law, can benefit significantly from AI-enabled automation. Classification task is the initial step in which manufacturers position their products to regulatory authorities, and it plays a critical role in determining market access, regulatory scrutiny, and ultimately, patient safety. In this study, we investigate a broad range of AI models -- including traditional machine learning (ML) algorithms, deep learning architectures, and large language models -- using a regulatory dataset of medical device descriptions. We evaluate each model along three key dimensions: accuracy, interpretability, and computational cost.
Abstract:Accurate classification of medical device risk levels is essential for regulatory oversight and clinical safety. We present a Transformer-based multimodal framework that integrates textual descriptions and visual information to predict device regulatory classification. The model incorporates a cross-attention mechanism to capture intermodal dependencies and employs a self-training strategy for improved generalization under limited supervision. Experiments on a real-world regulatory dataset demonstrate that our approach achieves up to 90.4% accuracy and 97.9% AUROC, significantly outperforming text-only (77.2%) and image-only (54.8%) baselines. Compared to standard multimodal fusion, the self-training mechanism improved SVM performance by 3.3 percentage points in accuracy (from 87.1% to 90.4%) and 1.4 points in macro-F1, suggesting that pseudo-labeling can effectively enhance generalization under limited supervision. Ablation studies further confirm the complementary benefits of both cross-modal attention and self-training.
Abstract:Understanding the text in legal documents can be challenging due to their complex structure and the inclusion of domain-specific jargon. Laws and regulations are often crafted in such a manner that engagement with them requires formal training, potentially leading to vastly different interpretations of the same texts. Linguistic complexity is an important contributor to the difficulties experienced by readers. Simplifying texts could enhance comprehension across a broader audience, not just among trained professionals. Various metrics have been developed to measure document readability. Therefore, we adopted a systematic review approach to examine the linguistic and readability metrics currently employed for legal and regulatory texts. A total of 3566 initial papers were screened, with 34 relevant studies found and further assessed. Our primary objective was to identify which current metrics were applied for evaluating readability within the legal field. Sixteen different metrics were identified, with the Flesch-Kincaid Grade Level being the most frequently used method. The majority of studies (73.5%) were found in the domain of "informed consent forms". From the analysis, it is clear that not all legal domains are well represented in terms of readability metrics and that there is a further need to develop more consensus on which metrics should be applied for legal documents.
Abstract:Artificial intelligence (AI) in medical device software (MDSW) represents a transformative clinical technology, attracting increasing attention within both the medical community and the regulators. In this study, we leverage a data-driven approach to automatically extract and analyze AI-enabled medical devices (AIMD) from the National Medical Products Administration (NMPA) regulatory database. The continued increase in publicly available regulatory data requires scalable methods for analysis. Automation of regulatory information screening is essential to create reproducible insights that can be quickly updated in an ever changing medical device landscape. More than 4 million entries were assessed, identifying 2,174 MDSW registrations, including 531 standalone applications and 1,643 integrated within medical devices, of which 43 were AI-enabled. It was shown that the leading medical specialties utilizing AIMD include respiratory (20.5%), ophthalmology/endocrinology (12.8%), and orthopedics (10.3%). This approach greatly improves the speed of data extracting providing a greater ability to compare and contrast. This study provides the first extensive, data-driven exploration of AIMD in China, showcasing the potential of automated regulatory data analysis in understanding and advancing the landscape of AI in medical technology.
Abstract:This study investigates the complexity of regulatory affairs in the medical device industry, a critical factor influencing market access and patient care. Through qualitative research, we sought expert insights to understand the factors contributing to this complexity. The study involved semi-structured interviews with 28 professionals from medical device companies, specializing in various aspects of regulatory affairs. These interviews were analyzed using open coding and Natural Language Processing (NLP) techniques. The findings reveal key sources of complexity within the regulatory landscape, divided into five domains: (A) Regulatory language complexity, (B) Intricacies within the regulatory process, (C) Global-level complexities, (D) Database-related considerations, and (E) Product-level issues. The participants highlighted the need for strategies to streamline regulatory compliance, enhance interactions between regulatory bodies and industry players, and develop adaptable frameworks for rapid technological advancements. Emphasizing interdisciplinary collaboration and increased transparency, the study concludes that these elements are vital for establishing coherent and effective regulatory procedures in the medical device sector.