Abstract: Tooth landmark detection is a critical task in modern clinical orthodontics. The precise identification of these landmarks enables advanced diagnostics, facilitates personalized treatment strategies, and supports more effective monitoring of treatment progress in clinical dentistry. However, the task poses significant challenges due to the intricate geometry of individual teeth and the substantial anatomical variation across individuals. Addressing these complexities requires advanced techniques, particularly deep learning, for the precise and reliable detection of 3D tooth landmarks. In this context, the 3DTeethLand challenge was held in conjunction with the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) in 2024, calling for algorithms for tooth landmark detection from intraoral 3D scans. The challenge introduced the first publicly available dataset for 3D tooth landmark detection, offering a valuable resource for assessing state-of-the-art methods on this task and encouraging the community to provide methodological contributions to a problem with significant clinical implications.
Abstract: The increasing availability of intraoral scanning devices has heightened their importance in modern clinical orthodontics. Clinicians use advanced Computer-Aided Design (CAD) techniques to create patient-specific treatment plans, a process that involves laboriously identifying crucial landmarks such as cusps, mesial-distal locations, facial axis points, and tooth-gingiva boundaries. Automating the detection of these landmarks is challenging due to limited dataset sizes, significant anatomical variability among subjects, and the unstructured geometric nature of the data. We present our experiments from the 3DTeethLand Grand Challenge at MICCAI 2024. Our method leverages recent advances in point cloud learning through transformer architectures: an encoder inspired by Point Transformer v3 captures meaningful geometric and anatomical features, a lightweight decoder predicts per-point distances, and a graph-based non-minima suppression step extracts the final landmark candidates. We report promising results and discuss insights into the interpretability of the learned features.
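To make the final stage of this pipeline concrete, the sketch below shows one plausible form of graph-based non-minima suppression: assuming the decoder regresses, for each point, a distance to the nearest landmark, points whose predicted distance is a local minimum over a k-nearest-neighbor graph (and below a threshold) are kept as landmark candidates. The function name, the neighborhood size `k`, and the `max_dist` threshold are illustrative assumptions, not the exact configuration of the challenge submission.

```python
# Minimal sketch of graph-based non-minima suppression on a point cloud.
# Assumes a network has already produced a per-point distance prediction;
# k and max_dist are hypothetical values chosen for illustration.
import numpy as np
from scipy.spatial import cKDTree

def non_minima_suppression(points, pred_dist, k=16, max_dist=1.0):
    """Return indices of points whose predicted distance is a local
    minimum among their k nearest neighbors and below max_dist."""
    tree = cKDTree(points)
    # Neighbor indices; column 0 is the query point itself.
    _, nbr_idx = tree.query(points, k=k)
    neighbor_dists = pred_dist[nbr_idx]               # shape (N, k)
    is_local_min = pred_dist <= neighbor_dists.min(axis=1)
    is_candidate = pred_dist < max_dist
    return np.flatnonzero(is_local_min & is_candidate)

# Toy usage with a random point cloud and mock distance predictions.
rng = np.random.default_rng(0)
pts = rng.uniform(size=(2048, 3)).astype(np.float32)
dists = rng.uniform(0.0, 2.0, size=2048).astype(np.float32)
landmarks = non_minima_suppression(pts, dists)
print(f"{len(landmarks)} candidate landmarks")
```

In practice a clustering or merging step often follows, since nearby local minima can correspond to the same landmark; the sketch omits such refinements.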