Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Dawar Khan

ClickAIXR: On-Device Multimodal Vision-Language Interaction with Real-World Objects in Extended Reality

Apr 06, 2026

Dawar Khan, Alexandre Kouyoumdjian, Xinyu Liu, Omar Mena, Dominik Engel, Ivan Viola

Abstract:We present ClickAIXR, a novel on-device framework for multimodal vision-language interaction with objects in extended reality (XR). Unlike prior systems that rely on cloud-based AI (e.g., ChatGPT) or gaze-based selection (e.g., GazePointAR), ClickAIXR integrates an on-device vision-language model (VLM) with a controller-based object selection paradigm, enabling users to precisely click on real-world objects in XR. Once selected, the object image is processed locally by the VLM to answer natural language questions through both text and speech. This object-centered interaction reduces ambiguity inherent in gaze- or voice-only interfaces and improves transparency by performing all inference on-device, addressing concerns around privacy and latency. We implemented ClickAIXR in the Magic Leap SDK (C API) with ONNX-based local VLM inference. We conducted a user study comparing ClickAIXR with Gemini 2.5 Flash and ChatGPT 5, evaluating usability, trust, and user satisfaction. Results show that latency is moderate and user experience is acceptable. Our findings demonstrate the potential of click-based object selection combined with on-device AI to advance trustworthy, privacy-preserving XR interactions. The source code and supplementary materials are available at: nanovis.org/ClickAIXR.html

Via

Access Paper or Ask Questions

Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods

Aug 06, 2024

Fazli Wahid, Yingliang Ma, Dawar Khan, Muhammad Aamir, Syed U. K. Bukhari

Figure 1 for Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods

Figure 2 for Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods

Figure 3 for Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods

Figure 4 for Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods

Abstract:Biomedical image segmentation plays a vital role in diagnosis of diseases across various organs. Deep learning-based object detection methods are commonly used for such segmentation. There exists an extensive research in this topic. However, there is no standard review on this topic. Existing surveys often lack a standardized approach or focus on broader segmentation techniques. In this paper, we conducted a systematic literature review (SLR), collected and analysed 148 articles that explore deep learning object detection methods for biomedical image segmentation. We critically analyzed these methods, identified the key challenges, and discussed the future directions. From the selected articles we extracted the results including the deep learning models, targeted imaging modalities, targeted diseases, and the metrics for the analysis of the methods. The results have been presented in tabular and/or charted forms. The results are presented in three major categories including two stage detection models, one stage detection models and point-based detection models. Each article is individually analyzed along with its pros and cons. Finally, we discuss open challenges, potential benefits, and future research directions. This SLR aims to provide the research community with a quick yet deeper understanding of these segmentation models, ultimately facilitating the development of more powerful solutions for biomedical image analysis.

Via

Access Paper or Ask Questions

Dr. KID: Direct Remeshing and K-set Isometric Decomposition for Scalable Physicalization of Organic Shapes

Apr 06, 2023

Dawar Khan, Ciril Bohak, Ivan Viola

Figure 1 for Dr. KID: Direct Remeshing and K-set Isometric Decomposition for Scalable Physicalization of Organic Shapes

Figure 2 for Dr. KID: Direct Remeshing and K-set Isometric Decomposition for Scalable Physicalization of Organic Shapes

Figure 3 for Dr. KID: Direct Remeshing and K-set Isometric Decomposition for Scalable Physicalization of Organic Shapes

Figure 4 for Dr. KID: Direct Remeshing and K-set Isometric Decomposition for Scalable Physicalization of Organic Shapes

Abstract:Dr. KID is an algorithm that uses isometric decomposition for the physicalization of potato-shaped organic models in a puzzle fashion. The algorithm begins with creating a simple, regular triangular surface mesh of organic shapes, followed by iterative k-means clustering and remeshing. For clustering, we need similarity between triangles (segments) which is defined as a distance function. The distance function maps each triangle's shape to a single point in the virtual 3D space. Thus, the distance between the triangles indicates their degree of dissimilarity. K-means clustering uses this distance and sorts of segments into k classes. After this, remeshing is applied to minimize the distance between triangles within the same cluster by making their shapes identical. Clustering and remeshing are repeated until the distance between triangles in the same cluster reaches an acceptable threshold. We adopt a curvature-aware strategy to determine the surface thickness and finalize puzzle pieces for 3D printing. Identical hinges and holes are created for assembling the puzzle components. For smoother outcomes, we use triangle subdivision along with curvature-aware clustering, generating curved triangular patches for 3D printing. Our algorithm was evaluated using various models, and the 3D-printed results were analyzed. Findings indicate that our algorithm performs reliably on target organic shapes with minimal loss of input geometry.

Via

Access Paper or Ask Questions