Shing Shin Cheng

MrTrack: Register Mamba for Needle Tracking with Rapid Reciprocating Motion during Ultrasound-Guided Aspiration Biopsy

May 14, 2025

Yuelin Zhang, Qingpeng Ding, Long Lei, Yongxuan Feng, Raymond Shing-Yan Tang, Shing Shin Cheng

Abstract:Ultrasound-guided fine needle aspiration (FNA) biopsy is a common minimally invasive diagnostic procedure. However, an aspiration needle tracker addressing rapid reciprocating motion is still missing. MrTrack, an aspiration needle tracker with a Mamba-based register mechanism, is proposed. MrTrack leverages a Mamba-based register extractor to sequentially distill global context from each historical search map, storing these temporal cues in a register bank. The Mamba-based register retriever then retrieves temporal prompts from the register bank to provide external cues when current vision features are temporarily unusable due to rapid reciprocating motion and imaging degradation. A self-supervised register diversify loss is proposed to encourage feature diversity and dimension independence within the learned register, mitigating feature collapse. Comprehensive experiments conducted on both motorized and manual aspiration datasets demonstrate that MrTrack not only outperforms state-of-the-art trackers in accuracy and robustness but also achieves superior inference efficiency.

* Early Accepted by MICCAI 2025

Theoretical Data-Driven MobilePosenet: Lightweight Neural Network for Accurate Calibration-Free 5-DOF Magnet Localization

Jan 06, 2025

Wenxuan Xie, Yuelin Zhang, Jiwei Shan, Hongzhe Sun, Jiewen Tan, Shing Shin Cheng

Abstract:Permanent magnet tracking using the external sensor array is crucial for the accurate localization of wireless capsule endoscope robots. Traditional tracking algorithms, based on the magnetic dipole model and Levenberg-Marquardt (LM) algorithm, face challenges related to computational delays and the need for initial position estimation. More recently proposed neural network-based approaches often require extensive hardware calibration and real-world data collection, which are time-consuming and labor-intensive. To address these challenges, we propose MobilePosenet, a lightweight neural network architecture that leverages depthwise separable convolutions to minimize computational cost and a channel attention mechanism to enhance localization accuracy. Besides, the inputs to the network integrate the sensors' coordinate information and random noise, compensating for the discrepancies between the theoretical model and the actual magnetic fields and thus allowing MobilePosenet to be trained entirely on theoretical data. Experimental evaluations conducted in a $90 \times 90 \times 80$ mm workspace demonstrate that MobilePosenet exhibits excellent 5-DOF localization accuracy ($1.54 \pm 1.03$ mm and $2.24 \pm 1.84^{\circ}$) and inference speed (0.9 ms) against state-of-the-art methods trained on real-world data. Since network training relies solely on theoretical data, MobilePosenet can eliminate the hardware calibration and real-world data collection process, improving the generalizability of this permanent magnet localization method and the potential for rapid adoption in different clinical settings.

* 9 pages, 5 figures
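
The abstract above describes training on theoretical data derived from the magnetic dipole model, with sensor coordinates and random noise included in the network inputs. A minimal sketch of how such a synthetic sample could be generated is given below. It uses the standard point-dipole field formula in SI units; the function names, the Gaussian noise model, and the feature layout (sensor coordinates followed by noisy field components) are assumptions for illustration and are not taken from the paper.

```python
import math
import random

MU0_OVER_4PI = 1e-7  # mu_0 / (4*pi) in T*m/A


def dipole_field(sensor_pos, magnet_pos, moment):
    """Field (tesla) of a point dipole at sensor_pos.

    Positions are in metres, moment in A*m^2.
    B = mu0/(4*pi) * (3*r*(m.r)/|r|^5 - m/|r|^3), with r = sensor - magnet.
    """
    r = [s - p for s, p in zip(sensor_pos, magnet_pos)]
    dist = math.sqrt(sum(c * c for c in r))
    if dist == 0.0:
        raise ValueError("sensor and magnet positions coincide")
    m_dot_r = sum(m * c for m, c in zip(moment, r))
    return [
        MU0_OVER_4PI * (3.0 * c * m_dot_r / dist**5 - m / dist**3)
        for c, m in zip(r, moment)
    ]


def synthetic_sample(sensors, magnet_pos, moment, noise_std, rng):
    """Build one hypothetical training input: per sensor, its (x, y, z)
    coordinates followed by the dipole field with additive Gaussian noise."""
    features = []
    for sensor in sensors:
        field = dipole_field(sensor, magnet_pos, moment)
        noisy = [b + rng.gauss(0.0, noise_std) for b in field]
        features.append(list(sensor) + noisy)
    return features


if __name__ == "__main__":
    rng = random.Random(0)
    sensors = [(0.0, 0.0, 0.1), (0.1, 0.0, 0.0)]
    sample = synthetic_sample(sensors, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1e-6, rng)
    for row in sample:
        print(row)
```

A point dipole is symmetric about its own axis, so rotation about that axis does not change the field; this is why such localization is usually stated as 5-DOF (three position and two orientation parameters).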

Deformable Gaussian Splatting for Efficient and High-Fidelity Reconstruction of Surgical Scenes

Jan 02, 2025

Jiwei Shan, Zeyu Cai, Cheng-Tai Hsieh, Shing Shin Cheng, Hesheng Wang

Abstract:Efficient and high-fidelity reconstruction of deformable surgical scenes is a critical yet challenging task. Building on recent advancements in 3D Gaussian splatting, current methods have seen significant improvements in both reconstruction quality and rendering speed. However, two major limitations remain: (1) difficulty in handling irreversible dynamic changes, such as tissue shearing, which are common in surgical scenes; and (2) the lack of hierarchical modeling for surgical scene deformation, which reduces rendering speed. To address these challenges, we introduce EH-SurGS, an efficient and high-fidelity reconstruction algorithm for deformable surgical scenes. We propose a deformation modeling approach that incorporates the life cycle of 3D Gaussians, effectively capturing both regular and irreversible deformations, thus enhancing reconstruction quality. Additionally, we present an adaptive motion hierarchy strategy that distinguishes between static and deformable regions within the surgical scene. This strategy reduces the number of 3D Gaussians passing through the deformation field, thereby improving rendering speed. Extensive experiments demonstrate that our method surpasses existing state-of-the-art approaches in both reconstruction quality and rendering speed. Ablation studies further validate the effectiveness and necessity of our proposed components. We will open-source our code upon acceptance of the paper.

* 7 pages, 4 figures, submitted to ICRA 2025

MambaXCTrack: Mamba-based Tracker with SSM Cross-correlation and Motion Prompt for Ultrasound Needle Tracking

Nov 13, 2024

Yuelin Zhang, Qingpeng Ding, Long Lei, Jiwei Shan, Wenxuan Xie, Tianyi Zhang, Wanquan Yan, Raymond Shing-Yan Tang, Shing Shin Cheng

Abstract:Ultrasound (US)-guided needle insertion is widely employed in percutaneous interventions. However, providing feedback on the needle tip position via US image presents challenges due to noise, artifacts, and the thin imaging plane of US, which degrades needle features and leads to intermittent tip visibility. In this paper, a Mamba-based US needle tracker MambaXCTrack utilizing structured state space models cross-correlation (SSMX-Corr) and implicit motion prompt is proposed, which is the first application of Mamba in US needle tracking. The SSMX-Corr enhances cross-correlation by long-range modeling and global searching of distant semantic features between template and search maps, benefiting the tracking under noise and artifacts by implicitly learning potential distant semantic cues. By combining with cross-map interleaved scan (CIS), local pixel-wise interaction with positional inductive bias can also be introduced to SSMX-Corr. The implicit low-level motion descriptor is proposed as a non-visual prompt to enhance tracking robustness, addressing the intermittent tip visibility problem. Extensive experiments on a dataset with motorized needle insertion in both phantom and tissue samples demonstrate that the proposed tracker outperforms other state-of-the-art trackers while ablation studies further highlight the effectiveness of each proposed tracking module.

* This work has been submitted to the IEEE for possible publication

Refined Motion Compensation with Soft Laser Manipulators using Data-Driven Surrogate Models

Jul 02, 2024

Yongjun Yan, Qingpeng Ding, Mingwu Li, Junyan Yan, Shing Shin Cheng

Abstract:Non-contact laser ablation, a precise thermal technique, simultaneously cuts and coagulates tissue without the insertion errors associated with rigid needles. Human organ motions, such as those in the liver, exhibit rhythmic components influenced by respiratory and cardiac cycles, making effective laser energy delivery to target lesions while compensating for tumor motion crucial. This research introduces a data-driven method to derive surrogate models of a soft manipulator. These low-dimensional models offer computational efficiency when integrated into the Model Predictive Control (MPC) framework, while still capturing the manipulator's dynamics with and without control input. Spectral Submanifolds (SSM) theory models the manipulator's autonomous dynamics, acknowledging its tendency to reach equilibrium when external forces are removed. Preliminary results show that the MPC controller using the surrogate model outperforms two other models within the same MPC framework. The data-driven MPC controller also supports a design-agnostic feature, allowing the interchangeability of different soft manipulators within the laser ablation surgery robot system.

Simultaneous Estimation of Shape and Force along Highly Deformable Surgical Manipulators Using Sparse FBG Measurement

Apr 25, 2024

Yiang Lu, Bin Li, Wei Chen, Junyan Yan, Shing Shin Cheng, Jiangliu Wang, Jianshu Zhou, Qi Dou, Yun-hui Liu

Abstract:Recently, fiber optic sensors such as fiber Bragg gratings (FBGs) have been widely investigated for shape reconstruction and force estimation of flexible surgical robots. However, most existing approaches need precise model parameters of the FBGs inside the fiber and their alignment with the flexible robots for accurate sensing results. Another challenge lies in acquiring external forces online at arbitrary locations along the flexible robots, which is essential under large deflections in robotic surgery. In this paper, we propose a novel data-driven paradigm for simultaneous estimation of shape and force along highly deformable flexible robots by using sparse strain measurement from a single-core FBG fiber. A thin-walled soft sensing tube helically embedded with FBG sensors is designed for a robotic-assisted flexible ureteroscope with large deflection up to 270 degrees and a bend radius under 10 mm. We introduce and study three learning models by incorporating spatial strain encoders, and compare their performances in both free space and constrained environments with contact forces at different locations. The experimental results in terms of dynamic shape-force sensing accuracy demonstrate the effectiveness and superiority of the proposed methods.

* Accepted to ICRA 2024

Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy

Mar 08, 2024

Yuelin Zhang, Wanquan Yan, Kim Yan, Chun Ping Lam, Yufu Qiu, Pengyu Zheng, Raymond Shing-Yan Tang, Shing Shin Cheng

Abstract:Gastric simulators with objective educational feedback have been proven useful for endoscopy training. Existing electronic simulators with feedback are, however, not commonly adopted due to their high cost. In this work, a motion-guided dual-camera tracker is proposed to provide reliable endoscope tip position feedback at a low cost inside a mechanical simulator for endoscopy skill evaluation, tackling several unique challenges. To address the issue of significant appearance variation of the endoscope tip while keeping dual-camera tracking consistency, the cross-camera mutual template strategy (CMT) is proposed to introduce dynamic transient mutual templates to dual-camera tracking. To alleviate disturbance from large occlusion and distortion by the light source from the endoscope tip, the Mamba-based motion-guided prediction head (MMH) is presented to aggregate visual tracking with historical motion information modeled by the state space model. The proposed tracker was evaluated on datasets captured by low-cost camera pairs during endoscopy procedures performed inside the mechanical simulator. The tracker achieves SOTA performance with robust and consistent tracking on dual cameras. Further downstream evaluation proves that the 3D tip position determined by the proposed tracker enables reliable skill differentiation. The code and dataset will be released upon acceptance.

A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning

Mar 05, 2024

Yuelin Zhang, Pengyu Zheng, Wanquan Yan, Chengyu Fang, Shing Shin Cheng

Abstract:Defocus blur is a persistent problem in microscope imaging that poses harm to pathology interpretation and medical intervention in cell microscopy and microscope surgery. To address this problem, a unified framework including multi-pyramid transformer (MPT) and extended frequency contrastive regularization (EFCR) is proposed to tackle two outstanding challenges in microscopy deblur: longer attention span and feature deficiency. The MPT employs an explicit pyramid structure at each network stage that integrates the cross-scale window attention (CSWA), the intra-scale channel attention (ISCA), and the feature-enhancing feed-forward network (FEFN) to capture long-range cross-scale spatial interaction and global channel context. The EFCR addresses the feature deficiency problem by exploring latent deblur signals from different frequency bands. It also enables deblur knowledge transfer to learn cross-domain information from extra data, improving deblur performance for labeled and unlabeled data. Extensive experiments and downstream task validation show the framework achieves state-of-the-art performance across multiple datasets. Project page: https://github.com/PieceZhang/MPT-CataBlur.

* Accepted in CVPR 2024

Tele-Operated Oropharyngeal Swab (TOOS) Robot Enabled by TSS Soft Hand for Safe and Effective COVID-19 OP Sampling

Sep 20, 2021

Wei Chen, Jianshu Zhou, Shing Shin Cheng, Yiang Lu, Fangxun Zhong, Yuan Gao, Yaqing Wang, Lingbin Xue, Michael C. F. Tong, Yun-Hui Liu

Abstract:The COVID-19 pandemic has imposed serious challenges on many aspects of human life. To diagnose COVID-19, oropharyngeal swab (OP swab) sampling is generally applied for viral nucleic acid (VNA) specimen collection. However, manual sampling exposes medical staff to a high risk of infection. Robotic sampling is promising for minimizing this risk, but traditional robots suffer from safety, cost, and control complexity issues that hinder wide-scale deployment. In this work, we show that soft robotic technology is promising for achieving robotic OP swab sampling with excellent swab manipulability in a confined oral space, working as dexterously as the existing manual approach. This is enabled by a novel Tstone soft (TSS) hand, consisting of a soft wrist and a soft gripper, designed from human sampling observation and bio-inspiration. The TSS hand is compact, offers a larger workspace, and achieves dexterity comparable to that of the human hand. The soft wrist is capable of agile omnidirectional bending with adjustable stiffness. The terminal soft gripper is effective for pinching and replacing disposable swabs. The OP sampling force is easily maintained in a safe and comfortable range (throat sampling comfortable region) under a hybrid motion and stiffness virtual fixture-based controller. A dedicated 3-DOF RCM platform is used for global positioning of the TSS hand. Design, modeling, and control of the TSS hand are discussed in detail with dedicated experimental validations. A sampling test based on human tele-operation was performed on an oral cavity model with an excellent success rate. The proposed TOOS robot is a highly promising solution for tele-operated, safe, cost-effective, and quickly deployable COVID-19 OP swab sampling.

Towards Safe Control of Continuum Manipulator Using Shielded Multiagent Reinforcement Learning

Jun 15, 2021

Guanglin Ji, Junyan Yan, Jingxin Du, Wanquan Yan, Jibiao Chen, Yongkang Lu, Juan Rojas, Shing Shin Cheng

Abstract:Continuum robotic manipulators are increasingly adopted in minimally invasive surgery. However, their nonlinear behavior is challenging to model accurately, especially when subject to external interaction, potentially leading to poor control performance. In this letter, we investigate the feasibility of adopting a model-free multiagent reinforcement learning (RL) method, namely multiagent deep Q network (MADQN), to control a 2-degree-of-freedom (DoF) cable-driven continuum surgical manipulator. The control of the robot is formulated as a one-DoF, one-agent problem in the MADQN framework to improve the learning efficiency. Combined with a shielding scheme that enables dynamic variation of the action set boundary, MADQN leads to efficient and, importantly, safer control of the robot. Shielded MADQN enabled the robot to perform point and trajectory tracking with submillimeter root mean square errors under external loads, soft obstacles, and rigid collision, which are common interaction scenarios encountered by surgical manipulators. The controller was further proven to be effective in a miniature continuum robot with high structural nonlinearity, achieving trajectory tracking with submillimeter accuracy under external payload.

* 8 pages, 12 figs, 1 table, 2 pseudo-code
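
The abstract above combines a one-agent-per-DoF formulation with a shield that restricts each agent's action set. The sketch below illustrates that general idea only: each agent picks the highest-valued action among those that keep its DoF inside an allowed interval, and the interval can be changed by the caller at every step. The Q-values are plain dictionaries rather than a trained deep Q network, and the function names, the interval-based shield, and the fallback rule are all assumptions for illustration, not the paper's implementation.

```python
def violation(position, lower, upper):
    """Distance by which position lies outside [lower, upper] (0.0 if inside)."""
    return max(lower - position, position - upper, 0.0)


def shielded_greedy_action(q_values, position, lower, upper):
    """Pick the best action for one DoF after shielding.

    q_values maps each candidate action (a position increment) to its
    estimated value. Actions that would leave [lower, upper] are masked out.
    If every action is unsafe, fall back to the least-violating one
    (a hypothetical rule chosen for this sketch).
    """
    safe = [a for a in q_values if violation(position + a, lower, upper) == 0.0]
    if safe:
        return max(safe, key=lambda a: q_values[a])
    return min(q_values, key=lambda a: violation(position + a, lower, upper))


def step(positions, q_tables, bounds):
    """Advance every DoF by one step, with one independent agent per DoF."""
    new_positions = []
    for position, q_values, (lower, upper) in zip(positions, q_tables, bounds):
        action = shielded_greedy_action(q_values, position, lower, upper)
        new_positions.append(position + action)
    return new_positions


if __name__ == "__main__":
    q = {-1.0: 0.2, 0.0: 0.5, 1.0: 0.9}
    # Two DoFs with the same value estimates; the second sits near its bound.
    print(step([0.0, 4.5], [q, q], [(-5.0, 5.0), (-5.0, 5.0)]))
```

Passing a different `bounds` list at each call is the simplest way to mimic a boundary that varies dynamically, for example tightening it when an obstacle is detected.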
