Yao Shen

MECPformer: Multi-estimations Complementary Patch with CNN-Transformers for Weakly Supervised Semantic Segmentation

Mar 19, 2023
Chunmeng Liu, Guangyao Li, Yao Shen, Ruiqi Wang

Initial seeds generated by convolutional neural networks (CNNs) for weakly supervised semantic segmentation tend to highlight only the most discriminative regions and miss the global extent of the target. Transformer-based methods have since been proposed, benefiting from their ability to capture long-range feature dependencies. However, we observe a flaw despite this advantage: for a given class, the initial seeds generated by a transformer may invade regions belonging to other classes. Motivated by these issues, we devise a simple yet effective method, dubbed MECPformer, with a Multi-estimations Complementary Patch (MECP) strategy and an Adaptive Conflict Module (ACM). Given an image, we manipulate it with the MECP strategy at different epochs, and the network mines and deeply fuses semantic information at different levels. In addition, the ACM adaptively removes conflicting pixels and exploits the network's self-training capability to mine potential target information. Without bells and whistles, MECPformer achieves a new state-of-the-art 72.0% mIoU on PASCAL VOC 2012 and 42.4% on MS COCO 2014. The code is available at https://github.com/ChunmengLiu1/MECPformer.
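
The abstract describes two mechanisms: complementary patch masking applied with different settings over training, and fusion of the resulting activation estimates. The following PyTorch-style sketch illustrates how such a scheme could look; `model.cam`, the random checkerboard masking, the patch sizes, and the fusion by averaging are illustrative assumptions, not the authors' exact recipe.

import torch

def complementary_pair(img, patch):
    """Split img (C, H, W) into patch x patch cells and return two masked
    copies that are complementary: each cell is visible in exactly one copy.
    Assumes H and W are divisible by `patch`."""
    _, h, w = img.shape
    cells = torch.rand(h // patch, w // patch) < 0.5   # random cell assignment
    mask = cells.repeat_interleave(patch, 0).repeat_interleave(patch, 1)
    return img * mask, img * ~mask

def fused_cam(model, img, patch_sizes=(16, 32)):
    """Multi-estimation: fuse class activation maps (CAMs) from complementary
    pairs at several granularities (the paper varies these across epochs)."""
    cams = []
    for p in patch_sizes:
        a, b = complementary_pair(img, p)
        # `model.cam` is a hypothetical hook returning the CAM for an input.
        cams.append(model.cam(a.unsqueeze(0)) + model.cam(b.unsqueeze(0)))
    return torch.stack(cams).mean(0)    # average the multi-level estimates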

Multi-objective Generative Design of Three-Dimensional Composite Materials

Feb 26, 2023
Zhengyang Zhang, Han Fang, Zhao Xu, Jiajie Lv, Yao Shen, Yanming Wang

Composite materials with 3D architectures are desirable in a variety of applications for their capability of tailoring properties to meet multiple functional requirements. Because the properties of a composite are tuned by the arrangement of its internal components, structure design is of great significance. However, most composite structures are proposed by empirical designs following existing patterns, and the complexity of 3D structures makes it hard to extract customized structures with multiple desired properties from the large design space. Here we report a multi-objective driven Wasserstein generative adversarial network (MDWGAN) for the inverse design of 3D composite structures according to given geometrical, structural, and mechanical requirements. Our framework consists of a GAN-based network that generates 3D composite structures with geometrical and structural features similar to those of the target dataset. In addition, multiple objectives are introduced to control the mechanical properties and isotropy of the composites, and an accurate surrogate model provides real-time property evaluation during training iterations. We constructed a small, concise dataset to illustrate the framework. With the multiple objectives combined by weights and the 3D GAN acting as a soft constraint, our framework is shown to tune the properties of the generated composites in multiple respects while preserving the selected features of different kinds of structures. Its feasibility on a small dataset and potential scalability to objectives on other properties make this work a novel, effective approach to fast, experience-free composite structure design for various functional materials.
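
As a rough illustration of the weighted multi-objective training signal described above, here is a hedged PyTorch-style sketch: the WGAN critic term acts as the soft structural constraint, while a surrogate model supplies fast property estimates. The `surrogate` call, the directional-modulus interpretation of its output, the target value, and the weights are all assumptions for illustration, not the paper's exact formulation.

import torch

def generator_loss(critic, surrogate, fake_structs,
                   target_modulus=1.0, w_adv=1.0, w_prop=10.0, w_iso=5.0):
    """Weighted multi-objective loss for the generator.
    `critic` scores realism (WGAN); `surrogate` is a fast property predictor
    returning, say, moduli along x, y, z for each generated 3D structure."""
    # Adversarial term: the 3D GAN acts as a soft constraint keeping generated
    # structures close to the target dataset's geometric/structural features.
    adv = -critic(fake_structs).mean()

    # Property objective: drive the mean directional modulus toward a target.
    props = surrogate(fake_structs)                  # shape (N, 3), assumed
    prop = (props.mean(dim=1) - target_modulus).pow(2).mean()

    # Isotropy objective: penalize spread across the three directions.
    iso = props.var(dim=1).mean()

    return w_adv * adv + w_prop * prop + w_iso * iso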

Precise Single-stage Detector

Oct 09, 2022
Aisha Chandio, Gong Gui, Teerath Kumar, Irfan Ullah, Ramin Ranjbarzadeh, Arunabha M Roy, Akhtar Hussain, Yao Shen

Two problems in SSD still cause inaccurate results: (1) during feature extraction, local information is gradually lost as semantic information is acquired layer by layer, yielding less representative feature maps; (2) because the classification and regression tasks are inconsistent, the classification confidence and the predicted position cannot accurately rank the prediction boxes during the Non-Maximum Suppression (NMS) algorithm. Methods: To address these issues, we propose a new architecture, a modified version of the Single Shot MultiBox Detector (SSD), named Precise Single-Stage Detector (PSSD). First, we improve the features by adding extra layers to SSD. Second, we construct a simple and effective feature enhancement module that expands the receptive field step by step for each layer and enhances its local and semantic information. Finally, we design a more efficient loss function to predict the IoU between the prediction boxes and the ground-truth boxes; a threshold on this IoU guides classification training and attenuates the scores used by the NMS algorithm. Main Results: Benefiting from these optimizations, PSSD achieves strong real-time performance. Specifically, on a Titan Xp with a 320-pixel input, PSSD achieves 33.8 mAP at 45 FPS on the MS COCO benchmark and 81.28 mAP at 66 FPS on Pascal VOC 2007, outperforming state-of-the-art object detectors. The model also performs well at larger input sizes: at 512 pixels, PSSD obtains 37.2 mAP at 27 FPS on MS COCO and 82.82 mAP at 40 FPS on Pascal VOC 2007. The experimental results show that the proposed model offers a better trade-off between speed and accuracy.

* We will soon submit this work to an IEEE Transactions journal. Due to the character limit, we cannot upload the full abstract; please read the PDF for more detail. 
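
The abstract's third contribution, an IoU-prediction head whose output attenuates classification scores before NMS, can be sketched as follows. This is our reading, not the authors' exact formulation; the threshold gate and the squared attenuation below the threshold are illustrative assumptions.

import torch
from torchvision.ops import nms

def iou_aware_nms(boxes, cls_scores, pred_ious, iou_thr=0.5, nms_thr=0.45):
    """boxes: (N, 4); cls_scores, pred_ious: (N,).
    Rescale classification confidence by the predicted IoU so that ranking
    reflects localization quality, then run standard hard NMS."""
    # Hypothetical attenuation: boxes whose predicted IoU falls below the
    # threshold are down-weighted more aggressively.
    scores = cls_scores * torch.where(pred_ious >= iou_thr,
                                      pred_ious,
                                      pred_ious ** 2)
    keep = nms(boxes, scores, nms_thr)
    return boxes[keep], scores[keep]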

Light in the Larynx: a Miniaturized Robotic Optical Fiber for In-office Laser Surgery of the Vocal Folds

Apr 27, 2022
Alex J. Chiluisa, Nicholas E. Pacheco, Hoang S. Do, Ryan M. Tougas, Emily V. Minch, Rositsa Mihaleva, Yao Shen, Yuxiang Liu, Thomas L. Carroll, Loris Fichera

This letter reports the design, construction, and experimental validation of a novel hand-held robot for in-office laser surgery of the vocal folds. In-office endoscopic laser surgery is an emerging trend in laryngology: it promises to deliver the same patient outcomes as traditional surgical treatment (i.e., in the operating room) at a fraction of the cost. Unfortunately, office procedures can be challenging to perform; the optical fibers used for laser delivery can only emit light forward, in a line-of-sight fashion, which severely limits anatomical access. The robot presented in this letter aims to overcome these challenges. Its end effector is a steerable laser fiber, created by combining a thin optical fiber (0.225 mm) with a tendon-actuated nickel-titanium notched sheath that provides bending. The device can be used seamlessly with most commercially available endoscopes, as it is sufficiently small (1.1 mm) to pass through a working channel. To control the fiber, we propose a compact actuation unit that mounts on top of the endoscope handle, so that during a procedure the operating physician can control both the endoscope and the steerable fiber with a single hand. We report simulation and phantom experiments demonstrating that the proposed device substantially enhances surgical access compared to current clinical fibers.
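
For intuition about the tendon actuation (a back-of-the-envelope approximation, not the authors' model), a constant-curvature assumption relates tendon displacement to tip bending: pulling the tendon by dl shortens one side of the notched section and bends it by roughly theta = dl / d, where d is the tendon's offset from the neutral bending axis. All numbers below are illustrative.

import math

def tip_angle_deg(tendon_pull_mm, tendon_offset_mm=0.4):
    """Constant-curvature approximation: bending angle (degrees) of the
    notched sheath for a given tendon displacement. The 0.4 mm offset is a
    made-up value on the order of the 1.1 mm sheath diameter."""
    return math.degrees(tendon_pull_mm / tendon_offset_mm)

# Example: a 0.5 mm tendon pull bends the tip by about 72 degrees.
print(tip_angle_deg(0.5))   # ~71.6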

GMAIR: Unsupervised Object Detection Based on Spatial Attention and Gaussian Mixture

Jun 03, 2021
Weijin Zhu, Yao Shen, Linfeng Yu, Lizeth Patricia Aguirre Sanchez

Recent studies on unsupervised object detection based on spatial attention have achieved promising results. Models, such as AIR and SPAIR, output "what" and "where" latent variables that represent the attributes and locations of objects in a scene, respectively. Most of the previous studies concentrate on the "where" localization performance; however, we claim that acquiring "what" object attributes is also essential for representation learning. This paper presents a framework, GMAIR, for unsupervised object detection. It incorporates spatial attention and a Gaussian mixture in a unified deep generative model. GMAIR can locate objects in a scene and simultaneously cluster them without supervision. Furthermore, we analyze the "what" latent variables and clustering process. Finally, we evaluate our model on MultiMNIST and Fruit2D datasets and show that GMAIR achieves competitive results on localization and clustering compared to state-of-the-art methods.

* 15 pages, 5 figures 
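
A hedged sketch of the Gaussian-mixture "what" latent described in the abstract: each detected object draws a cluster assignment, and that cluster selects the Gaussian from which the appearance latent is sampled, so localization and clustering live in one generative model. The dimensions, the hard categorical draw (the paper may use a relaxed, differentiable version), and the reparameterized sample are our assumptions.

import torch
import torch.nn.functional as F

def sample_what(cluster_logits, mus, logvars):
    """cluster_logits: (N, K) per-object cluster scores;
    mus, logvars: (K, D) per-cluster Gaussian parameters.
    Returns (N, D) 'what' latents and (N,) cluster labels."""
    probs = F.softmax(cluster_logits, dim=-1)
    z_cat = torch.multinomial(probs, 1).squeeze(-1)   # hard cluster draw
    mu, logvar = mus[z_cat], logvars[z_cat]           # that cluster's Gaussian
    z_what = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
    return z_what, z_cat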