Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ligang Liu

Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings

Aug 26, 2025

Feiwei Qin, Shichao Lu, Junhao Hou, Changmiao Wang, Meie Fang, Ligang Liu

Figure 1 for Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings

Figure 2 for Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings

Figure 3 for Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings

Figure 4 for Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings

Abstract:Computer-Aided Design (CAD) generative modeling is driving significant innovations across industrial applications. Recent works have shown remarkable progress in creating solid models from various inputs such as point clouds, meshes, and text descriptions. However, these methods fundamentally diverge from traditional industrial workflows that begin with 2D engineering drawings. The automatic generation of parametric CAD models from these 2D vector drawings remains underexplored despite being a critical step in engineering design. To address this gap, our key insight is to reframe CAD generation as a sequence-to-sequence learning problem where vector drawing primitives directly inform the generation of parametric CAD operations, preserving geometric precision and design intent throughout the transformation process. We propose Drawing2CAD, a framework with three key technical components: a network-friendly vector primitive representation that preserves precise geometric information, a dual-decoder transformer architecture that decouples command type and parameter generation while maintaining precise correspondence, and a soft target distribution loss function accommodating inherent flexibility in CAD parameters. To train and evaluate Drawing2CAD, we create CAD-VGDrawing, a dataset of paired engineering drawings and parametric CAD models, and conduct thorough experiments to demonstrate the effectiveness of our method. Code and dataset are available at https://github.com/lllssc/Drawing2CAD.

* Accepted to ACM MM 2025

Via

Access Paper or Ask Questions

AvatarMakeup: Realistic Makeup Transfer for 3D Animatable Head Avatars

Jul 03, 2025

Yiming Zhong, Xiaolin Zhang, Ligang Liu, Yao Zhao, Yunchao Wei

Abstract:Similar to facial beautification in real life, 3D virtual avatars require personalized customization to enhance their visual appeal, yet this area remains insufficiently explored. Although current 3D Gaussian editing methods can be adapted for facial makeup purposes, these methods fail to meet the fundamental requirements for achieving realistic makeup effects: 1) ensuring a consistent appearance during drivable expressions, 2) preserving the identity throughout the makeup process, and 3) enabling precise control over fine details. To address these, we propose a specialized 3D makeup method named AvatarMakeup, leveraging a pretrained diffusion model to transfer makeup patterns from a single reference photo of any individual. We adopt a coarse-to-fine idea to first maintain the consistent appearance and identity, and then to refine the details. In particular, the diffusion model is employed to generate makeup images as supervision. Due to the uncertainties in diffusion process, the generated images are inconsistent across different viewpoints and expressions. Therefore, we propose a Coherent Duplication method to coarsely apply makeup to the target while ensuring consistency across dynamic and multiview effects. Coherent Duplication optimizes a global UV map by recoding the averaged facial attributes among the generated makeup images. By querying the global UV map, it easily synthesizes coherent makeup guidance from arbitrary views and expressions to optimize the target avatar. Given the coarse makeup avatar, we further enhance the makeup by incorporating a Refinement Module into the diffusion model to achieve high makeup quality. Experiments demonstrate that AvatarMakeup achieves state-of-the-art makeup transfer quality and consistency throughout animation.

Via

Access Paper or Ask Questions

SUPRA: Subspace Parameterized Attention for Neural Operator on General Domains

Apr 22, 2025

Zherui Yang, Zhengyang Xue, Ligang Liu

Abstract:Neural operators are efficient surrogate models for solving partial differential equations (PDEs), but their key components face challenges: (1) in order to improve accuracy, attention mechanisms suffer from computational inefficiency on large-scale meshes, and (2) spectral convolutions rely on the Fast Fourier Transform (FFT) on regular grids and assume a flat geometry, which causes accuracy degradation on irregular domains. To tackle these problems, we regard the matrix-vector operations in the standard attention mechanism on vectors in Euclidean space as bilinear forms and linear operators in vector spaces and generalize the attention mechanism to function spaces. This new attention mechanism is fully equivalent to the standard attention but impossible to compute due to the infinite dimensionality of function spaces. To address this, inspired by model reduction techniques, we propose a Subspace Parameterized Attention (SUPRA) neural operator, which approximates the attention mechanism within a finite-dimensional subspace. To construct a subspace on irregular domains for SUPRA, we propose using the Laplacian eigenfunctions, which naturally adapt to domains' geometry and guarantee the optimal approximation for smooth functions. Experiments show that the SUPRA neural operator reduces error rates by up to 33% on various PDE datasets while maintaining state-of-the-art computational efficiency.

Via

Access Paper or Ask Questions

MASH: Masked Anchored SpHerical Distances for 3D Shape Representation and Generation

Apr 12, 2025

Changhao Li, Yu Xin, Xiaowei Zhou, Ariel Shamir, Hao Zhang, Ligang Liu, Ruizhen Hu

Abstract:We introduce Masked Anchored SpHerical Distances (MASH), a novel multi-view and parametrized representation of 3D shapes. Inspired by multi-view geometry and motivated by the importance of perceptual shape understanding for learning 3D shapes, MASH represents a 3D shape as a collection of observable local surface patches, each defined by a spherical distance function emanating from an anchor point. We further leverage the compactness of spherical harmonics to encode the MASH functions, combined with a generalized view cone with a parameterized base that masks the spatial extent of the spherical function to attain locality. We develop a differentiable optimization algorithm capable of converting any point cloud into a MASH representation accurately approximating ground-truth surfaces with arbitrary geometry and topology. Extensive experiments demonstrate that MASH is versatile for multiple applications including surface reconstruction, shape generation, completion, and blending, achieving superior performance thanks to its unique representation encompassing both implicit and explicit features.

* 11 pages, 11 figures, SIGGRAPH 2025 Accept - Conference

Via

Access Paper or Ask Questions

ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Mar 11, 2025

Junfu Guo, Yu Xin, Gaoyi Liu, Kai Xu, Ligang Liu, Ruizhen Hu

Figure 1 for ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Figure 2 for ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Figure 3 for ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Figure 4 for ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Abstract:We tackle the challenge of concurrent reconstruction at the part level with the RGB appearance and estimation of motion parameters for building digital twins of articulated objects using the 3D Gaussian Splatting (3D-GS) method. With two distinct sets of multi-view imagery, each depicting an object in separate static articulation configurations, we reconstruct the articulated object in 3D Gaussian representations with both appearance and geometry information at the same time. Our approach decoupled multiple highly interdependent parameters through a multi-step optimization process, thereby achieving a stable optimization procedure and high-quality outcomes. We introduce ArticulatedGS, a self-supervised, comprehensive framework that autonomously learns to model shapes and appearances at the part level and synchronizes the optimization of motion parameters, all without reliance on 3D supervision, motion cues, or semantic labels. Our experimental results demonstrate that, among comparable methodologies, our approach has achieved optimal outcomes in terms of part segmentation accuracy, motion estimation accuracy, and visual quality.

Via

Access Paper or Ask Questions

No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics

Feb 27, 2025

Dongbo Shi, Shen Cao, Lubin Fan, Bojian Wu, Jinhui Guo, Renjie Chen, Ligang Liu, Jieping Ye

Figure 1 for No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics

Figure 2 for No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics

Figure 3 for No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics

Figure 4 for No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics

Abstract:While 3D Gaussian Splatting (3DGS) has made significant progress in scene reconstruction and novel view synthesis, it still heavily relies on accurately pre-computed camera intrinsics and extrinsics, such as focal length and camera poses. In order to mitigate this dependency, the previous efforts have focused on optimizing 3DGS without the need for camera poses, yet camera intrinsics remain necessary. To further loose the requirement, we propose a joint optimization method to train 3DGS from an image collection without requiring either camera intrinsics or extrinsics. To achieve this goal, we introduce several key improvements during the joint training of 3DGS. We theoretically derive the gradient of the camera intrinsics, allowing the camera intrinsics to be optimized simultaneously during training. Moreover, we integrate global track information and select the Gaussian kernels associated with each track, which will be trained and automatically rescaled to an infinitesimally small size, closely approximating surface points, and focusing on enforcing multi-view consistency and minimizing reprojection errors, while the remaining kernels continue to serve their original roles. This hybrid training strategy nicely unifies the camera parameters estimation and 3DGS training. Extensive evaluations demonstrate that the proposed method achieves state-of-the-art (SOTA) performance on both public and synthetic datasets.

Via

Access Paper or Ask Questions

Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting

Feb 27, 2025

Hengyu Meng, Duotun Wang, Zhijing Shao, Ligang Liu, Zeyu Wang

Abstract:Professional 3D asset creation often requires diverse sculpting brushes to add surface details and geometric structures. Despite recent progress in 3D generation, producing reusable sculpting brushes compatible with artists' workflows remains an open and challenging problem. These sculpting brushes are typically represented as vector displacement maps (VDMs), which existing models cannot easily generate compared to natural images. This paper presents Text2VDM, a novel framework for text-to-VDM brush generation through the deformation of a dense planar mesh guided by score distillation sampling (SDS). The original SDS loss is designed for generating full objects and struggles with generating desirable sub-object structures from scratch in brush generation. We refer to this issue as semantic coupling, which we address by introducing classifier-free guidance (CFG) weighted blending of prompt tokens to SDS, resulting in a more accurate target distribution and semantic guidance. Experiments demonstrate that Text2VDM can generate diverse, high-quality VDM brushes for sculpting surface details and geometric structures. Our generated brushes can be seamlessly integrated into mainstream modeling software, enabling various applications such as mesh stylization and real-time interactive modeling.

* 11 pages, 11 figures

Via

Access Paper or Ask Questions

HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting

Dec 05, 2024

Jingyu Lin, Jiaqi Gu, Lubin Fan, Bojian Wu, Yujing Lou, Renjie Chen, Ligang Liu, Jieping Ye

Figure 1 for HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting

Figure 2 for HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting

Figure 3 for HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting

Figure 4 for HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting

Abstract:Generating high-quality novel view renderings of 3D Gaussian Splatting (3DGS) in scenes featuring transient objects is challenging. We propose a novel hybrid representation, termed as HybridGS, using 2D Gaussians for transient objects per image and maintaining traditional 3D Gaussians for the whole static scenes. Note that, the 3DGS itself is better suited for modeling static scenes that assume multi-view consistency, but the transient objects appear occasionally and do not adhere to the assumption, thus we model them as planar objects from a single view, represented with 2D Gaussians. Our novel representation decomposes the scene from the perspective of fundamental viewpoint consistency, making it more reasonable. Additionally, we present a novel multi-view regulated supervision method for 3DGS that leverages information from co-visible regions, further enhancing the distinctions between the transients and statics. Then, we propose a straightforward yet effective multi-stage training strategy to ensure robust training and high-quality view synthesis across various settings. Experiments on benchmark datasets show our state-of-the-art performance of novel view synthesis in both indoor and outdoor scenes, even in the presence of distracting elements.

* Project page: https://gujiaqivadin.github.io/hybridgs/

Via

Access Paper or Ask Questions

Learning Neural Volumetric Pose Features for Camera Localization

Mar 19, 2024

Jingyu Lin, Jiaqi Gu, Bojian Wu, Lubin Fan, Renjie Chen, Ligang Liu, Jieping Ye

Figure 1 for Learning Neural Volumetric Pose Features for Camera Localization

Figure 2 for Learning Neural Volumetric Pose Features for Camera Localization

Figure 3 for Learning Neural Volumetric Pose Features for Camera Localization

Figure 4 for Learning Neural Volumetric Pose Features for Camera Localization

Abstract:We introduce a novel neural volumetric pose feature, termed PoseMap, designed to enhance camera localization by encapsulating the information between images and the associated camera poses. Our framework leverages an Absolute Pose Regression (APR) architecture, together with an augmented NeRF module. This integration not only facilitates the generation of novel views to enrich the training dataset but also enables the learning of effective pose features. Additionally, we extend our architecture for self-supervised online alignment, allowing our method to be used and fine-tuned for unlabelled images within a unified framework. Experiments demonstrate that our method achieves 14.28% and 20.51% performance gain on average in indoor and outdoor benchmark scenes, outperforming existing APR methods with state-of-the-art accuracy.

* 14 pages, 9 figures

Via

Access Paper or Ask Questions

FrameNeRF: A Simple and Efficient Framework for Few-shot Novel View Synthesis

Feb 26, 2024

Yan Xing, Pan Wang, Ligang Liu, Daolun Li, Li Zhang

Abstract:We present a novel framework, called FrameNeRF, designed to apply off-the-shelf fast high-fidelity NeRF models with fast training speed and high rendering quality for few-shot novel view synthesis tasks. The training stability of fast high-fidelity models is typically constrained to dense views, making them unsuitable for few-shot novel view synthesis tasks. To address this limitation, we utilize a regularization model as a data generator to produce dense views from sparse inputs, facilitating subsequent training of fast high-fidelity models. Since these dense views are pseudo ground truth generated by the regularization model, original sparse images are then used to fine-tune the fast high-fidelity model. This process helps the model learn realistic details and correct artifacts introduced in earlier stages. By leveraging an off-the-shelf regularization model and a fast high-fidelity model, our approach achieves state-of-the-art performance across various benchmark datasets.

Via

Access Paper or Ask Questions