Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sunwoo Park

Multi-THuMBS: Multi-person Tracking of 3D Human Meshes Beyond Video Shots

Jul 02, 2026

Jeongwan On, Muhammad Salman Ali, Muneeb A. Khan, Sunwoo Park, Inwoong Moon, Hyung Jin Chang, Jaekwang Kim, Seong Jong Ha, Seungryul Baek

Abstract:Tracking multi-person 3D human meshes from in-the-wild videos is a highly challenging problem due to complex interactions, frequent occlusions, and severe truncation inherent in unconstrained environments. While recent approaches have improved robustness against these issues, they largely overlook the critical challenge prevalent in real-world footage: frequent shot changes. These abrupt transitions in camera viewpoints often cause existing methods to lose track of human identities and fail in reconstructing temporally coherent trajectories. Although several recent works have explored 3D human mesh tracking under shot changes, they are still limited to single-person scenarios, making them inadequate for real-world videos where multiple people interact and appear simultaneously. To address this limitation, we propose Multi-THuMBS (Multi-person Tracking of 3D Human Meshes Beyond Video Shots) that leverages a state-of-the-art 3D scene prior to reconstruct the two boundary frames in a single shared 3D space. Human meshes are then registered within the shared 3D space, maintaining per-person identity and motion consistency across shot changes. Extensive experiments demonstrate that our approach yields significant improvements in 3D human mesh recovery, camera pose estimation, and identity tracking, thereby ensuring high-fidelity motion reconstruction with consistent identity preservation across shots compared to previous state-of-the-art methods.

* Project page: https://on-jungwoan.github.io/projects/multi-thumbs/

Via

Access Paper or Ask Questions

CRePE: Curved Ray Expectation Positional Encoding for Unified-Camera-Controlled Video Generation

May 13, 2026

Seonghyun Jin, Youngmin Kim, Sunwoo Park, Jong Chul Ye

Abstract:Camera-conditioned video generation requires positional encoding that remains reliable under changes in camera motion, lens configuration, and scene structure. However, existing attention-level camera encodings either provide ray-only camera signals or rely on pinhole camera geometry, limiting their applicability to general camera control under the Unified Camera Model, including wide-angle and fisheye lenses. To address this limitation, we propose Curved Ray Expectation Positional Encoding (CRePE). CRePE represents each image token as a depth-aware positional distribution along its source ray, providing a Unified Camera Model-compatible positional encoding that captures the projected-path geometry induced by wide-angle and fisheye cameras. CRePE is implemented through a Geometric Attention Adapter added to frozen video DiTs, injecting token-wise scene-distance information into selected attention layers and stabilizing it with pseudo supervision from a monocular geometry foundation model. This design leads to more stable camera control and improves several geometry-aware and perceptual-quality metrics, while remaining competitive on video-quality metrics. Controlled positional-encoding ablations show a better overall average rank than a RayRoPE-style endpoint PE baseline, demonstrating the effectiveness of UCM-aware projected-path integration across diverse camera models. Furthermore, by extending the same positional-encoding pathway to external geometry control through Radial MixForcing, CRePE supports external radial-map control for scene-geometry-conditioned generation and source-video motion transfer beyond camera control.

* 17 pages, 8 figures, Under review

Via

Access Paper or Ask Questions

Geometric 4D Stitching for Grounded 4D Generation

May 11, 2026

Sunwoo Park, Taesung Kwon, Jong Chul Ye

Abstract:Recent 4D generation methods complete scene-level missing information using generative models and reconstruct the scene into radiance-based representations. However, these pipelines often present geometric inconsistencies in the generated content, and the radiance-based reconstruction requires expensive optimization. Furthermore, radiance-based representations often absorb these geometric inconsistencies into their view-dependent nature, failing to enforce the grounded geometric consistency. To address these issues, we propose Geometric 4D Stitching, an efficient framework that explicitly identifies missing geometric regions and complements them with geometrically grounded 4D stitches. As a result, our method constructs 4D scene representations in under 10 minutes on a single NVIDIA RTX 5090 GPU per one-step scene expansion, while improving geometric consistency. Moreover, we demonstrate that our explicit 4D stitching supports interative expansion of 4D mesh as well as 4D scene editing.

Via

Access Paper or Ask Questions

Evolution of UE in Massive MIMO Systems for 6G: From Passive to Active

Jan 01, 2026

Kwonyeol Park, Hyuckjin Choi, Geonho Han, Gyoseung Lee, Yeonjoon Choi, Sunwoo Park, Junil Choi

Abstract:As wireless networks continue to evolve, stringent latency and reliability requirements and highly dynamic channels expose fundamental limitations of gNB-centric massive multiple-input multiple-output (mMIMO) architectures, motivating a rethinking of the user equipment (UE) role. In response, the UE is transitioning from a passive transceiver into an active entity that directly contributes to system-level performance. In this context, this article examines the evolving role of the UE in mMIMO systems during the transition from fifth-generation (5G) to sixth-generation (6G), bridging third generation partnership project (3GPP) standardization, device implementation, and architectural innovation. Through a chronological review of 3GPP Releases 15 to 19, we highlight the progression of UE functionalities from basic channel state information (CSI) reporting to artificial intelligence (AI) and machine learning (ML)-based CSI enhancement and UE-initiated beam management. We further examine key implementation challenges, including multi-panel UE (MPUE) architectures, on-device intelligent processing, and energy-efficient operation, and then discuss corresponding architectural innovations under practical constraints. Using digital-twin-based evaluations, we validate the impact of emerging UE-centric functionalities, illustrating that UE-initiated beam reporting improves throughput in realistic mobility scenarios, while a multi-panel architecture enhances link robustness compared with a single-panel UE.

* 7 pages, 4 figures

Via

Access Paper or Ask Questions