Jiangtao Wang

FlowCoMotion: Text-to-Motion Generation via Token-Latent Flow Modeling

Apr 13, 2026

Dawei Guan, Di Yang, Chengjie Jin, Jiangtao Wang

Abstract:Text-to-motion generation is driven by learning motion representations for semantic alignment with language. Existing methods rely on either continuous or discrete motion representations. However, continuous representations entangle semantics with dynamics, while discrete representations lose fine-grained motion details. In this context, we propose FlowCoMotion, a novel motion generation framework that unifies both treatments from a modeling perspective. Specifically, FlowCoMotion employs token-latent coupling to capture both semantic content and high-fidelity motion details. In the latent branch, we apply multi-view distillation to regularize the continuous latent space, while in the token branch we use discrete temporal resolution quantization to extract high-level semantic cues. The motion latent is then obtained by combining the representations from the two branches through a token-latent coupling network. Subsequently, a velocity field is predicted based on the textual conditions. An ODE solver integrates this velocity field from a simple prior, thereby guiding the sample to the potential state of the target motion. Extensive experiments show that FlowCoMotion achieves competitive performance on text-to-motion benchmarks, including HumanML3D and SnapMoGen.

* 23 pages, 14 figures
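The sampling step described in the abstract (an ODE solver integrating a velocity field from a simple prior) can be illustrated with a fixed-step Euler integrator. This is only a minimal sketch: `make_toy_velocity` is a hypothetical stand-in for the learned, text-conditioned velocity network, and the solver, step count, and vectors below are assumptions rather than details from the paper.

```python
def euler_integrate(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the toy field never divides by zero
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a learned velocity network (assumption).

    The field points from the current sample toward a fixed target so that
    the sample reaches the target at t=1.
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]
    return velocity


prior_sample = [0.5, -1.2, 0.3]    # stands in for a draw from a simple prior
target_motion = [1.0, 2.0, -0.5]   # stands in for a target motion latent
result = euler_integrate(make_toy_velocity(target_motion), prior_sample, num_steps=10)
```

In a real system the velocity would come from a trained network conditioned on text, and a higher-order solver might be preferred over Euler.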

Virtual Biopsy for Intracranial Tumors Diagnosis on MRI

Feb 25, 2026

Xinzhe Luo, Shuai Shao, Yan Wang, Jiangtao Wang, Yutong Bai, Jianguo Zhang

Abstract:Deep intracranial tumors situated in eloquent brain regions controlling vital functions present critical diagnostic challenges. Clinical practice has shifted toward stereotactic biopsy for pathological confirmation before treatment. Yet biopsy carries inherent risks of hemorrhage and neurological deficits and struggles with sampling bias due to tumor spatial heterogeneity, because pathological changes are typically region-selective rather than tumor-wide. Therefore, advancing non-invasive MRI-based pathology prediction is essential for holistic tumor assessment and modern clinical decision-making. The primary challenge lies in data scarcity: low tumor incidence requires long collection cycles, and annotation demands biopsy-verified pathology from neurosurgical experts. Additionally, tiny lesion volumes lacking segmentation masks cause critical features to be overwhelmed by background noise. To address these challenges, we construct the ICT-MRI dataset - the first public biopsy-verified benchmark with 249 cases across four categories. We propose a Virtual Biopsy framework comprising: MRI-Processor for standardization; Tumor-Localizer employing vision-language models for coarse-to-fine localization via weak supervision; and Adaptive-Diagnoser with a Masked Channel Attention mechanism fusing local discriminative features with global contexts. Experiments demonstrate over 90% accuracy, outperforming baselines by more than 20%.

ReBA-Pred-Net: Weakly-Supervised Regional Brain Age Prediction on MRI

Feb 13, 2026

Shuai Shao, Yan Wang, Shu Jiang, Shiyuan Zhao, Xinzhe Luo, Di Yang, Jiangtao Wang, Yutong Bai, Jianguo Zhang

Abstract:Brain age has become a prominent biomarker of brain health. Yet most prior work targets whole brain age (WBA), a coarse paradigm that struggles to support tasks such as disease characterization and research on development and aging patterns, because relevant changes are typically region-selective rather than brain-wide. Therefore, robust regional brain age (ReBA) estimation is critical, yet a widely generalizable model has yet to be established. In this paper, we propose the Regional Brain Age Prediction Network (ReBA-Pred-Net), a Teacher-Student framework designed for fine-grained brain age estimation. The Teacher produces soft ReBA to guide the Student to yield reliable ReBA estimates with a clinical-prior consistency constraint (regions within the same function should change similarly). For rigorous evaluation, we introduce two indirect metrics: Healthy Control Similarity (HCS), which assesses statistical consistency by testing whether regional brain-age-gap (ReBA minus chronological age) distributions align between training and unseen HC; and Neuro Disease Correlation (NDC), which assesses factual consistency by checking whether clinically confirmed patients show elevated brain-age-gap in disease-associated regions. Experiments across multiple backbones demonstrate the statistical and factual validity of our method.
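The regional brain-age-gap defined in the abstract (ReBA minus chronological age) is a per-region subtraction. The short sketch below shows only that definition; the region names and ages are made-up illustrative values, not data from the paper, and the HCS and NDC metrics themselves are not reproduced here.

```python
def brain_age_gap(regional_ages, chronological_age):
    """Return the brain-age-gap (ReBA minus chronological age) for each region."""
    return {region: age - chronological_age for region, age in regional_ages.items()}


# Hypothetical regional estimates for one subject (illustrative values only).
subject_reba = {"hippocampus": 71.5, "frontal_lobe": 66.0, "cerebellum": 64.0}
gaps = brain_age_gap(subject_reba, chronological_age=65.0)
```

A positive gap means the region is estimated to be older than the subject's chronological age.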

FFT-Free PAPR Reduction Methods for OFDM Signals

Sep 17, 2025

Hao Su, Jiangtao Wang, Yongchao Wang

Abstract:In this paper, we propose two low-complexity peak-to-average power ratio (PAPR) reduction algorithms for orthogonal frequency division multiplexing (OFDM) signals. The main content is as follows: First, a non-convex optimization model is established by minimizing the signal distortion power. Then, a customized alternating direction method of multipliers (ADMM) algorithm, named time-domain ADMM (T-ADMM), is proposed to solve the problem, along with an improved version called T-ADMM with constraint update (TCU-ADMM). In both algorithms, all subproblems can be solved analytically, and each iteration has linear computational complexity. These algorithms circumvent the challenges posed by repeated fast Fourier transform (FFT) and inverse FFT (IFFT) operations in traditional PAPR reduction algorithms. Additionally, we prove that the T-ADMM algorithm is theoretically guaranteed to converge if proper parameters are chosen. Finally, simulation results demonstrate the effectiveness of the proposed methods.

* 6 pages, 7 figures
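For readers unfamiliar with the metric, PAPR is the peak instantaneous power of a signal divided by its average power. The sketch below computes it for two small example signals. It does not implement T-ADMM or TCU-ADMM; the naive inverse DFT is used only to build the example time-domain signals, and all function names are illustrative.

```python
import cmath
import math


def naive_idft(freq_symbols):
    """O(N^2) inverse DFT, used here only to build small example signals."""
    n = len(freq_symbols)
    return [
        sum(freq_symbols[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
        for m in range(n)
    ]


def papr(time_signal):
    """Peak power divided by mean power (linear scale)."""
    powers = [abs(s) ** 2 for s in time_signal]
    return max(powers) / (sum(powers) / len(powers))


def papr_db(time_signal):
    """PAPR expressed in decibels."""
    return 10.0 * math.log10(papr(time_signal))


# One active subcarrier gives a constant-envelope signal (PAPR close to 1).
single_tone = naive_idft([0, 1, 0, 0, 0, 0, 0, 0])
# Eight identical subcarrier symbols add up coherently at one sample (PAPR close to 8).
aligned = naive_idft([1] * 8)
```

The second signal shows the worst case for N subcarriers: when all symbols align in phase, the PAPR reaches N.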

Unimodular Waveform Design for Integrated Sensing and Communication MIMO System via Manifold Optimization

Apr 08, 2025

Jiangtao Wang, Xuyang Zhao, Muyu Mei, Yongchao Wang

Abstract:Integrated sensing and communication (ISAC) has been widely recognized as one of the key technologies for 6G wireless networks. In this paper, we focus on the waveform design of the ISAC system, which can realize radar sensing while also facilitating information transmission. The main content is as follows: first, we formulate the waveform design problem as a nonconvex and non-smooth model with a unimodulus constraint based on the measurement metrics of the radar and communication systems. Second, we transform the model into an unconstrained problem on the Riemannian manifold and construct the corresponding operators by analyzing the unimodulus constraint. Third, to obtain the solution efficiently, we propose a low-complexity non-smooth unimodulus manifold gradient descent (N-UMGD) algorithm with a theoretical convergence guarantee. The simulation results show that the proposed algorithm can concentrate the energy of the sensing signal in the desired direction and realize information transmission with a low bit error rate.
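The idea of treating the unimodulus constraint as a manifold can be sketched with plain Riemannian gradient descent on the complex circle: project the Euclidean gradient onto the tangent space, take a step, then retract by renormalizing each entry. This is not the N-UMGD algorithm from the abstract. The objective below is a smooth toy (squared distance to a fixed unimodular target), and the step size, iteration count, and names are assumptions chosen for illustration.

```python
import cmath


def riemannian_grad(x, egrad):
    """Project the Euclidean gradient onto the tangent space at each unit-modulus entry."""
    return [g - (g * xi.conjugate()).real * xi for xi, g in zip(x, egrad)]


def retract(x):
    """Map each entry back to the unit circle by normalizing its modulus."""
    return [xi / abs(xi) for xi in x]


def objective(x, target):
    """Toy smooth objective: squared distance to a fixed target waveform."""
    return sum(abs(xi - ti) ** 2 for xi, ti in zip(x, target))


def manifold_gradient_descent(x0, target, step=0.1, iters=200):
    x = retract(x0)
    for _ in range(iters):
        egrad = [2 * (xi - ti) for xi, ti in zip(x, target)]  # gradient of the toy objective
        rgrad = riemannian_grad(x, egrad)
        x = retract([xi - step * gi for xi, gi in zip(x, rgrad)])
    return x


target = [cmath.exp(1j * phase) for phase in (0.3, 1.0, 2.0, -1.5)]
x_init = [1 + 0j] * 4
x_final = manifold_gradient_descent(x_init, target)
```

Every iterate stays exactly on the constraint set, which is why the problem can be treated as unconstrained on the manifold.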

Scaling Image Tokenizers with Grouped Spherical Quantization

Dec 03, 2024

Jiangtao Wang, Zhen Qin, Yifan Zhang, Vincent Tao Hu, Björn Ommer, Rania Briq, Stefan Kesselheim

Abstract:Vision tokenizers have attracted a lot of attention due to their scalability and compactness; however, previous works rely on outdated GAN-based hyperparameters and biased comparisons, and lack a comprehensive analysis of scaling behaviours. To tackle those issues, we introduce Grouped Spherical Quantization (GSQ), featuring spherical codebook initialization and lookup regularization to constrain codebook latents to a spherical surface. Our empirical analysis of image tokenizer training strategies demonstrates that GSQ-GAN achieves superior reconstruction quality over state-of-the-art methods with fewer training iterations, providing a solid foundation for scaling studies. Building on this, we systematically examine the scaling behaviours of GSQ, specifically in latent dimensionality, codebook size, and compression ratios, and their impact on model performance. Our findings reveal distinct behaviours at high and low spatial compression levels, underscoring challenges in representing high-dimensional latent spaces. We show that GSQ can restructure high-dimensional latents into compact, low-dimensional spaces, thus enabling efficient scaling with improved quality. As a result, GSQ-GAN achieves 16x down-sampling with a reconstruction FID (rFID) of 0.50.
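A rough sketch of what a grouped, spherical codebook lookup could look like: the latent vector is split into groups, each group and each code is normalized to the unit sphere, and the nearest code is chosen by cosine similarity. The abstract does not specify these details, so sharing one codebook across groups, the function names, and the example values are all assumptions. Training, codebook initialization, and the GAN losses are omitted.

```python
import math


def normalize(v):
    """Scale a vector to unit length so it lies on the sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def grouped_spherical_quantize(latent, codebook, num_groups):
    """Quantize each group of the latent to its most similar unit-norm code."""
    group_dim = len(latent) // num_groups
    unit_codes = [normalize(code) for code in codebook]
    indices, quantized = [], []
    for g in range(num_groups):
        chunk = normalize(latent[g * group_dim:(g + 1) * group_dim])
        # On the unit sphere, the highest dot product is the nearest code.
        sims = [sum(a * b for a, b in zip(chunk, code)) for code in unit_codes]
        best = max(range(len(unit_codes)), key=sims.__getitem__)
        indices.append(best)
        quantized.extend(unit_codes[best])
    return indices, quantized


# Toy 2-D codebook shared by both groups (assumption for illustration).
codebook = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
indices, quantized = grouped_spherical_quantize([3.0, 0.1, 0.2, 5.0], codebook, num_groups=2)
```

Splitting a high-dimensional latent into low-dimensional groups keeps each lookup in a small space, which matches the abstract's point about restructuring high-dimensional latents.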

Data Pruning in Generative Diffusion Models

Nov 19, 2024

Rania Briq, Jiangtao Wang, Stefan Kesselheim

Abstract:Data pruning is the problem of identifying a core subset that is most beneficial to training and discarding the remainder. While pruning strategies are well studied for discriminative models like those used in classification, little research has gone into their application to generative models. Generative models aim to estimate the underlying distribution of the data, so presumably they should benefit from larger datasets. In this work we aim to shed light on the accuracy of this statement, specifically to answer the question of whether data pruning for generative diffusion models could have a positive impact. Contrary to intuition, we show that eliminating redundant or noisy data in large datasets is beneficial, particularly when done strategically. We experiment with several pruning methods, including recent state-of-the-art methods, and evaluate on the CelebA-HQ and ImageNet datasets. We demonstrate that a simple clustering method outperforms other sophisticated and computationally demanding methods. We further show how we can leverage clustering to balance skewed datasets in an unsupervised manner to allow fair sampling for underrepresented populations in the data distribution, which is a crucial problem in generative models.
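One simple way to use cluster assignments for balancing a skewed dataset is to cap the number of samples kept per cluster. The sketch below assumes the cluster labels have already been computed by some clustering method; the cap-per-cluster rule and the function name are illustrative choices, not the procedure reported in the paper.

```python
from collections import defaultdict


def balance_by_cluster(cluster_labels, max_per_cluster):
    """Keep at most `max_per_cluster` sample indices from each cluster, in order."""
    kept_counts = defaultdict(int)
    kept = []
    for idx, label in enumerate(cluster_labels):
        if kept_counts[label] < max_per_cluster:
            kept_counts[label] += 1
            kept.append(idx)
    return kept


# Skewed toy dataset: cluster 0 dominates, cluster 2 is rare.
labels = [0, 0, 0, 0, 0, 1, 1, 2]
kept_indices = balance_by_cluster(labels, max_per_cluster=2)
```

Here the dominant cluster is pruned from five samples to two while the rare cluster is left intact.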

Designing Unimodular Waveforms with Good Correlation Properties for Large-Scale MIMO Radar via Manifold Optimization Method

Oct 10, 2024

Xuyang Zhao, Jiangtao Wang, Yongchao Wang

Abstract:In this paper, we design constant modulus probing waveforms with good correlation properties for large-scale collocated multi-input multi-output (MIMO) radar systems. The main content is as follows: First, we formulate the design problem as a fourth-order polynomial minimization problem with unimodulus constraints. Then, by analyzing the geometric properties of the unimodulus constraints through Riemannian geometry theory and embedding them into the search space, we transform the original non-convex optimization problem into an unconstrained problem on a Riemannian manifold for solution. Second, we convert the objective function into the form of a large but finite number of loss functions and employ a customized R-SVRG algorithm to solve it. Third, we prove that the customized R-SVRG algorithm is theoretically guaranteed to converge if appropriate parameters are chosen. Numerical examples demonstrate the effectiveness of the proposed R-SVRG algorithm.

Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit

Oct 08, 2024

Oleg Filatov, Jan Ebert, Jiangtao Wang, Stefan Kesselheim

Abstract:One of the main challenges in optimal scaling of large language models (LLMs) is the prohibitive cost of hyperparameter tuning, particularly learning rate $\eta$ and batch size $B$. While techniques like $\mu$P (Yang et al., 2022) provide scaling rules for optimal $\eta$ transfer in the infinite model size limit, the optimal scaling behavior in the infinite data size limit ($T \to \infty$) remains unknown. We fill in this gap by observing for the first time an interplay of three optimal $\eta$ scaling regimes: $\eta \propto \sqrt{T}$, $\eta \propto 1$, and $\eta \propto 1/\sqrt{T}$ with transitions controlled by $B$ and its relation to the time-evolving critical batch size $B_\mathrm{crit} \propto T$. Furthermore, we show that the optimal batch size is positively correlated with $B_\mathrm{crit}$: keeping it fixed becomes suboptimal over time even if learning rate is scaled optimally. Surprisingly, our results demonstrate that the observed optimal $\eta$ and $B$ dynamics are preserved with $\mu$P model scaling, challenging the conventional view of $B_\mathrm{crit}$ dependence solely on loss value. Complementing optimality, we examine the sensitivity of loss to changes in learning rate, where we find the sensitivity to decrease with $T \to \infty$ and to remain constant with $\mu$P model scaling. We hope our results make the first step towards a unified picture of the joint optimal data and model scaling.
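The three regimes named in the abstract ($\eta \propto \sqrt{T}$, $\eta \propto 1$, $\eta \propto 1/\sqrt{T}$) are all power laws in $T$ with exponents $0.5$, $0$, and $-0.5$. The sketch below rescales a learning rate tuned at a reference horizon to a new horizon under a chosen exponent. Which regime applies depends on $B$ relative to $B_\mathrm{crit}$; the abstract does not give that rule, so the caller supplies the exponent here. The function name and numeric values are assumptions.

```python
def transfer_learning_rate(eta_ref, horizon_ref, horizon_new, exponent):
    """Rescale a learning rate tuned at `horizon_ref` to `horizon_new`.

    `exponent` selects the regime: 0.5, 0.0, or -0.5.
    """
    if exponent not in (-0.5, 0.0, 0.5):
        raise ValueError("exponent must be one of -0.5, 0.0, 0.5")
    return eta_ref * (horizon_new / horizon_ref) ** exponent


# Hypothetical example: a learning rate of 1e-3 tuned at T = 1e9, moved to T = 4e9.
eta_growing = transfer_learning_rate(1e-3, 1e9, 4e9, 0.5)
eta_constant = transfer_learning_rate(1e-3, 1e9, 4e9, 0.0)
eta_decaying = transfer_learning_rate(1e-3, 1e9, 4e9, -0.5)
```

Quadrupling the horizon doubles the learning rate in the first regime, leaves it unchanged in the second, and halves it in the third.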

Predict and Interpret Health Risk using EHR through Typical Patients

Dec 18, 2023

Zhihao Yu, Chaohe Zhang, Yasha Wang, Wen Tang, Jiangtao Wang, Liantao Ma

Abstract:Predicting health risks from electronic health records (EHR) is a topic of recent interest. Deep learning models have achieved success by modeling temporal and feature interaction. However, these methods learn insufficient representations and lead to poor performance when it comes to patients with few visits or sparse records. Inspired by the fact that doctors may compare the patient with typical patients and make decisions from similar cases, we propose a Progressive Prototypical Network (PPN) to select typical patients as prototypes and utilize their information to enhance the representation of the given patient. In particular, a progressive prototype memory and two prototype separation losses are proposed to update prototypes. Besides, a novel integration is introduced for better fusing information from patients and prototypes. Experiments on three real-world datasets demonstrate that our model brings improvement on all metrics. To make our results better understood by physicians, we developed an application at http://ppn.ai-care.top. Our code is released at https://github.com/yzhHoward/PPN.
