Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zixin Wang

Infrastructure First: Enabling Embodied AI for Science in the Global South

Apr 08, 2026

Shaoshan Liu, Jie Tang, Marwa S. Hassan, Mohamed H. Sharkawy, Moustafa M. G. Fouda, Tiewei Shang, Zixin Wang

Abstract:Embodied AI for Science (EAI4S) brings intelligence into the laboratory by uniting perception, reasoning, and robotic action to autonomously run experiments in the physical world. For the Global South, this shift is not about adopting advanced automation for its own sake, but about overcoming a fundamental capacity constraint: too few hands to run too many experiments. By enabling continuous, reliable experimentation under limits of manpower, power, and connectivity, EAI4S turns automation from a luxury into essential scientific infrastructure. The main obstacle, however, is not algorithmic capability. It is infrastructure. Open-source AI and foundation models have narrowed the knowledge gap, but EAI4S depends on dependable edge compute, energy-efficient hardware, modular robotic systems, localized data pipelines, and open standards. Without these foundations, even the most capable models remain trapped in well-resourced laboratories. This article argues for an infrastructure-first approach to EAI4S and outlines the practical requirements for deploying embodied intelligence at scale, offering a concrete pathway for Global South institutions to translate AI advances into sustained scientific capacity and competitive research output.

Via

Access Paper or Ask Questions

A Learning-Based Cooperative Coevolution Framework for Heterogeneous Large-Scale Global Optimization

Mar 30, 2026

Wenjie Qiu, Zixin Wang, Hongyu Fang, Zeyuan Ma, Yue-Jiao Gong

Abstract:Cooperative Coevolution (CC) effectively addresses Large-Scale Global Optimization (LSGO) via decomposition but struggles with the emerging class of Heterogeneous LSGO (H-LSGO) problems arising from real-world applications, where subproblems exhibit diverse dimensions and distinct landscapes. The prevailing CC paradigm, relying on a fixed low-dimensional optimizer, often fails to navigate this heterogeneity. To address this limitation, we propose the Learning-Based Heterogeneous Cooperative Coevolution Framework (LH-CC). By formulating the optimization process as a Markov Decision Process, LH-CC employs a meta-agent to adaptively select the most suitable optimizer for each subproblem. We also introduce a flexible benchmark suite to generate diverse H-LSGO problem instances. Extensive experiments on 3000-dimensional problems with complex coupling relationships demonstrate that LH-CC achieves superior solution quality and computational efficiency compared to state-of-the-art baselines. Furthermore, the framework exhibits robust generalization across varying problem instances, optimization horizons, and optimizers. Our findings reveal that dynamic optimizer selection is a pivotal strategy for solving complex H-LSGO problems.

* 13 pages, 5 figures, 3 tables. Accepted for publication in GECCO 2026

Via

Access Paper or Ask Questions

Service Function Chain Routing in LEO Networks Using Shortest-Path Delay Statistical Stability

Mar 04, 2026

Li Zeng, Zixin Wang, Yuanming Shi, Khaled B. Letaief

Abstract:Low Earth orbit (LEO) satellite constellations have become a critical enabler for global coverage, utilizing numerous satellites orbiting Earth at high speeds. By decomposing complex network services into lightweight service functions, network function virtualization (NFV) transforms global network services into diverse service function chains (SFCs), coordinated by resource-constrained LEOs. However, the dynamic topology of satellite networks, marked by highly variable inter-satellite link delays, poses significant challenges for designing efficient routing strategies that ensure reliable and low-latency communication. Many existing routing methods suffer from poor scalability and degraded performance, limiting their practical implementation. To address these challenges, this paper proposes a novel SFC routing approach that leverages the statistical properties of network link states to mitigate instability caused by instantaneous modeling in dynamic satellite networks. Through comprehensive simulations on end-to-end shortest-path propagation delays in LEO networks, we identify and validate the statistical stability of multi-hop routes. Building on this insight, we introduce the Stability-Aware Multi-Stage Graph Routing (SA-MSGR) algorithm, which incorporates pre-computed average delays into a multi-stage graph optimization framework. Extensive simulations demonstrate the superior performance of SA-MSGR, achieving significantly lower and more predictable end-to-end SFC delays compared to representative baseline strategies.

Via

Access Paper or Ask Questions

Unseen Cost of Space Computing: Quantifying LEO Battery Aging via Physics-Driven Modeling

Mar 04, 2026

Li Zeng, Jingyang Zhu, Zixin Wang, Yuanming Shi, Khaled B. Letaief

Abstract:Low Earth Orbit (LEO) satellite constellations in the 6G era are evolving into intelligent in-orbit computational platforms, forming Space Computing Power Networks (SCPNs) to deliver global-scale computing services. However, the intensive computation within SCPN incurs a significant ``unseen cost'': the frequent charge-discharge cycles accelerate the physical degradation of satellites' life-limiting and high-cost batteries, thereby threatening the long-term operational viability of such a system. Existing approaches, often relying on indirect metrics like Depth of Discharge (DoD) and neglecting the complex, nonlinear degradation process of battery aging, fail to accurately quantify this cost. To address this, we introduce a high-fidelity, physics-driven model that quantitatively links computational workload parameters to the nonlinear battery degradation. Building on this model, we formulate a degradation-aware scheduling problem and analyze heuristic policies across different energy regimes. Simulations reveal that the optimal strategy should be adaptive: in solar-rich conditions, a myopic policy maximizing instantaneous solar utilization is superior, whereas under energy scarcity, a reactive policy leveraging real-time battery state significantly extends lifetime.

Via

Access Paper or Ask Questions

Bridging simulation and reality in subsurface radar-based sensing: physics-guided hierarchical domain adaptation with deep adversarial learning

Dec 19, 2025

Zixin Wang, Ishfaq Aziz, Mohamad Alipour

Abstract:Accurate estimation of subsurface material properties, such as soil moisture, is critical for wildfire risk assessment and precision agriculture. Ground-penetrating radar (GPR) is a non-destructive geophysical technique widely used to characterize subsurface conditions. Data-driven parameter estimation methods typically require large amounts of labeled training data, which is expensive to obtain from real-world GPR scans under diverse subsurface conditions. A physics-based GPR model using the finite-difference time-domain (FDTD) method can be employed to generate large synthetic datasets through simulations across varying material parameters, which are then utilized to train data-driven models. A key limitation, however, is that simulated data (source domain) and real-world data (target domain) often follow different distributions, which can cause data-driven models trained on simulations to underperform in real-world scenarios. To address this challenge, this study proposes a novel physics-guided hierarchical domain adaptation framework with deep adversarial learning for robust subsurface material property estimation from GPR signals. The proposed framework is systematically evaluated through the laboratory tests for single- and two-layer materials, as well as the field tests for single- and two-layer materials, and is benchmarked against state-of-the-art methods, including the one-dimensional convolutional neural network (1D CNN) and domain adversarial neural network (DANN). The results demonstrate that the proposed framework achieves higher correlation coefficients R and lower Bias between the predicted and measured parameter values, along with smaller standard deviations in the estimations, thereby validating their effectiveness in bridging the domain gap between simulated and real-world radar signals and enabling efficient subsurface material property retrieval.

Via

Access Paper or Ask Questions

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Oct 08, 2025

Jiaben Chen, Zixin Wang, Ailing Zeng, Yang Fu, Xueyang Yu, Siyuan Cen, Julian Tanke, Yihang Chen, Koichi Saito, Yuki Mitsufuji(+1 more)

Abstract:In this work, we present TalkCuts, a large-scale dataset designed to facilitate the study of multi-shot human speech video generation. Unlike existing datasets that focus on single-shot, static viewpoints, TalkCuts offers 164k clips totaling over 500 hours of high-quality human speech videos with diverse camera shots, including close-up, half-body, and full-body views. The dataset includes detailed textual descriptions, 2D keypoints and 3D SMPL-X motion annotations, covering over 10k identities, enabling multimodal learning and evaluation. As a first attempt to showcase the value of the dataset, we present Orator, an LLM-guided multi-modal generation framework as a simple baseline, where the language model functions as a multi-faceted director, orchestrating detailed specifications for camera transitions, speaker gesticulations, and vocal modulation. This architecture enables the synthesis of coherent long-form videos through our integrated multi-modal video generation module. Extensive experiments in both pose-guided and audio-driven settings show that training on TalkCuts significantly enhances the cinematographic coherence and visual appeal of generated multi-shot speech videos. We believe TalkCuts provides a strong foundation for future work in controllable, multi-shot speech video generation and broader multimodal learning.

* Project page: https://talkcuts.github.io/

Via

Access Paper or Ask Questions

EnergyPatchTST: Multi-scale Time Series Transformers with Uncertainty Estimation for Energy Forecasting

Aug 07, 2025

Wei Li, Zixin Wang, Qizheng Sun, Qixiang Gao, Fenglei Yang

Abstract:Accurate and reliable energy time series prediction is of great significance for power generation planning and allocation. At present, deep learning time series prediction has become the mainstream method. However, the multi-scale time dynamics and the irregularity of real data lead to the limitations of the existing methods. Therefore, we propose EnergyPatchTST, which is an extension of the Patch Time Series Transformer specially designed for energy forecasting. The main innovations of our method are as follows: (1) multi-scale feature extraction mechanism to capture patterns with different time resolutions; (2) probability prediction framework to estimate uncertainty through Monte Carlo elimination; (3) integration path of future known variables (such as temperature and wind conditions); And (4) Pre-training and Fine-tuning examples to enhance the performance of limited energy data sets. A series of experiments on common energy data sets show that EnergyPatchTST is superior to other commonly used methods, the prediction error is reduced by 7-12%, and reliable uncertainty estimation is provided, which provides an important reference for time series prediction in the energy field.

* Advanced Intelligent Computing Technology and Applications. ICIC 2025. Lecture Notes in Computer Science, vol 15860. Springer, Singapore
* Accepted for publication at the International Conference on Intelligent Computing (ICIC 2025). 12 pages. The final authenticated version is published in the Lecture Notes in Computer Science (LNCS) series, vol 15860, and is available online. This is the author's version of the work submitted for peer review

Via

Access Paper or Ask Questions

Edge Large AI Models: Collaborative Deployment and IoT Applications

May 06, 2025

Zixin Wang, Yuanming Shi, Khaled. B. Letaief

Figure 1 for Edge Large AI Models: Collaborative Deployment and IoT Applications

Figure 2 for Edge Large AI Models: Collaborative Deployment and IoT Applications

Figure 3 for Edge Large AI Models: Collaborative Deployment and IoT Applications

Figure 4 for Edge Large AI Models: Collaborative Deployment and IoT Applications

Abstract:Large artificial intelligence models (LAMs) emulate human-like problem-solving capabilities across diverse domains, modalities, and tasks. By leveraging the communication and computation resources of geographically distributed edge devices, edge LAMs enable real-time intelligent services at the network edge. Unlike conventional edge AI, which relies on small or moderate-sized models for direct feature-to-prediction mappings, edge LAMs leverage the intricate coordination of modular components to enable context-aware generative tasks and multi-modal inference. We shall propose a collaborative deployment framework for edge LAM by characterizing the LAM intelligent capabilities and limited edge network resources. Specifically, we propose a collaborative training framework over heterogeneous edge networks that adaptively decomposes LAMs according to computation resources, data modalities, and training objectives, reducing communication and computation overheads during the fine-tuning process. Furthermore, we introduce a microservice-based inference framework that virtualizes the functional modules of edge LAMs according to their architectural characteristics, thereby improving resource utilization and reducing inference latency. The developed edge LAM will provide actionable solutions to enable diversified Internet-of-Things (IoT) applications, facilitated by constructing mappings from diverse sensor data to token representations and fine-tuning based on domain knowledge.

Via

Access Paper or Ask Questions

Edge Large AI Models: Revolutionizing 6G Networks

May 01, 2025

Zixin Wang, Yuanming Shi, Yong Zhou, Jingyang Zhu, Khaled. B. Letaief

Abstract:Large artificial intelligence models (LAMs) possess human-like abilities to solve a wide range of real-world problems, exemplifying the potential of experts in various domains and modalities. By leveraging the communication and computation capabilities of geographically dispersed edge devices, edge LAM emerges as an enabling technology to empower the delivery of various real-time intelligent services in 6G. Unlike traditional edge artificial intelligence (AI) that primarily supports a single task using small models, edge LAM is featured by the need of the decomposition and distributed deployment of large models, and the ability to support highly generalized and diverse tasks. However, due to limited communication, computation, and storage resources over wireless networks, the vast number of trainable neurons and the substantial communication overhead pose a formidable hurdle to the practical deployment of edge LAMs. In this paper, we investigate the opportunities and challenges of edge LAMs from the perspectives of model decomposition and resource management. Specifically, we propose collaborative fine-tuning and full-parameter training frameworks, alongside a microservice-assisted inference architecture, to enhance the deployment of edge LAM over wireless networks. Additionally, we investigate the application of edge LAM in air-interface designs, focusing on channel prediction and beamforming. These innovative frameworks and applications offer valuable insights and solutions for advancing 6G technology.

Via

Access Paper or Ask Questions

Mutual Information-Empowered Task-Oriented Communication: Principles, Applications and Challenges

Mar 26, 2025

Hongru Li, Songjie Xie, Jiawei Shao, Zixin Wang, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

Figure 1 for Mutual Information-Empowered Task-Oriented Communication: Principles, Applications and Challenges

Figure 2 for Mutual Information-Empowered Task-Oriented Communication: Principles, Applications and Challenges

Figure 3 for Mutual Information-Empowered Task-Oriented Communication: Principles, Applications and Challenges

Figure 4 for Mutual Information-Empowered Task-Oriented Communication: Principles, Applications and Challenges

Abstract:Mutual information (MI)-based guidelines have recently proven to be effective for designing task-oriented communication systems, where the ultimate goal is to extract and transmit task-relevant information for downstream task. This paper provides a comprehensive overview of MI-empowered task-oriented communication, highlighting how MI-based methods can serve as a unifying design framework in various task-oriented communication scenarios. We begin with the roadmap of MI for designing task-oriented communication systems, and then introduce the roles and applications of MI to guide feature encoding, transmission optimization, and efficient training with two case studies. We further elaborate the limitations and challenges of MI-based methods. Finally, we identify several open issues in MI-based task-oriented communication to inspire future research.

* 8 pages,5 figures, submitted to IEEE for potential publication

Via

Access Paper or Ask Questions