HUAWEI
Abstract:Short-term ocean forecast skill depends strongly on the three-dimensional ocean structure of the upper ocean, which governs stratification, subsurface heat storage, and the response of the ocean to atmospheric forcing. However, AI ocean forecasting models often fail to preserve this vertical structure, resulting in over-smoothed subsurface features and weak physical consistency under strong forcing. Here, we present AxiomOcean, a global AI ocean forecasting model that explicitly represents vertical hierarchy and cross-layer dependence within the water column. By combining a fully three-dimensional encoder-backbone-decoder architecture with surface atmospheric forcing, AxiomOcean jointly predicts upper-ocean temperature, salinity, and three-dimensional currents at global 1/12° resolution down to 643 m depth. In 10-day forecasts, AxiomOcean outperforms an advanced AI comparison model across variables and lead times, reducing day-1 RMSE by approximately 20 to 35% while maintaining higher anomaly correlation. The gain is not achieved through excessive smoothing: AxiomOcean better preserves eddy kinetic energy, temperature and salinity variance. Its advantage also extends through the water column and remains evident across the equatorial Pacific, Kuroshio Extension, and Southern Ocean, yielding a more realistic reconstruction of upper-ocean heat content. These results show that explicitly preserving upper-ocean three-dimensional structure can improve both forecast accuracy and physical fidelity in AI ocean prediction.
Abstract:Recent advancements in large language models (LLMs) have catalyzed the rise of reasoning-intensive inference paradigms, where models perform explicit step-by-step reasoning before generating final answers. While such approaches improve answer quality and interpretability, they incur substantial computational overhead due to the prolonged generation sequences. In this paper, we propose Tandem, a novel collaborative framework that synergizes large and small language models (LLMs and SLMs) to achieve high-quality reasoning with significantly reduced computational cost. Specifically, the LLM serves as a strategic coordinator, efficiently generating a compact set of critical reasoning insights. These insights are then used to guide a smaller, more efficient SLM in executing the full reasoning process and delivering the final response. To balance efficiency and reliability, Tandem introduces a cost-aware termination mechanism that adaptively determines when sufficient reasoning guidance has been accumulated, enabling early stopping of the LLM's generation. Experiments on mathematical reasoning and code generation benchmarks demonstrate that Tandem reduces computational costs by approximately 40% compared to standalone LLM reasoning, while achieving superior or competitive performance. Furthermore, the sufficiency classifier trained on one domain transfers effectively to others without retraining. The code is available at: https://github.com/Applied-Machine-Learning-Lab/ACL2026_Tandem.
Abstract:Long-term conversational agents need memory systems that capture relationships between events, not merely isolated facts, to support temporal reasoning and multi-hop question answering. Current approaches face a fundamental trade-off: flat memory is efficient but fails to model relational structure, while graph-based memory enables structured reasoning at the cost of expensive and fragile construction. To address these issues, we propose \textbf{StructMem}, a structure-enriched hierarchical memory framework that preserves event-level bindings and induces cross-event connections. By temporally anchoring dual perspectives and performing periodic semantic consolidation, StructMem improves temporal reasoning and multi-hop performance on \texttt{LoCoMo}, while substantially reducing token usage, API calls, and runtime compared to prior memory systems, see https://github.com/zjunlp/LightMem .
Abstract:This work investigates antenna coding optimization to enhance the channel capacity of single-input single-output orthogonal frequency division multiplexing (SISO-OFDM) systems empowered by highly reconfigurable pixel antennas. We first introduce the model for pixel antenna empowered SISO-OFDM systems using a beamspace channel representation. We next formulate the problem to maximize the channel capacity through jointly optimizing antenna coding and the power allocation across subcarriers and solve it by Successive Exhaustive Boolean Optimization (SEBO) and water-filling (WF) algorithm. To reduce computational complexity, a codebook-based approach is also proposed for antenna coding optimization. Simulation results show that the channel capacity of SISO-OFDM system across all signal-to-noise-ratio (SNR) regions considered can be enhanced through leveraging pixel antennas as compared to using conventional antenna with fixed configuration. This result demonstrates the effectiveness of antenna coding technology empowered by pixel antenna in enhancing SISO-OFDM systems.
Abstract:We investigate antenna coding utilizing pixel antennas as a new degree of freedom for enhancing multiple-input multiple-output (MIMO) wireless power transfer (WPT) systems. The objective is to enhance the output direct current (DC) power under RF combining and DC combining schemes by jointly exploiting gains from antenna coding, beamforming, and rectenna nonlinearity. We first propose the MIMO WPT system model with binary and continuous antenna coding using the beamspace channel model and formulate the joint antenna coding and beamforming optimization using a nonlinear rectenna model. We propose two efficient closed-form successive convex approximation algorithms to efficiently optimize the beamforming. To further reduce the computational complexity, we propose codebook-based antenna coding designs for output DC power maximization based on K-means clustering. Results show that the proposed pixel antenna empowered MIMO WPT system with binary antenna coding increases output DC power by more than 15 dB compared with conventional systems with fixed antenna configuration. With continuous antenna coding, the performance improves another 6 dB. Moreover, the proposed codebook design outperforms previous designs by up to 40% and shows good performance with reduced computational complexity. Overall, the significant improvement in output DC power verifies the potential of leveraging antenna coding utilizing pixel antennas to enhance WPT systems.
Abstract:This paper investigates antenna coding based on pixel antennas as a new degree of freedom for enhancing multiple-input multiple-output (MIMO) wireless power transfer (WPT) systems. Antenna coding is closely related to the Fluid Antenna System (FAS) concept and further generalizes the radiation pattern reconfigurability. We first introduce a beamspace channel model to demonstrate reconfigurable radiation patterns enabled by antenna coders. By jointly optimizing the antenna coding and transmit beamforming with perfect channel state information (CSI), we exploit gains from antenna coding, transmit beamforming, and rectenna nonlinearity to maximize the output DC power. We adopt an alternating optimization approach with the quasi-Newton method and Successive Exhaustive Boolean Optimization (SEBO) method with warm-start to handle the transmit beamforming design and antenna coding design respectively. Finally, simulation results show that the proposed MIMO WPT system with pixel antennas achieves up to 15 dB gain in average output DC power compared with a conventional system with fixed antenna configuration, highlighting the potential of pixel antennas for boosting the WPT efficiency.




Abstract:The number of parameters in large-scale language models based on transformers is gradually increasing, and the scale of computing clusters is also growing. The technology of quickly mobilizing large amounts of computing resources for parallel computing is becoming increasingly important. In this paper, we propose an automatic parallel algorithm that automatically plans the parallel strategy with maximum throughput based on model and hardware information. By decoupling the training time into computation, communication, and overlap, we established a training duration simulation model. Based on this simulation model, we prune the parallel solution space to shorten the search time required. The multi-node experiment results show that the algorithm can estimate the parallel training duration in real time with an average accuracy of 96%. In our test, the recommendation strategy provided by the algorithm is always globally optimal.