Abstract:Many advanced Large Language Model (LLM) applications require long-context processing, but the self-attention module becomes a bottleneck during the prefilling stage of inference because its time complexity is quadratic in sequence length. Existing sparse attention methods accelerate attention computation by skipping less significant regions of the attention map. However, these approaches typically inspect the attention map only at coarse granularity, which causes considerable loss in model accuracy. In this paper, we propose SALE, a fine-grained sparse attention method that accelerates the long-context prefilling stage of LLMs with negligible loss in model accuracy. SALE achieves fast and accurate fine-grained attention weight estimation through 4-bit quantized query-key products, followed by block-sparse attention to accelerate prefilling computations. To evaluate the importance of query-key pairs, we adopt our Relative Attention Score metric, which offers significantly higher efficiency within our framework. We implement a custom, hardware-efficient CUDA kernel for our approach, reducing the additional overhead to approximately 11% of the full attention latency. Notably, SALE requires no parameter training and can be seamlessly integrated into existing systems with trivial code modifications. Experiments on long-context benchmarks demonstrate that our method outperforms existing approaches in accuracy-efficiency trade-offs, achieving at least 3.36x speedups on Llama-3.1-8B for sequences longer than 64K while maintaining model quality.
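The idea of estimating attention weights from low-bit query-key products can be illustrated with a minimal sketch. It assumes a simple symmetric per-vector quantizer and plain Python lists; the names `quantize_int4`, `approx_dot`, and `exact_dot` are hypothetical, and the sketch reproduces neither SALE's CUDA kernel nor its Relative Attention Score metric.

```python
def quantize_int4(vec):
    """Symmetric per-vector quantization to signed 4-bit codes in [-7, 7].

    Returns (codes, scale) such that vec[i] is approximately codes[i] * scale.
    This quantizer is an illustrative assumption, not the paper's scheme.
    """
    max_abs = max(abs(x) for x in vec)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(x / scale))) for x in vec]
    return codes, scale


def approx_dot(query, key):
    """Estimate the query-key product from the 4-bit codes and the two scales."""
    q_codes, q_scale = quantize_int4(query)
    k_codes, k_scale = quantize_int4(key)
    return sum(a * b for a, b in zip(q_codes, k_codes)) * q_scale * k_scale


def exact_dot(query, key):
    """Full-precision reference product."""
    return sum(a * b for a, b in zip(query, key))


# Compare the cheap estimate with the exact product for two example keys.
query = [0.9, -0.2, 0.4, 0.1]
keys = [[1.0, 0.0, 0.5, 0.0], [-0.3, 0.8, 0.1, -0.6]]
for key in keys:
    print(exact_dot(query, key), approx_dot(query, key))
```

In a sparse-attention pipeline, such estimates would only be used to decide which blocks of the attention map to compute; the retained blocks would still be evaluated at full precision.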
Abstract:We present MiMo-7B, a large language model built for reasoning tasks, with optimization across both pre-training and post-training stages. During pre-training, we enhance the data preprocessing pipeline and employ a three-stage data mixing strategy to strengthen the base model's reasoning potential. MiMo-7B-Base is pre-trained on 25 trillion tokens, with an additional Multi-Token Prediction objective for enhanced performance and accelerated inference speed. During post-training, we curate a dataset of 130K verifiable mathematics and programming problems for reinforcement learning, integrating a test-difficulty-driven code-reward scheme to alleviate sparse-reward issues and employing strategic data resampling to stabilize training. Extensive evaluations show that MiMo-7B-Base possesses exceptional reasoning potential, outperforming even much larger 32B models. The final RL-tuned model, MiMo-7B-RL, achieves superior performance on mathematics, code, and general reasoning tasks, surpassing the performance of OpenAI o1-mini. The model checkpoints are available at https://github.com/xiaomimimo/MiMo.
Abstract:Plane wave based wireless communications have become increasingly mature, and traditional resources such as time and frequency are already well utilized. To further increase capacity in the face of rapidly growing demand, a promising approach is to use twisted waves, which carry orbital angular momentum (OAM). In this paper, we discuss OAM based wireless communications in terms of orthogonality, degrees of freedom (DoF), and capacity, where both the transmitter and the receiver use uniform circular array (UCA) antennas. In particular, we compare OAM based wireless communications with multiple-input-multiple-output (MIMO) based wireless communications in terms of DoF and capacity. Numerical results show that the DoF of OAM based wireless communications is greater than or equal to that of correlated MIMO based wireless communications when the transmit and receive antennas are well aligned. OAM based wireless communications can also achieve larger capacity than correlated MIMO in the high signal-to-noise ratio (SNR) region under line-of-sight scenarios.
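The orthogonality that the abstract above relies on can be checked numerically. The sketch below assumes the textbook UCA feed, in which element n of an N-element array is given the phase weight exp(j2πln/N) for OAM mode l; the helper names `oam_feed` and `inner_product` are hypothetical and are not taken from the paper.

```python
import cmath


def oam_feed(mode, num_elements):
    """Phase weights exp(j*2*pi*mode*n/N) for the N elements of a UCA."""
    return [
        cmath.exp(2j * cmath.pi * mode * n / num_elements)
        for n in range(num_elements)
    ]


def inner_product(a, b):
    """Complex inner product sum(a[n] * conj(b[n]))."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


num_elements = 8
same = inner_product(oam_feed(1, num_elements), oam_feed(1, num_elements))
different = inner_product(oam_feed(1, num_elements), oam_feed(2, num_elements))
# |same| equals N, while |different| vanishes up to rounding error.
print(abs(same), abs(different))
```

Distinct modes within one period of N have zero inner product, which is what allows a well-aligned receiver to separate them.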
Abstract:Orbital angular momentum (OAM) has attracted much attention for radio vortex wireless communications due to the orthogonality among different OAM-modes. To maintain the orthogonality among different OAM modes at the receiver, strict alignment between the transmit and receive antennas is required. However, transceiver alignment cannot be guaranteed in practical wireless communications. The phase turbulence resulting from misaligned transceivers leads to serious inter-mode interference among different OAM modes and therefore causes the detection of multiple OAM-mode signals at the receiver to fail. To achieve practical OAM based wireless communications, in this paper we investigate radio vortex wireless communications with misaligned transmit and receive antennas. We propose a joint Beamforming and Pre-detection (BePre) scheme, which uses two unitary matrices to convert the channel matrix into an equivalent circulant matrix so as to keep the orthogonality among OAM-modes at the receiver. Then, the OAM signals can be detected with the mode-decomposition scheme at the misaligned receiver. Extensive simulations show that our joint BePre scheme can efficiently detect the signals of multiple OAM-modes for misaligned transceivers and can significantly increase the spectrum efficiency.
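Why a circulant equivalent channel preserves mode orthogonality can be seen from a standard linear-algebra fact: the unitary DFT matrix diagonalizes every circulant matrix. The sketch below verifies this for an arbitrary example matrix. It does not implement the BePre beamforming or pre-detection matrices, and all names and values in it are hypothetical.

```python
import cmath


def dft_matrix(n):
    """Unitary n x n DFT matrix."""
    s = 1 / n ** 0.5
    return [
        [s * cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)]
        for r in range(n)
    ]


def circulant(first_col):
    """Circulant matrix whose entry (r, c) is first_col[(r - c) mod n]."""
    n = len(first_col)
    return [[first_col[(r - c) % n] for c in range(n)] for r in range(n)]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def conj_transpose(a):
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]


def max_off_diagonal(m):
    """Largest magnitude among the off-diagonal entries of a square matrix."""
    n = len(m)
    return max(abs(m[r][c]) for r in range(n) for c in range(n) if r != c)


# Arbitrary example channel with circulant structure (values are made up).
h = circulant([1.0 + 0.5j, 0.3 - 0.2j, 0.1j, -0.4 + 0.0j])
f = dft_matrix(4)
d = matmul(matmul(f, h), conj_transpose(f))
print(max_off_diagonal(d))  # close to zero: no inter-mode interference
```

A channel matrix without circulant structure generally leaves nonzero off-diagonal entries after the same transform, which corresponds to the inter-mode interference described for misaligned transceivers.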
Abstract:By enabling very high bandwidth for radio communications, millimeter-wave (mmWave) technology, which can easily be integrated with massive-multiple-input-multiple-output (massive-MIMO) due to its small antenna size, has been attracting growing attention as a candidate for fifth-generation (5G) and beyond-5G wireless communications networks. On the other hand, communication over the orthogonal states/modes of orbital angular momentum (OAM) is a subset of the solutions offered by massive-MIMO communications. Traditional massive-MIMO based mmWave communications have not considered the potential spectrum-efficiency-gain (SE-gain) offered by orthogonal states of OAM. However, the maximum SE-gain expected for joint OAM and massive-MIMO communications is the product of the SE-gains offered by OAM and multiplexing-MIMO. In this paper, we propose the OAM-embedded-MIMO (OEM) communication framework to obtain the multiplicative SE-gain for joint OAM and massive-MIMO based mmWave wireless communications. We design a parabolic antenna for each uniform circular array antenna to converge the OAM signals. Then, we develop the mode-decomposition and multiplexing-detection scheme to obtain the transmit signal on each OAM-mode of each transmit antenna. We also develop the OEM-water-filling power allocation policy to achieve the maximum multiplicative SE-gain for OEM communications. Extensive simulations validate our parabolic antenna based converging method, mode-decomposition and multiplexing-detection scheme, and OEM-water-filling policy, showing that our proposed OEM mmWave communications can significantly increase the spectrum efficiency compared with traditional massive-MIMO based mmWave communications.
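For readers unfamiliar with water-filling, the classic single-user version over parallel channels can be sketched as follows. This is the textbook algorithm, not the paper's OEM-water-filling policy; the function name `water_filling` and the example gains are hypothetical, and each gain is assumed to be a positive channel-gain-to-noise ratio.

```python
def water_filling(gains, total_power):
    """Classic water-filling over parallel channels.

    Allocates p_i = max(0, mu - 1/g_i) with sum(p_i) == total_power, where
    g_i > 0 is the gain-to-noise ratio of channel i and total_power > 0.
    """
    inv = sorted(1.0 / g for g in gains)  # 1/g_i, best channel first
    mu = 0.0
    # Drop the weakest channels until the water level covers all active ones.
    for k in range(len(inv), 0, -1):
        mu = (total_power + sum(inv[:k])) / k
        if mu > inv[k - 1]:
            break
    return [max(0.0, mu - 1.0 / g) for g in gains]


powers = water_filling([2.0, 1.0, 0.1], total_power=2.0)
print(powers, sum(powers))
```

Stronger channels receive more power, and channels whose inverse gain lies above the water level receive none.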
Abstract:The development of orbital angular momentum (OAM)-based radio vortex transmission presents a promising opportunity for increasing the capacity of wireless communication in correlated channels due to the inherent orthogonality among different OAM modes. One of the most popular schemes for highly efficient OAM transmission is the digital baseband combined with a uniform circular array (UCA) based transceiver. However, the periodicity of the complex-exponential feed generally limits the maximum number of orthogonal signals carried by multiple OAM modes to the number of array elements of the UCA antenna, which poses the open question of how to employ more OAM modes given a fixed number of array elements. Furthermore, signals modulated with high-order OAM modes are difficult for the receiver to capture because they diverge severely when propagating in free space, which severely limits the capacity of radio vortex communications. To overcome the above challenges, in this paper we propose quasi-fractal UCA (QF-UCA) antenna based OAM multiplexing transmission, which builds on a partly element-overlapped fractal geometry layout and makes effective use of low-order OAM modes. We develop two-dimensional OAM modulation (TOM) and demodulation (TOD) schemes in which the number of orthogonal OAM modes exceeds the number of array elements, which goes beyond the traditional concept of multiple-antenna based wireless communications. Simulation results show that our proposed scheme can achieve more orthogonal multiplexing streams than the maximum number achievable with traditional multiple-antenna systems.
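The periodicity constraint mentioned in the abstract above can be made concrete with a short check. It assumes the textbook UCA feed exp(j2πln/N) for mode l on element n of an N-element array; `uca_feed` and `max_difference` are hypothetical helper names, and the sketch does not model the QF-UCA layout itself.

```python
import cmath


def uca_feed(mode, num_elements):
    """Complex-exponential feed exp(j*2*pi*mode*n/N) across a UCA."""
    return [
        cmath.exp(2j * cmath.pi * mode * n / num_elements)
        for n in range(num_elements)
    ]


def max_difference(a, b):
    """Largest element-wise magnitude difference between two feeds."""
    return max(abs(x - y) for x, y in zip(a, b))


num_elements = 8
# Modes l and l + N produce the same feed, so they cannot be told apart.
print(max_difference(uca_feed(1, num_elements), uca_feed(1 + num_elements, num_elements)))
# Modes within one period remain distinct.
print(max_difference(uca_feed(1, num_elements), uca_feed(2, num_elements)))
```

Because the feed repeats with period N in the mode index, a conventional N-element UCA supports at most N distinguishable modes, which is the limit the abstract's scheme aims to exceed.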
Abstract:For unforeseen emergencies, such as natural disasters and pandemic events, there is a strong demand to cope with the explosive growth of mobile data traffic in extremely critical environments. An unmanned aerial vehicle (UAV) fleet is an effective way to facilitate the Emergency wireless COmmunication NETwork (EcoNet). In this article, a MUlti-tier Heterogeneous UAV Network (MuHun), which consists of different UAV fleets at different altitudes, is proposed to flexibly serve various emergencies. We revisit the key performance indicators of full coverage, network capacity, low latency, and energy efficiency in harsh environments. Then, we present the special challenges of UAV-based EcoNet, namely the shadowing-dominated complex channel model, short endurance due to limited energy supply, the coexistence of various communication mechanisms, and communication islands for underground users, followed by the MuHun-based EcoNet architecture and its advantages. Furthermore, some potential solutions, such as new hybrid-channel adapted resource allocation, reconfigurable intelligent surface assisted UAV communications, competitive heterogeneous networks, and magnetic induction based air-to-ground/underground communications, are discussed to effectively achieve full coverage, high capacity, high energy efficiency, and diverse qualities of service for EcoNets in harsh environments.
Abstract:In this paper, we propose a virtual full-duplex (VFD) technique with zero-interval modulation and sampling (ZIMS), where two half-duplex (HD) transceivers can simultaneously transmit signals and each transceiver can effectively receive the desired information. In ZIMS-VFD, each transceiver inserts a zero-interval for each symbol in its transmit signal, which provides self-interference (SI)-free intervals for itself. Meanwhile, it samples the receive signal in these SI-free intervals and restores the desired symbols. Based on orthogonal frequency division multiplexing (OFDM), we formulate the system model and show the transmit signal structure. Then, we give the transceiver design for single-input single-output (SISO) ZIMS-VFD and extend it to multiple-input multiple-output (MIMO) communications. Numerical results verify our theoretical analyses and show that ZIMS-VFD can effectively increase the capacity and approach that of FD without SI.
Abstract:For unforeseen natural disasters, such as earthquakes, hurricanes, and floods, the traditional communication infrastructure is unavailable or seriously disrupted, along with persistent secondary disasters. Under such circumstances, there is a strong demand to deploy emergency wireless communication (EWC) networks to restore connectivity in accident/incident areas. Emerging fifth-generation (5G)/beyond-5G (B5G) wireless communication systems, such as unmanned aerial vehicle (UAV) assisted networks and intelligent reflecting surface (IRS) based communication systems, are expected to be designed or re-farmed to support temporary high-quality communications in post-disaster areas. However, the channel characteristics of post-disaster areas change quickly because secondary disasters alter the topography, imposing new and critical challenges for EWC networks. In this paper, we propose a novel heterogeneous Fisher-Snedecor $\mathcal{F}$ composite fading channel model for EWC networks, which accurately models and characterizes the composite fading channel with reflectors, path-loss exponent, fading, and shadowing parameters in 5G-UAV based EWC networks. Based on the model, we develop an optimal power allocation scheme with a simple closed-form expression and a numerically obtained optimal joint bandwidth-power allocation scheme. We derive the corresponding capacities and compare the energy efficiency of IRS based and traditional relay based 5G-UAVs. Numerical results show that resource allocation schemes adapted to the new heterogeneous Fisher-Snedecor $\mathcal{F}$ composite fading channel can achieve higher capacity and energy efficiency than resource allocation schemes adapted to traditional channel models, thus providing better communications service for post-disaster areas.
Abstract:The emerging orbital angular momentum (OAM) based wireless communication is expected to be a high spectrum-efficiency communication paradigm that addresses the problem of growing data-rate demand under limited bandwidth. Academic research has mainly concentrated on OAM-based line-of-sight (LoS) communications. However, in most practical wireless communication scenarios there are objects around the transceiver, which give rise to multipath transmission. In this paper, a hybrid orthogonal division multiplexing (HODM) scheme that uses OAM multiplexing in conjunction with orthogonal frequency division multiplexing (OFDM) is proposed to achieve high-capacity wireless communications in sparse multipath environments, where the scatterers are sparse. We first build the OAM-based wireless channel for sparse multipath environments that combine a LoS path with several reflection paths. We consider only paths with at most three reflections because of the severe energy attenuation. The phase difference among the channel amplitude gains of the LoS and reflection paths, which is caused by the reflection paths, makes it difficult to decompose the OAM signals. We propose phase difference compensation to handle this problem and then calculate the corresponding capacity in radio vortex wireless communications. Numerical results illustrate that the capacity of wireless communications using our proposed HODM scheme can be drastically increased in sparse multipath environments.