Laboratory of Ocean acoustics and Remote Sensing, Institute of Oceanography, Ministry of Natural Resources, Xiamen, Fujian, China
Abstract:Text-driven video editing aims to modify video content according to natural language instructions. While recent training-free approaches have made progress by leveraging pre-trained diffusion models, they typically rely on inversion-based techniques that map input videos into the latent space, which often leads to temporal inconsistencies and degraded structural fidelity. To address this, we propose FlowDirector, a novel inversion-free video editing framework. Our framework models the editing process as a direct evolution in data space, guiding the video via an Ordinary Differential Equation (ODE) to smoothly transition along its inherent spatiotemporal manifold, thereby preserving temporal coherence and structural details. To achieve localized and controllable edits, we introduce an attention-guided masking mechanism that modulates the ODE velocity field, preserving non-target regions both spatially and temporally. Furthermore, to address incomplete edits and enhance semantic alignment with editing instructions, we present a guidance-enhanced editing strategy inspired by Classifier-Free Guidance, which leverages differential signals between multiple candidate flows to steer the editing trajectory toward stronger semantic alignment without compromising structural consistency. Extensive experiments across benchmarks demonstrate that FlowDirector achieves state-of-the-art performance in instruction adherence, temporal consistency, and background preservation, establishing a new paradigm for efficient and coherent video editing without inversion.
Abstract:The dual-channel sound speed profiles of the Chukchi Plateau and the Canadian Basin have become current research hotspots due to their excellent low-frequency sound signal propagation ability. Previous research has mainly focused on using sound propagation theory to explain the changes in sound signal energy. This article is mainly based on the theory of normal modes to study the fine structure of low-frequency wide-band sound propagation dispersion under dual-channel sound speed profiles. In this paper, the problem of the intersection of normal mode dispersion curves caused by the dual-channel sound speed profile (SSP) has been explained, the blocking effect of seabed terrain changes on dispersion structures has been analyzed, and the normal modes has been separated by using modified warping operator. The above research results have been verified through a long-range seismic exploration experiment at the Chukchi Plateau. At the same time, based on the acoustic signal characteristics in this environment, two methods for estimating the distance of sound sources have been proposed, and the experiment data at sea has also verified these two methods.