Qifeng Cai

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts

May 20, 2025

Qifeng Cai, Hao Liang, Hejun Dong, Meiyi Qiang, Ruichuan An, Zhaoyang Han, Zhengzhou Zhu, Bin Cui, Wentao Zhang

Abstract: Long videos contain a vast amount of information, making video-text retrieval an essential and challenging task in multimodal learning. However, existing benchmarks suffer from limited video duration, low-quality captions, and coarse annotation granularity, which hinder the evaluation of advanced video-text retrieval methods. To address these limitations, we introduce LoVR, a benchmark specifically designed for long video-text retrieval. LoVR contains 467 long videos and over 40,804 fine-grained clips with high-quality captions. To overcome the issue of poor machine-generated annotations, we propose an efficient caption generation framework that integrates VLM automatic generation, caption quality scoring, and dynamic refinement. This pipeline improves annotation accuracy while maintaining scalability. Furthermore, we introduce a semantic fusion method to generate coherent full-video captions without losing important contextual information. Our benchmark introduces longer videos, more detailed captions, and a larger-scale dataset, presenting new challenges for video understanding and retrieval. Extensive experiments on various advanced embedding models demonstrate that LoVR is a challenging benchmark, revealing the limitations of current approaches and providing valuable insights for future research. We release the code and dataset link at https://github.com/TechNomad-ds/LoVR-benchmark
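The generate, score, refine loop described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function `caption_with_refinement`, its parameters (`threshold`, `max_rounds`), and the toy stand-ins for the VLM, the scorer, and the refiner are all hypothetical names introduced here, and the abstract does not say how scores are computed or when refinement stops.

```python
from typing import Callable, Tuple


def caption_with_refinement(
    clip_id: str,
    generate: Callable[[str], str],            # hypothetical VLM captioner
    score: Callable[[str, str], float],        # hypothetical quality scorer in [0, 1]
    refine: Callable[[str, str, float], str],  # hypothetical caption refiner
    threshold: float = 0.8,                    # assumed acceptance score
    max_rounds: int = 3,                       # assumed refinement budget
) -> Tuple[str, float]:
    """Generate a caption, then refine it until it scores at least `threshold`."""
    caption = generate(clip_id)
    quality = score(clip_id, caption)
    rounds = 0
    while quality < threshold and rounds < max_rounds:
        candidate = refine(clip_id, caption, quality)
        candidate_quality = score(clip_id, candidate)
        # Keep a refinement only if it actually improves the score.
        if candidate_quality > quality:
            caption, quality = candidate, candidate_quality
        rounds += 1
    return caption, quality


# Toy stand-ins so the sketch runs without any model.
_DETAILS = ["walking", "a", "dog"]


def toy_generate(clip_id: str) -> str:
    return "a person"


def toy_score(clip_id: str, caption: str) -> float:
    # Pretend that longer captions are better, saturating at five words.
    return min(1.0, len(caption.split()) / 5)


def toy_refine(clip_id: str, caption: str, quality: float) -> str:
    # Append the next missing detail, chosen by the current caption length.
    index = min(len(caption.split()) - 2, len(_DETAILS) - 1)
    return caption + " " + _DETAILS[index]


if __name__ == "__main__":
    print(caption_with_refinement(
        "clip-0001", toy_generate, toy_score, toy_refine, threshold=1.0))
```

Passing the three stages in as callables keeps the loop independent of any particular VLM or scoring model, and a fixed round budget is one simple way to bound the cost per clip.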


CapsuleBot: A Novel Compact Hybrid Aerial-Ground Robot with Two Actuated-wheel-rotors

Sep 17, 2023

Zhi Zheng, Qifeng Cai, Xinhang Xu, Muqing Cao, Huan Yu, Jihao Li, Guodong Lu, Jin Wang

Abstract: This paper presents the design, modeling, and experimental validation of CapsuleBot, a compact hybrid aerial-ground vehicle designed for long-term covert reconnaissance. CapsuleBot combines the manoeuvrability of a bicopter in the air with the energy efficiency and low noise of a ground vehicle on the ground. To accomplish this, we designed a structure named the actuated-wheel-rotor, which uses a single motor for both unilateral rotor tilting in the bicopter configuration and wheel movement in ground mode. CapsuleBot is equipped with two of these structures, enabling hybrid aerial-ground propulsion with just four motors. Importantly, the motion modes are decoupled without the need for additional drivers, enhancing the versatility and robustness of the system. Furthermore, we have developed the full dynamics and control for aerial and ground locomotion based on the bicopter model and the two-wheeled self-balancing vehicle model. The performance of CapsuleBot has been validated through experiments. The results demonstrate that CapsuleBot produces 40.53% less noise in ground mode and consumes 99.35% less energy, highlighting its potential for long-term covert reconnaissance applications.

* 7 pages, 10 figures, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA). This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible


Roller-Quadrotor: A Novel Hybrid Terrestrial/Aerial Quadrotor with Unicycle-Driven and Rotor-Assisted Turning

Mar 02, 2023

Zhi Zheng, Jin Wang, Yuze Wu, Qifeng Cai, Huan Yu, Ruibin Zhang, Jie Tu, Jun Meng, Guodong Lu, Fei Gao


Abstract: Roller-Quadrotor is a novel hybrid terrestrial and aerial quadrotor that combines the high maneuverability of a quadrotor with the long endurance of a ground vehicle. This work presents the design, modeling, and experimental validation of Roller-Quadrotor. Flying is achieved through a quadrotor configuration, with four actuators providing thrust. Rolling is supported by a unicycle-driven and rotor-assisted turning structure. During terrestrial locomotion, the vehicle needs to overcome only rolling and turning resistance, thus saving energy compared to flight mode. This work addresses challenging problems of general rotorcraft: it reduces energy consumption and allows the vehicle to pass through special terrain, such as narrow gaps. By flying, it also solves the obstacle avoidance challenge faced by terrestrial robots. We design the models and controllers for the vehicle. The experimental results show that it can switch between aerial and terrestrial locomotion and can safely pass through a narrow gap half the size of its diameter. In addition, it can roll approximately 3.8 times as far as it can fly, or operate about 42.2 times as long as in flight. These results demonstrate the feasibility and effectiveness of the structure and control in rolling through special terrain and saving energy.

* 8 pages, 10 figures, submitted to 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
