Mingjun Liu

ASK: Adaptive Self-improving Knowledge Framework for Audio Text Retrieval

Dec 11, 2025

Siyuan Fu, Xuchen Guo, Mingjun Liu, Hongxiang Li, Boyin Tan, Gongxi Zhu, Xianwei Zhuang, Jinghan Ru, Yuxin Xie, Yuguo Yin

Abstract: The dominant paradigm for Audio-Text Retrieval (ATR) relies on mini-batch-based contrastive learning. This process, however, is inherently limited by what we formalize as the Gradient Locality Bottleneck (GLB), which structurally prevents models from leveraging out-of-batch knowledge and thus impairs fine-grained and long-tail learning. While external knowledge-enhanced methods can alleviate the GLB, we identify a critical, unaddressed side effect: the Representation-Drift Mismatch (RDM), where a static knowledge base becomes progressively misaligned with the evolving model, turning guidance into noise. To address this dual challenge, we propose the Adaptive Self-improving Knowledge (ASK) framework, a model-agnostic, plug-and-play solution. ASK breaks the GLB via multi-grained knowledge injection, systematically mitigates RDM through dynamic knowledge refinement, and introduces a novel adaptive reliability weighting scheme to ensure that consistent knowledge contributes to optimization. Experiments on two benchmark datasets show state-of-the-art performance, supporting the efficacy of the proposed ASK framework.
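The in-batch limitation mentioned in the abstract can be illustrated with a minimal sketch of a symmetric contrastive (InfoNCE-style) loss. This is not the authors' code: the function names, the temperature value, and the pure-Python list representation are all assumptions made for illustration. The point it shows is only that every negative in the loss comes from the current mini-batch, so no gradient signal arrives from samples outside it.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def log_softmax_diag(sim, i):
    """Log-probability that row i of sim selects its own column i."""
    row = sim[i]
    m = max(row)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in row))
    return row[i] - log_z

def in_batch_contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over one mini-batch (hypothetical helper).

    audio_emb[i] and text_emb[i] are assumed to be a matched pair; every
    other item in the same batch acts as a negative. Nothing outside the
    batch enters the computation.
    """
    audio = [normalize(a) for a in audio_emb]
    text = [normalize(t) for t in text_emb]
    n = len(audio)
    # a2t[i][j]: temperature-scaled cosine similarity of audio i and text j
    a2t = [[dot(a, t) / temperature for t in text] for a in audio]
    # t2a is the transpose: text j scored against every audio clip
    t2a = [[a2t[i][j] for i in range(n)] for j in range(n)]
    loss_a2t = -sum(log_softmax_diag(a2t, i) for i in range(n)) / n
    loss_t2a = -sum(log_softmax_diag(t2a, i) for i in range(n)) / n
    return 0.5 * (loss_a2t + loss_t2a)
```

With a batch of size one the loss is zero regardless of the embeddings, which is an extreme case of the same effect: the loss can only contrast against what the batch happens to contain.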

E2E Parking Dataset: An Open Benchmark for End-to-End Autonomous Parking

Apr 15, 2025

Kejia Gao, Liguo Zhou, Mingjun Liu, Alois Knoll

Abstract: End-to-end learning has shown great potential in autonomous parking, yet the lack of publicly available datasets limits reproducibility and benchmarking. While prior work introduced a vision-based parking model and a pipeline for data generation, training, and closed-loop testing, the dataset itself was not released. To bridge this gap, we create and open-source a high-quality dataset for end-to-end autonomous parking. Using the original model, we achieve an overall success rate of 85.16% with lower average position and orientation errors (0.24 meters and 0.34 degrees).
