Bastian Jäckl

Toward a Decision Support System for Energy-Efficient Ferry Operation on Lake Constance based on Optimal Control

Dec 12, 2025

Hannes Homburger, Bastian Jäckl, Stefan Wirtensohn, Christian Stopp, Maximilian T. Fischer, Moritz Diehl, Daniel A. Keim, Johannes Reuter

Abstract: The maritime sector is undergoing a disruptive technological change driven by three main factors: autonomy, decarbonization, and digital transformation. Addressing these factors necessitates a reassessment of inland vessel operations. This paper presents the design and development of a decision support system for ferry operations based on a shrinking-horizon optimal control framework. The problem formulation incorporates a mathematical model of the ferry's dynamics and environmental disturbances, specifically water currents and wind, which can significantly influence the dynamics. Real-world data and illustrative scenarios demonstrate the potential of the proposed system to effectively support ferry crews by providing real-time guidance. This enables enhanced operational efficiency while maintaining predefined maneuver durations. The findings suggest that optimal control applications hold substantial promise for advancing future ferry operations on inland waters. A video of the real-world ferry MS Insel Mainau operating on Lake Constance is available at: https://youtu.be/i1MjCdbEQyE

* 6 pages, 8 figures
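
The shrinking-horizon idea mentioned in the abstract can be illustrated with a deliberately tiny toy problem. The sketch below is not the paper's ferry model: it assumes a one-dimensional single integrator pushed by a constant modeled current, and it uses the closed-form minimizer of the summed squared input instead of a numerical solver. The names `shrinking_horizon_run`, `modeled_current`, and `true_current` are hypothetical. What it shows is the defining feature: at every step the plan is recomputed over the remaining steps only, so the arrival time stays fixed.

```python
def shrinking_horizon_run(x0, target, n_steps, dt, modeled_current, true_current):
    """Toy shrinking-horizon controller for x[k+1] = x[k] + (u[k] + current) * dt.

    At step k the plan covers only the remaining n_steps - k steps, so the
    arrival time is fixed. For this toy model the input sequence minimizing
    sum(u^2) while reaching `target` is constant over the remaining horizon,
    which gives the closed form used below (an assumption of this sketch).
    """
    x = x0
    inputs = []
    for k in range(n_steps):
        remaining = n_steps - k  # the horizon shrinks by one step each iteration
        # Constant input that reaches the target in `remaining` steps
        # under the modeled current.
        u = (target - x) / (remaining * dt) - modeled_current
        inputs.append(u)
        # The plant evolves with the true current, which may differ from the model.
        x = x + (u + true_current[k]) * dt
    return x, inputs


if __name__ == "__main__":
    final_x, us = shrinking_horizon_run(
        x0=0.0, target=100.0, n_steps=10, dt=1.0,
        modeled_current=2.0, true_current=[2.0] * 10,
    )
    print(final_x, us)
```

When the true current deviates from the modeled one, each re-plan absorbs the accumulated error of the previous steps; only the mismatch in the final step remains uncorrected.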

Evaluating Keyframe Layouts for Visual Known-Item Search in Homogeneous Collections

Oct 05, 2025

Bastian Jäckl, Jiří Kruchina, Lucas Joos, Daniel A. Keim, Ladislav Peška, Jakub Lokoč

Abstract: Multimodal deep-learning models power interactive video retrieval by ranking keyframes in response to textual queries. Despite these advances, users must still browse ranked candidates manually to locate a target. Keyframe arrangement within the search grid strongly affects browsing effectiveness and user efficiency, yet remains underexplored. We report a study with 49 participants evaluating seven keyframe layouts for the Visual Known-Item Search task. Beyond efficiency and accuracy, we relate browsing phenomena, such as overlooks, to layout characteristics. Our results show that a video-grouped layout is the most efficient, while a four-column, rank-preserving grid achieves the highest accuracy. Sorted grids reveal potentials and trade-offs, enabling rapid scanning of uninteresting regions but down-ranking relevant targets to less prominent positions, delaying first arrival times and increasing overlooks. These findings motivate hybrid designs that preserve positions of top-ranked items while sorting or grouping the remainder, and offer guidance for searching in grids beyond video retrieval.

* 28 pages, 17 figures

Experimental Evaluation of Static Image Sub-Region-Based Search Models Using CLIP

Jun 07, 2025

Bastian Jäckl, Vojtěch Kloda, Daniel A. Keim, Jakub Lokoč

Abstract: Advances in multimodal text-image models have enabled effective text-based querying in extensive image collections. While these models show convincing performance for everyday life scenes, querying in highly homogeneous, specialized domains remains challenging. The primary problem is that users can often provide only vague textual descriptions as they lack expert knowledge to discriminate between homogeneous entities. This work investigates whether adding location-based prompts to complement these vague text queries can enhance retrieval performance. Specifically, we collected a dataset of 741 human annotations, each containing short and long textual descriptions and bounding boxes indicating regions of interest in challenging underwater scenes. Using these annotations, we evaluate the performance of CLIP when queried on various static sub-regions of images compared to the full image. Our results show that both a simple 3-by-3 partitioning and a 5-grid overlap significantly improve retrieval effectiveness and remain robust to perturbations of the annotation box.

* 14 pages, 4 figures, 2 tables
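
To make the 3-by-3 partitioning from the abstract concrete, here is a minimal sketch of how an image could be split into static grid cells and how an annotation box could be mapped to the cells it touches. The abstract does not specify how sub-regions are matched to annotations, so the overlap rule, the function names `grid_cells` and `cells_overlapping`, and the `(left, top, right, bottom)` box convention are assumptions of this example. The 5-grid overlap variant is not sketched because its layout is not described here.

```python
def grid_cells(width, height, rows=3, cols=3):
    """Return (left, top, right, bottom) pixel boxes of a rows x cols grid.

    Cells are listed row by row; integer division keeps the boxes
    contiguous and covers the full image even when sizes do not divide evenly.
    """
    cells = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            top = r * height // rows
            bottom = (r + 1) * height // rows
            cells.append((left, top, right, bottom))
    return cells


def cells_overlapping(box, cells):
    """Indices of the cells that intersect `box` with positive area."""
    b_left, b_top, b_right, b_bottom = box
    hits = []
    for i, (left, top, right, bottom) in enumerate(cells):
        overlaps_x = min(b_right, right) > max(b_left, left)
        overlaps_y = min(b_bottom, bottom) > max(b_top, top)
        if overlaps_x and overlaps_y:
            hits.append(i)
    return hits


if __name__ == "__main__":
    cells = grid_cells(300, 300)
    print(cells_overlapping((50, 50, 150, 90), cells))
```

Each returned cell could then be cropped and embedded separately, so that a text query is compared against sub-region embeddings rather than only the full-image embedding.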

Leveraging Color Channel Independence for Improved Unsupervised Object Detection

Dec 19, 2024

Bastian Jäckl, Yannick Metz, Udo Schlegel, Daniel A. Keim, Maximilian T. Fischer

Abstract: Object-centric architectures can learn to extract distinct object representations from visual scenes, enabling downstream applications on the object level. Similarly to autoencoder-based image models, object-centric approaches have been trained on the unsupervised reconstruction loss of images encoded by RGB color spaces. In our work, we challenge the common assumption that RGB images are the optimal color space for unsupervised learning in computer vision. We argue conceptually and show empirically that other color spaces, such as HSV, bear essential characteristics for object-centric representation learning, like robustness to lighting conditions. We further show that models improve when requiring them to predict additional color channels. Specifically, we propose to transform the predicted targets to the RGB-S space, which extends RGB with HSV's saturation component and leads to markedly better reconstruction and disentanglement for five common evaluation datasets. The use of composite color spaces can be implemented with essentially no computational overhead, is agnostic to the models' architecture, and is universally applicable across a wide range of visual computing tasks and training types. The findings of our approach encourage additional investigations in computer vision tasks beyond object-centric learning.

* 38 pages incl. references, 16 figures
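
The RGB-S space described in the abstract extends RGB with the saturation channel of HSV. The sketch below shows one way such a target could be built, using the standard HSV definition of saturation, (max - min) / max, with 0 for black pixels. It is an illustration under assumptions, not the authors' implementation: the function names `rgb_to_rgbs` and `image_to_rgbs`, the nested-list image layout, and the [0, 1] channel range are all choices made for this example.

```python
def rgb_to_rgbs(pixel):
    """Append HSV saturation to one RGB pixel with channels in [0, 1]."""
    r, g, b = pixel
    c_max = max(r, g, b)
    c_min = min(r, g, b)
    # Standard HSV saturation; defined as 0 for black to avoid division by zero.
    saturation = 0.0 if c_max == 0 else (c_max - c_min) / c_max
    return (r, g, b, saturation)


def image_to_rgbs(image):
    """Convert a nested list of RGB pixels (rows of pixels) to RGB-S pixels."""
    return [[rgb_to_rgbs(pixel) for pixel in row] for row in image]


if __name__ == "__main__":
    tiny_image = [[(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)],
                  [(0.0, 0.0, 0.0), (0.5, 0.25, 0.25)]]
    print(image_to_rgbs(tiny_image))
```

Because the extra channel is a pointwise function of the existing three, it only changes the reconstruction target, which is consistent with the abstract's claim of negligible overhead.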
