Abstract:This paper develops a deep reinforcement learning based observer control policy for autonomous bearings-only tracking of a moving target. The observer manoeuvre problem is formulated as a belief Markov decision process, where the belief state is represented by the posterior of a cubature Kalman filter (CKF). The reward function is designed to address two conflicting objectives: minimising the absolute target position estimation error (Euclidean distance) and maintaining CKF estimation consistency (Mahalanobis distance). The reward is formulated as a geometric interpolation between the two objectives on the Pareto front, parametrised by a weighting factor $β\in [0,1]$. The policy is implemented as a deep Q-network (DQN) trained over 50,000 episodes. Performance is evaluated over 5,000 Monte Carlo episodes and compared against two baselines: the perpendicular-to-bearing heuristic and the D-optimal Fisher information maximisation criterion. The results show that the DQN policy at $β= 0.7$ achieves the best trade-off between accuracy and robustness: it matches the information-theoretic baseline on mean tracking accuracy while reducing the worst-case error by nearly a factor of ten, owing to the implicit filter-consistency regularisation provided by the Mahalanobis term in the reward.
Abstract:Statistical dependencies between information sources are rarely known, yet in practical distributed tracking schemes, they must be taken into account in order to prevent track divergences. Chernoff fusion is well-known and universally accepted method that can address the problem of track fusion when the statistical dependence between the fusing sources is unknown. In this paper we derive the exact Chernoff fusion equations for Bernoulli Gaussian max filters. These filters have been recently derived in the framework of possibility theory, as the analog of the Bernoulli Gaussian sum filters. The main motivation for the possibilistic approach is that it effectively deals with imprecise mathematical models (e.g. dynamics, measurements) used in tracking algorithms. The paper also demonstrates the proposed possibilistic fusion scheme in the absence of knowledge about statistical dependence.




Abstract:Contemporary undertakings provide limitless opportunities for widespread application of machine reasoning and artificial intelligence in situations characterised by uncertainty, hostility and sheer volume of data. The paper develops a valuation network as a graphical system for higher-level fusion and reasoning under uncertainty in support of the human operators. Valuations, which are mathematical representation of (uncertain) knowledge and collected data, are expressed as credal sets, defined as coherent interval probabilities in the framework of imprecise probability theory. The basic operations with such credal sets, combination and marginalisation, are defined to satisfy the axioms of a valuation algebra. A practical implementation of the credal valuation network is discussed and its utility demonstrated on a small scale example.




Abstract:The paper presents an approach to olfactory search for a diffusive emitting source of tracer (e.g. aerosol, gas) in an environment with unknown map of randomly placed and shaped obstacles. The measurements of tracer concentration are sporadic, noisy and without directional information. The search domain is discretised and modelled by a finite two-dimensional lattice. The links is the lattice represent the traversable paths for emitted particles and for the searcher. A missing link in the lattice indicates a blocked paths, due to the walls or obstacles. The searcher must simultaneously estimate the source parameters, the map of the search domain and its own location within the map. The solution is formulated in the sequential Bayesian framework and implemented as a Rao-Blackwellised particle filter with information-driven motion control. The numerical results demonstrate the concept and its performance.




Abstract:The paper evaluates the error performance of three random finite set based multi-object trackers in the context of pedestrian video tracking. The evaluation is carried out using a publicly available video dataset of 4500 frames (town centre street) for which the ground truth is available. The input to all pedestrian tracking algorithms is an identical set of head and body detections, obtained using the Histogram of Oriented Gradients (HOG) detector. The tracking error is measured using the recently proposed OSPA metric for tracks, adopted as the only known mathematically rigorous metric for measuring the distance between two sets of tracks. A comparative analysis is presented under various conditions.