This paper leverages emerging learning techniques to devise a multi-agent online source seeking algorithm for an unknown environment. Two features of our problem setup are of particular significance: i) the underlying environment is not only unknown but also dynamically changing and perturbed by two types of non-stochastic disturbances; and ii) a group of agents is deployed and expected to cooperatively seek as many sources as possible. Correspondingly, a new discounted Kalman filter technique is developed to cope with the non-stochastic disturbances, and a notion of confidence bound in polytope form is used to enable computationally efficient cooperation among multiple agents. Under standard assumptions on the unknown environment and the disturbances, our algorithm is shown to achieve sub-linear regret under both types of non-stochastic disturbances; both results are comparable to the state of the art. Numerical examples on a real-world pollution monitoring application are provided to demonstrate the effectiveness of our algorithm.
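To make the idea of discounting concrete, the following is a minimal sketch of a scalar Kalman filter with a forgetting factor. It is an illustration under stated assumptions, not the filter developed in the paper: the random-walk state model, the direct noisy observation, the class name `DiscountedKalman1D`, and the choice to discount by inflating the prior variance by `1 / gamma` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiscountedKalman1D:
    """Scalar Kalman filter with a forgetting (discount) factor.

    Assumed model (hypothetical, for illustration only):
        x_t = x_{t-1} + w_t,   y_t = x_t + v_t
    Past information is discounted by inflating the prior variance
    by 1 / gamma before each measurement update.
    """

    mean: float = 0.0          # current state estimate
    var: float = 1.0           # current estimate variance
    process_var: float = 0.0   # variance of w_t
    meas_var: float = 1.0      # variance of v_t
    gamma: float = 0.9         # discount factor in (0, 1]; 1 = no discounting

    def update(self, y: float) -> float:
        """Fold in one measurement y and return the new estimate."""
        if not 0.0 < self.gamma <= 1.0:
            raise ValueError("gamma must lie in (0, 1]")
        # Predict: discounting widens the prior, so old data counts less.
        prior_var = self.var / self.gamma + self.process_var
        # Correct: standard scalar Kalman gain and posterior.
        gain = prior_var / (prior_var + self.meas_var)
        self.mean += gain * (y - self.mean)
        self.var = (1.0 - gain) * prior_var
        return self.mean
```

With `gamma = 1.0` and zero process noise this reduces to the ordinary recursive average; with `gamma < 1` the gain stays bounded away from zero, so the estimate keeps reacting to changes in the measured field instead of freezing on old data.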
This paper presents an algorithmic framework, termed 'DoSS', for distributed on-line source seeking with a multi-robot system in an unknown dynamical environment. Building on a novel concept called dummy confidence upper bound (D-UCB), our algorithm integrates estimation of the unknown environment with task planning for the multiple robots, and as a result drives the team of robots to a steady state in which multiple sources of interest are located. Compared with the standard UCB algorithm for multi-armed bandits, D-UCB significantly reduces the computational complexity of solving the subproblems of multi-robot task planning, which in turn makes 'DoSS' implementable in a distributed on-line manner. The performance of the algorithm is guaranteed theoretically by a sub-linear upper bound on the cumulative regret. Numerical results on a real-world methane emission seeking problem are also provided to demonstrate the effectiveness of the proposed algorithm.
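For readers unfamiliar with the term, the standard meaning of a sub-linear cumulative regret can be written as follows. The notation is an assumption made here for illustration ($f_t$ for the environment at time $t$, $x_t$ for the positions chosen by the team, $x_t^{\star}$ for the best positions in hindsight) and is not necessarily the definition used in the paper.

```latex
% Assumed notation: f_t is the unknown field at time t, x_t the chosen
% positions, x_t^\star an optimal choice at time t.
\[
  R_T \;=\; \sum_{t=1}^{T} \bigl( f_t(x_t^{\star}) - f_t(x_t) \bigr),
  \qquad
  \text{sub-linear regret:}\quad \lim_{T \to \infty} \frac{R_T}{T} = 0 .
\]
```

In words, the average per-step loss relative to the best choice vanishes as the horizon $T$ grows.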