In this paper, we investigate the problem of decentralized federated learning (DFL) in Internet of things (IoT) systems, where a number of IoT clients train models collectively for a common task without sharing their private training data in the absence of a central server. Most of the existing DFL schemes are composed of two alternating steps, i.e., gradient update and model averaging. However, averaging of model parameters directly to fuse different models at the local clients suffers from client-drift in the local updates especially when the training data are heterogeneous across different clients. This leads to slow convergence and degraded learning performance. As a possible solution, we propose the decentralized federated learning via mutual knowledge transfer (Def-KT) algorithm where local clients fuse models by transferring their learnt knowledge to each other. Our experiments on the MNIST, Fashion-MNIST, and CIFAR10 datasets reveal that the proposed Def-KT algorithm significantly outperforms the baseline DFL methods with model averaging, i.e., Combo and FullAvg, especially when the training data are not independent and identically distributed (non-IID) across different clients.