As the successor of H.265/HEVC, the new versatile video coding standard (H.266/VVC) can provide up to 50% bitrate saving with the same subjective quality, at the cost of increased decoding complexity. To accelerate the application of the new coding standard, a real-time H.266/VVC software decoder that can support various platforms is implemented, where SIMD technologies, parallelism optimization, and the acceleration strategies based on the characteristics of each coding tool are applied. As the mobile devices have become an essential carrier for video services nowadays, the mentioned optimization efforts are not only implemented for the x86 platform, but more importantly utilized to highly optimize the decoding performance on the ARM platform in this work. The experimental results show that when running on the Apple A14 SoC (iPhone 12pro), the average single-thread decoding speed of the present implementation can achieve 53fps (RA and LB) for full HD (1080p) bitstreams generated by VTM-11.0 reference software using 8bit Common Test Conditions (CTC). When multi-threading is enabled, an average of 32 fps (RA) can be achieved when decoding the 4K bitstreams.