In this paper, we consider unmanned aerial vehicles (UAVs) equipped with a visible light communication (VLC) access point and coordinated multipoint (CoMP) capability, which allows users to connect to more than one UAV. The UAVs move in three-dimensional (3D) space with constant acceleration within each master timescale, and a central server is responsible for synchronization and cooperation among them. We define the data rate for each user type, CoMP and non-CoMP. Assuming a constant speed for UAV motion is not practical, so the effect of acceleration on UAV movement must be considered. Unlike most existing works, we account for the effect of variable speed on the kinematic and resource allocation formulas. For the proposed system model, we define two timescales over which resources are allocated: in each master timescale, the acceleration of each UAV is specified, and in each short timescale, radio resources are allocated. The initial velocity in each short timescale is obtained from the velocity at the end of the previous one. Our goal is to formulate a multiobjective optimization problem in which the total data rate is maximized and the total communication power consumption is minimized simultaneously. To deal with this multiobjective problem, we first apply the weighted method and then apply multi-agent deep deterministic policy gradient (MADDPG), a multi-agent method based on deep deterministic policy gradient (DDPG) that ensures more stable and faster convergence. We improve this solution method by adding two critic networks and by allocating the acceleration in two steps. Simulation results indicate that constant-acceleration motion of UAVs performs about 8\% better than conventional motion systems. Furthermore, CoMP enables the system to achieve, on average, about 12\% higher rates compared with the non-CoMP system.
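As a rough illustration of the two-timescale motion model and the weighted method described above, the relations below sketch how the velocity could carry over between short timescales and how the two objectives could be combined. All notation is assumed here for illustration and is not taken from the paper: $\mathbf{q}_u[n]$ and $\mathbf{v}_u[n]$ are the 3D position and velocity of UAV $u$ at the start of short timescale $n$, $\mathbf{a}_u[m]$ is its acceleration, held constant over master timescale $m$, $\delta$ is the duration of a short timescale, $R_{\mathrm{tot}}$ and $P_{\mathrm{tot}}$ are the total data rate and total communication power, and $\omega$ is a weight. The weighted method is assumed to be a weighted sum.
\begin{align}
\mathbf{v}_u[n+1] &= \mathbf{v}_u[n] + \mathbf{a}_u[m]\,\delta, \\
\mathbf{q}_u[n+1] &= \mathbf{q}_u[n] + \mathbf{v}_u[n]\,\delta + \tfrac{1}{2}\,\mathbf{a}_u[m]\,\delta^{2}, \\
\max \;\; & \omega\, R_{\mathrm{tot}} - (1-\omega)\, P_{\mathrm{tot}}, \qquad \omega \in [0,1].
\end{align}
Under these assumptions, setting $\mathbf{a}_u[m] = \mathbf{0}$ recovers constant-speed motion. In practice the two terms of the weighted objective would likely need to be normalized to comparable scales before they are combined.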