Police officer presence at an intersection discourages a potential traffic violator from violating the law. It also alerts the motorists' consciousness to take precaution and follow the rules. However, due to the abundant intersections and shortage of human resources, it is not possible to assign a police officer to every intersection. In this paper, we propose an intelligent and optimal policing strategy for traffic violation prevention. Our model consists of a specific number of targeted intersections and two police officers with no prior knowledge on the number of the traffic violations in the designated intersections. At each time interval, the proposed strategy, assigns the two police officers to different intersections such that at the end of the time horizon, maximum traffic violation prevention is achieved. Our proposed methodology adapts the PROLA (Play and Random Observe Learning Algorithm) algorithm  to achieve an optimal traffic violation prevention strategy. Finally, we conduct a case study to evaluate and demonstrate the performance of the proposed method.