Abstract:Unmanned Aerial Vehicle (UAV) swarms offer versatile applications in logistics, agriculture, and surveillance, yet controlling them requires expert knowledge for safety and feasibility. Traditional static methods limit adaptability, while Large Language Models (LLMs) enable natural language control but generate unsafe trajectories due to lacking physical grounding. This paper introduces SkySim, a ROS2-based simulation framework in Gazebo that decouples LLM high-level planning from low-level safety enforcement. Using Gemini 3.5 Pro, SkySim translates user commands (e.g., "Form a circle") into spatial waypoints, informed by real-time drone states. An Artificial Potential Field (APF) safety filter applies minimal adjustments for collision avoidance, kinematic limits, and geo-fencing, ensuring feasible execution at 20 Hz. Experiments with swarms of 3, 10, and 30 Crazyflie drones validate spatial reasoning accuracy (100% across tested geometric primitives), real-time collision prevention, and scalability. SkySim empowers non-experts to iteratively refine behaviors, bridging AI cognition with robotic safety for dynamic environments. Future work targets hardware integration.