Vision-aided wireless sensing is emerging as a cornerstone of 6G mobile computing. While data-driven approaches have advanced rapidly, establishing a precise geometric correspondence between ego-centric visual data and radio propagation remains challenging. Existing paradigms typically either associate 2D topology maps and auxiliary information with radio maps, or provide 3D perspective views limited by sparse radio data. These representations flatten the complex vertical interactions, such as occlusion and diffraction, that govern signal behavior in urban environments, rendering the task of cross-view signal inference mathematically ill-posed. To resolve this geometric ambiguity, we introduce SynthRM, a scalable synthetic data platform. SynthRM implements a Visible-Aligned-Surface simulation strategy: rather than probing a global volumetric grid, it performs ray tracing directly on the geometry exposed to the sensor. This approach ensures pixel-level consistency between visual semantics and electromagnetic response, transforming the learning objective into a physically well-posed problem. We demonstrate the platform's capabilities by presenting a diverse, city-scale dataset derived from procedurally generated environments. By combining efficient procedural synthesis with high-fidelity electromagnetic modeling, SynthRM provides a transparent, accessible foundation for developing next-generation mobile systems for environment-aware sensing and communication.
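To make the Visible-Aligned-Surface idea concrete, the sketch below illustrates the core sampling pattern under simplified assumptions: buildings modeled as axis-aligned boxes, a pinhole camera, and free-space path loss standing in for full electromagnetic ray tracing. The function names (`ray_aabb`, `visible_surface_radio_map`) and all parameters are hypothetical and do not reflect SynthRM's actual interface; the point is only that radio quantities are attached to the surface points each pixel actually sees, rather than to a global volumetric grid, which is what yields the pixel-level alignment described above.

```python
"""Conceptual sketch of visible-aligned-surface sampling (illustrative only).

Assumptions (hypothetical): buildings are axis-aligned boxes, the camera is a
pinhole model, and the radio response at each visible surface point is
approximated by free-space path loss from a single transmitter.
"""
import numpy as np

def ray_aabb(origin, dirs, box_min, box_max):
    """Slab-test intersection of many rays with one axis-aligned box.
    Returns the hit distance per ray (np.inf where the ray misses)."""
    inv = 1.0 / np.where(np.abs(dirs) < 1e-12, 1e-12, dirs)
    t0 = (box_min - origin) * inv
    t1 = (box_max - origin) * inv
    t_near = np.minimum(t0, t1).max(axis=-1)
    t_far = np.maximum(t0, t1).min(axis=-1)
    hit = t_far >= np.maximum(t_near, 0.0)
    return np.where(hit, np.maximum(t_near, 0.0), np.inf)

def visible_surface_radio_map(cam_pos, pixel_dirs, boxes, tx_pos, freq_hz=3.5e9):
    """For every camera pixel, find the first visible surface point and
    attach a free-space path-loss value (dB) from the transmitter to it."""
    depth = np.full(pixel_dirs.shape[:-1], np.inf)
    for box_min, box_max in boxes:                      # keep the nearest hit over all boxes
        depth = np.minimum(depth, ray_aabb(cam_pos, pixel_dirs, box_min, box_max))
    hit_pts = cam_pos + depth[..., None] * pixel_dirs   # visible surface points, one per pixel
    d_tx = np.linalg.norm(hit_pts - tx_pos, axis=-1)    # distance from transmitter to each point
    wavelength = 3e8 / freq_hz
    path_loss_db = 20 * np.log10(4 * np.pi * np.maximum(d_tx, 1e-3) / wavelength)
    path_loss_db[~np.isfinite(depth)] = np.nan          # sky pixels: no surface, no radio sample
    return depth, path_loss_db

# Usage: a 4x4-pixel camera looking down +x at one building, transmitter overhead.
h = w = 4
ys, zs = np.meshgrid(np.linspace(-0.2, 0.2, w), np.linspace(-0.2, 0.2, h))
dirs = np.stack([np.ones_like(ys), ys, zs], axis=-1)
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
boxes = [(np.array([10.0, -5.0, 0.0]), np.array([15.0, 5.0, 20.0]))]
depth, pl = visible_surface_radio_map(np.zeros(3), dirs, boxes, np.array([0.0, 0.0, 30.0]))
print(depth.round(2))   # per-pixel distance to the visible surface
print(pl.round(1))      # per-pixel path loss, aligned with the visual image
```

Because the depth map and the path-loss map are produced from the same per-pixel rays, every visual observation has a co-registered radio value, which is the property the full platform enforces with procedurally generated cities and high-fidelity electromagnetic simulation in place of the toy model used here.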