This document reviews the definition of the kernel distance, providing a gentle introduction tailored to a reader with background in theoretical computer science, but limited exposure to technology more common to machine learning, functional analysis and geometric measure theory. The key aspect of the kernel distance developed here is its interpretation as an L_2 distance between probability measures or various shapes (e.g. point sets, curves, surfaces) embedded in a vector space (specifically an RKHS). This structure enables several elegant and efficient solutions to data analysis problems. We conclude with a glimpse into the mathematical underpinnings of this measure, highlighting its recent independent evolution in two separate fields.