Social infrastructure and other built environments are increasingly expected to support well-being and community resilience by enabling social interaction. Yet in civil and built-environment research, there is no consistent and privacy-preserving way to represent and measure socially meaningful interaction in these spaces, leaving studies to operationalize "interaction" differently across contexts and limiting practitioners' ability to evaluate whether design interventions are changing the forms of interaction that social capital theory predicts should matter. To address this field-level and methodological gap, we introduce the Dyadic User Engagement DataseT (DUET) dataset and an embedded kinesics recognition framework that operationalize Ekman and Friesen's kinesics taxonomy as a function-level interaction vocabulary aligned with social capital-relevant behaviors (e.g., reciprocity and attention coordination). DUET captures 12 dyadic interactions spanning all five kinesic functions-emblems, illustrators, affect displays, adaptors, and regulators-across four sensing modalities and three built-environment contexts, enabling privacy-preserving analysis of communicative intent through movement. Benchmarking six open-source, state-of-the-art human activity recognition models quantifies the difficulty of communicative-function recognition on DUET and highlights the limitations of ubiquitous monadic, action-level recognition when extended to dyadic, socially grounded interaction measurement. Building on DUET, our recognition framework infers communicative function directly from privacy-preserving skeletal motion without handcrafted action-to-function dictionaries; using a transfer-learning architecture, it reveals structured clustering of kinesic functions and a strong association between representation quality and classification performance while generalizing across subjects and contexts.