Wave-based signal processing conventionally encodes input data into the input wavefront, which makes non-linear operations challenging to implement. Programmable wave systems enable an alternative approach: encoding the input data into the scattering properties of tunable components. With such structural input encoding, two potentially non-linear mappings are involved: first, from the input data to the tunable components' scattering characteristics, and, second, from these scattering characteristics to the output wavefront. In this paper, we systematically examine the expressivity of a wave-based physical neural network (WPNN) with structural input encoding. Our analysis is based on a physics-consistent multiport-network model of a compact D-band rich-scattering cavity parametrized by a 100-element programmable metasurface (PM). We separately control encoding non-linearity, structural non-linearity, and network depth to examine their interplay on a controlled scalar regression task. With phase encoding and strong inter-element mutual coupling (MC), both aforementioned mappings are strongly non-linear and the WPNN performs very well even with a single layer. We further observe that additional layers can partially compensate for weak inter-element MC. In addition, we demonstrate that WPNN depth can improve expressivity even when it is not associated with an increase in trainable weights. Altogether, our results provide a physics-consistent picture of how encoding choice, MC strength, and depth jointly govern the expressive power of PM-based WPNNs, informing design choices for future experimental implementations of WPNNs.
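The two mappings involved in structural input encoding can be illustrated with a minimal numerical sketch. It assumes the textbook relation for a multiport network whose element ports are terminated by tunable loads with reflection coefficients r, namely S(r) = S_AA + S_AM diag(r) (I - S_MM diag(r))^{-1} S_MA, which is not necessarily the exact formulation used in our model. All function names (`phase_encode`, `toy_system`, `output_wavefront`), the encoding range, and the random toy matrices are hypothetical placeholders; they are neither calibrated to the D-band cavity nor constrained to be passive or reciprocal.

```python
import numpy as np

def phase_encode(x):
    """Encoding map (assumed): data x in [0, 1] -> unit-modulus reflection coefficients."""
    return np.exp(1j * np.pi * np.asarray(x, dtype=float))

def toy_system(n_ports, n_elems, mc, seed=0):
    """Random toy scattering blocks; `mc` sets the spectral norm of S_MM (inter-element coupling)."""
    rng = np.random.default_rng(seed)
    def crand(*shape):
        return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    S_AA = crand(n_ports, n_ports)
    S_AM = crand(n_ports, n_elems)
    S_MA = crand(n_elems, n_ports)
    M = crand(n_elems, n_elems)
    S_MM = mc * M / np.linalg.norm(M, 2)  # mc < 1 keeps the inverse well defined for |r| <= 1
    a_in = crand(n_ports)                 # fixed input wavefront
    return S_AA, S_AM, S_MA, S_MM, a_in

def output_wavefront(r, S_AA, S_AM, S_MA, S_MM, a_in):
    """Structural map: load reflection coefficients r -> output wavefront."""
    n = len(r)
    # (I - S_MM diag(r))^{-1} S_MA, computed with a linear solve instead of an explicit inverse
    inner = np.linalg.solve(np.eye(n) - S_MM * r[None, :], S_MA)
    return (S_AA + (S_AM * r[None, :]) @ inner) @ a_in

def affinity_gap(mc, r1, r2):
    """Midpoint test: zero (up to rounding) if and only if the structural map is affine in r."""
    blocks = toy_system(2, len(r1), mc, seed=1)
    f = lambda r: output_wavefront(r, *blocks)
    return np.linalg.norm(f((r1 + r2) / 2) - (f(r1) + f(r2)) / 2)

rng = np.random.default_rng(0)
r1, r2 = phase_encode(rng.random(8)), phase_encode(rng.random(8))
for mc in (0.0, 0.8):
    print(f"mc = {mc}: affinity gap = {affinity_gap(mc, r1, r2):.3e}")
```

In this sketch, the structural map reduces to an affine function of r when S_MM vanishes, so any non-linearity must then come from the encoding map alone; a non-zero S_MM makes the structural map itself non-linear through the matrix inverse.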