Recent advances in next-generation reconfigurable antenna and fluid antenna technologies have drawn significant attention to wireless systems with polarization reconfigurable (PR) channels, which can promote favorable channel conditions. We exploit the benefits of PR antennas by integrating them into massive multiple-input multiple-output (MIMO) systems. In particular, we aim to jointly design the polarization and beamforming vectors at both transceivers for simultaneous channel reconfiguration and beam alignment, which remarkably enhances the beamforming gain. However, joint optimization over the polarization and beamforming vectors without channel state information (CSI) is challenging, since depolarization increases the channel dimension, whereas massive MIMO systems typically obtain only low-dimensional pilot measurements from a limited number of radio frequency (RF) chains. This leads to substantial pilot overhead, because the transceivers can only observe low-dimensional measurements of the high-dimensional channel. This paper seeks to reduce the pilot overhead in such systems by employing an \emph{interpretable transformer}-based deep learning framework at both transceivers to actively design the polarization and beamforming vectors for the pilot stage and the transmission stage based on the sequence of accumulated received pilots. Numerical experiments demonstrate significant performance gains of the proposed framework over existing non-adaptive and active data-driven methods. Furthermore, we exploit the interpretability of the proposed framework to analyze the learning capabilities of the model.