Three-dimensional (3D) multi-slab imaging is a promising approach for high-resolution in vivo diffusion MRI (dMRI) because it is compatible with short repetition times (TR = 1–2 s), which provide optimal signal-to-noise ratio (SNR) efficiency. A major challenge, however, is slab boundary artifacts arising from non-ideal slab-selective RF excitation. Non-rectangular slab profiles reduce signal intensity at slab boundaries, while profile overlap between adjacent slabs introduces inter-slab crosstalk, in which repeated excitation shortens the local TR and limits T1 recovery. To mitigate slab boundary artifacts without increasing scan time, we build on slab profile encoding and propose Slab-shifting for Harmonized 3D Acquisition and Reconstruction with Profile Encoding Networks (SHARPEN). SHARPEN applies inter-volume field-of-view shifts along the slice direction for different diffusion directions, providing complementary slab profile encoding without prolonging acquisition. Slab profiles are estimated using a lightweight self-supervised neural network that exploits consistency across shifted acquisitions together with known physical properties of slab profiles and diffusion images, and corrected images are then reconstructed from the estimated profiles. SHARPEN was validated using simulated and prospectively acquired high-resolution in vivo data, and demonstrated accurate slab profile estimation and robust boundary artifact correction, even in the presence of inter-volume motion. SHARPEN does not require high-quality reference training data and supports subject-specific training. Its efficient GPU-based implementation delivers faster and more accurate correction than nonlinear inversion for slab profile encoding (NPEN), yielding slice-wise quantitative profiles that closely match those from reference 2D acquisitions. SHARPEN enables high-quality dMRI at 0.7 mm isotropic resolution on a 3T clinical scanner, highlighting its potential to advance submillimeter dMRI for neuroscience research.
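
To illustrate why field-of-view shifts provide complementary slab profile encoding, the following is a minimal 1D sketch and not the SHARPEN network itself. It rests on strong simplifying assumptions that are ours rather than the paper's: a single periodic slab profile shared by all slabs, an identical underlying image in every volume (diffusion contrast differences are ignored), integer slice shifts, no noise, and no motion. Under these assumptions each measurement is the product of a profile value and an image value, so taking logarithms gives a linear system that ordinary least squares can solve. The function names `simulate_shifted_volumes` and `estimate_profile` are hypothetical.

```python
import numpy as np


def simulate_shifted_volumes(image, profile, shifts):
    """Toy forward model: y_k[z] = profile[(z + s_k) % T] * image[z].

    `image` is a 1D signal along the slice direction, `profile` holds one
    slab profile of T slices (assumed identical for every slab), and each
    entry of `shifts` is an integer field-of-view shift in slices.
    """
    T = len(profile)
    z = np.arange(len(image))
    return np.stack([profile[(z + s) % T] * image for s in shifts])


def estimate_profile(volumes, shifts, slab_thickness):
    """Recover profile and image from shifted volumes by log-linear least squares.

    Taking logs turns the product into a sum:
        log y_k[z] = log p[(z + s_k) % T] + log x[z]
    The overall scale is ambiguous (p * c and x / c fit equally well), so the
    profile is normalized to a maximum of 1 after solving.
    """
    K, Nz = volumes.shape
    T = slab_thickness
    z = np.arange(Nz)

    # Unknowns: T log-profile values followed by Nz log-image values.
    A = np.zeros((K * Nz, T + Nz))
    b = np.zeros(K * Nz)
    for k, s in enumerate(shifts):
        rows = k * Nz + z
        A[rows, (z + s) % T] = 1.0  # selects the profile sample seen by slice z
        A[rows, T + z] = 1.0        # selects the image sample at slice z
        b[rows] = np.log(volumes[k])

    # lstsq returns the minimum-norm solution of the rank-deficient system.
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    profile = np.exp(solution[:T])
    image = np.exp(solution[T:])

    # Resolve the scale ambiguity: profile peak is defined as 1.
    scale = profile.max()
    return profile / scale, image * scale


# Demo with made-up numbers: 4 slabs of 8 slices, profile dipping at the edges.
rng = np.random.default_rng(0)
true_image = rng.uniform(0.5, 1.5, size=32)
true_profile = np.sin(np.pi * (np.arange(8) + 0.5) / 8) ** 0.5
true_profile /= true_profile.max()

demo_shifts = [0, 3]  # shift and slab thickness are coprime, see note below
demo_volumes = simulate_shifted_volumes(true_image, true_profile, demo_shifts)
est_profile, est_image = estimate_profile(demo_volumes, demo_shifts, 8)
```

With a single unshifted volume the profile dip cannot be separated from the anatomy, since any profile and image whose product matches the data fit equally well. In this toy model, two volumes with shifts 0 and s tie together profile samples that are s slices apart, so the profile is determined up to a global scale when s and the slab thickness are coprime. Real data break the identical-image assumption because diffusion contrast differs between directions, which is one reason a method that uses additional physical priors, as the abstract describes, is needed in practice.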