The replica-exchange Monte-Carlo (RXMC) method is a powerful Markov-chain Monte-Carlo algorithm for sampling from multi-modal distributions, which are challenging for conventional methods. The sampling efficiency of the RXMC method depends highly on the selection of the temperatures, and finding optimal temperatures remains a challenge. In this study, we propose a refined online temperature selection method by extending the gradient-based optimization framework proposed previously. Building upon the existing temperature update approach, we introduce a reparameterization technique to strictly enforce physical constraints, such as the monotonic ordering of inverse temperatures, which were not explicitly addressed in the original formulation. The proposed method defines the variance of acceptance rates between adjacent replicas as a loss function, estimates its gradient using differential information from the sampling process, and optimizes the temperatures via gradient descent. We demonstrate the effectiveness of our method through experiments on benchmark spin systems, including the two-dimensional ferromagnetic Ising model, the two-dimensional ferromagnetic XY model, and the three-dimensional Edwards-Anderson model. Our results show that the method successfully achieves uniform acceptance rates and reduces round-trip times across the temperature space. Furthermore, our proposed method offers a significant advantage over recently proposed policy gradient method that require careful hyperparameter tuning, while simultaneously preventing the constraint violations that destabilize optimization.