This letter proposes a signal design for asymmetrically clipped optical orthogonal frequency division multiplexing (ACO-OFDM) systems that achieves a low peak-to-average power ratio (PAPR), a low symbol error rate (SER), and a high data rate. The proposed design uses a variational autoencoder (VAE) with gradual loss learning to jointly optimize the geometry and the probabilities of the constellation symbols. This joint optimization enhances mutual information (MI) and reduces the PAPR while maintaining a low SER for reliable transmission. We evaluate the proposed VAE-based design by comparing its MI, SER, and PAPR against those of existing techniques. Simulation results demonstrate that the proposed method achieves a considerably lower PAPR while maintaining superior SER and MI performance over a wide range of signal-to-noise ratios (SNRs).
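For readers unfamiliar with the signal model, the following is a minimal sketch of how a conventional ACO-OFDM symbol is formed and how its PAPR can be measured. It is not the proposed VAE-based design. The function names `aco_ofdm_symbol` and `papr_db`, the QPSK input, the FFT size, and the PAPR definition (peak over mean instantaneous power of the clipped signal) are all assumptions made for illustration.

```python
import numpy as np


def aco_ofdm_symbol(data, n_fft):
    """Build one ACO-OFDM time-domain symbol (illustrative helper, not from the letter).

    `data` holds n_fft // 4 complex symbols, mapped to the odd subcarriers
    1, 3, ..., n_fft/2 - 1. Hermitian symmetry makes the IFFT output real,
    and clipping at zero makes it non-negative.
    """
    data = np.asarray(data, dtype=complex)
    if n_fft % 4 != 0 or data.size != n_fft // 4:
        raise ValueError("need n_fft divisible by 4 and n_fft // 4 data symbols")
    odd = np.arange(1, n_fft // 2, 2)
    spectrum = np.zeros(n_fft, dtype=complex)
    spectrum[odd] = data
    spectrum[n_fft - odd] = np.conj(data)   # Hermitian symmetry
    bipolar = np.fft.ifft(spectrum).real    # imaginary part is numerically zero
    return np.maximum(bipolar, 0.0)         # asymmetric clipping at zero


def papr_db(x):
    """PAPR in dB, assumed here as peak over mean instantaneous power."""
    power = np.asarray(x, dtype=float) ** 2
    return 10.0 * np.log10(power.max() / power.mean())


# Example: random QPSK symbols (an assumed baseline constellation).
rng = np.random.default_rng(0)
n_fft = 64
bits = rng.integers(0, 2, size=(n_fft // 4, 2))
qpsk = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
x = aco_ofdm_symbol(qpsk, n_fft)
print(f"PAPR of this symbol: {papr_db(x):.2f} dB")
```

Because only odd subcarriers are loaded, the clipping distortion falls on the even subcarriers, and the odd subcarriers of the clipped signal carry the data scaled by one half. A learned constellation such as the one proposed here would replace the QPSK symbols in this sketch.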