Hyper-kernels endowed by hyper-Reproducing Kernel Hilbert Space (hyper-RKHS) formulate the kernel learning task as learning on the space of kernels itself, which provides significant model flexibility for kernel learning with outstanding performance in real-world applications. However, the convergence behavior of these learning algorithms in hyper-RKHS has not been investigated in learning theory. In this paper, we conduct approximation analysis of kernel ridge regression (KRR) and support vector regression (SVR) in this space. To the best of our knowledge, this is the first work to study the approximation performance of regression in hyper-RKHS. For applications, we propose a general kernel learning framework conducted by the introduced two regression models to deal with the out-of-sample extensions problem, i.e., to learn a underlying general kernel from the pre-given kernel/similarity matrix in hyper-RKHS. Experimental results on several benchmark datasets suggest that our methods are able to learn a general kernel function from an arbitrary given kernel matrix.