In this paper, we study a class of stochastic optimization problems, referred to as \emph{conditional stochastic optimization} (CSO), of the form $\min_{x \in \mathcal{X}} \mathbb{E}_{\xi} f_\xi\Big(\mathbb{E}_{\eta|\xi}[\mathbf{g}_\eta(x,\xi)]\Big)$. CSO has a wide range of applications, including portfolio selection, reinforcement learning, and robust and invariant learning. We establish the sample complexity of the sample average approximation (SAA) for CSO under a variety of structural assumptions, such as Lipschitz continuity, smoothness, and error bound conditions. We show that the total sample complexity improves from $\mathcal{O}(d/\epsilon^4)$ to $\mathcal{O}(d/\epsilon^3)$ when the outer function is smooth, and further to $\mathcal{O}(1/\epsilon^2)$ when the empirical function satisfies the quadratic growth condition. We also establish the sample complexity of a modified SAA for the case where $\xi$ and $\eta$ are independent. Numerical experiments support these theoretical findings.

Keywords: stochastic optimization, sample average approximation, large deviations theory
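
As a concrete illustration of the SAA estimator discussed above, the following is a minimal sketch, assuming $n$ outer samples $\xi_i$ and $m$ inner samples $\eta_{ij}$ drawn conditionally on each $\xi_i$, so that the objective is approximated by $\frac{1}{n}\sum_{i=1}^{n} f_{\xi_i}\big(\frac{1}{m}\sum_{j=1}^{m} g_{\eta_{ij}}(x,\xi_i)\big)$. The names `saa_objective`, `toy_f`, and `toy_g`, the scalar (rather than vector-valued) inner function, the Gaussian sampling model, and the grid search are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np


def saa_objective(x, xi, eta, f, g):
    """Nested sample-average estimate of E_xi f_xi(E_{eta|xi} g_eta(x, xi)).

    xi  : sequence of n outer samples
    eta : n-by-m array-like; eta[i][j] is drawn conditionally on xi[i]
    f   : callable f(u, xi_i), the outer function
    g   : callable g(x, xi_i, eta_ij), the inner function (scalar here)
    """
    total = 0.0
    for xi_i, eta_i in zip(xi, eta):
        # Inner sample average approximates E_{eta|xi_i}[g_eta(x, xi_i)].
        inner = np.mean([g(x, xi_i, e) for e in eta_i])
        total += f(inner, xi_i)
    # Outer sample average approximates the expectation over xi.
    return total / len(xi)


# Hypothetical toy instance: f_xi(u) = (u - xi)^2 and g_eta(x, xi) = x * eta.
def toy_f(u, xi_i):
    return (u - xi_i) ** 2


def toy_g(x, xi_i, eta_ij):
    return x * eta_ij


# Assumed sampling model: xi ~ N(1, 1) and eta | xi ~ N(xi, 1).
rng = np.random.default_rng(0)
n, m = 100, 20
xi = rng.normal(loc=1.0, scale=1.0, size=n)
eta = xi[:, None] + rng.normal(size=(n, m))

# Minimize the SAA objective over a coarse grid of candidate decisions.
grid = np.linspace(0.0, 2.0, 41)
values = [saa_objective(x, xi, eta, toy_f, toy_g) for x in grid]
x_hat = grid[int(np.argmin(values))]
print("SAA minimizer on the grid:", x_hat)
```

In this toy instance the true objective is $\mathbb{E}_\xi[\xi^2](x-1)^2$, minimized at $x = 1$. Because the outer function is nonlinear, the inner averaging makes the SAA objective a biased estimate of the true objective for finite $m$, which is why the total sample count $nm$ is the natural measure of cost.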