We study the hardness of learning unitary transformations by performing gradient descent on the time parameters of sequences of alternating operators. Such sequences are the basis for the quantum approximate optimization algorithm and represent one of the simplest possible settings for investigating problems of controllability. In general, the loss function landscape of alternating operator sequences in $U(d)$ is highly non-convex, and standard gradient descent can fail to converge to the global minimum on such landscapes. In this work, we provide numerical evidence that, despite the highly non-convex nature of the control landscape, gradient descent always converges to the target unitary when the alternating operator sequence contains $d^2$ or more parameters. The rates of convergence provide evidence for a "computational phase transition." When the number of parameters is fewer than $d^2$, gradient descent converges to a sub-optimal solution. When the number of parameters exceeds $d^2$, gradient descent converges exponentially fast to an optimal solution. At the computational critical point, where the number of parameters in the alternating operator sequence equals $d^2$, the rate of convergence is polynomial with a critical exponent of approximately 1.25.
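The setting described above can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration rather than a detail taken from this work: the generators `A` and `B` are random Hermitian matrices rescaled to unit spectral norm, the loss is the squared Frobenius distance $\|U - V\|_F^2$ to the target $V$, the target is itself produced by a sequence of the same form with random times, and the step size and iteration count are arbitrary. The function names are hypothetical. The sketch only shows the mechanics of gradient descent on the time parameters; it does not reproduce the reported convergence rates.

```python
import numpy as np

def random_hermitian(d, rng):
    """Random Hermitian generator, rescaled to unit spectral norm (assumption of this sketch)."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    H = (M + M.conj().T) / 2
    return H / np.linalg.norm(H, 2)

def evolve(H, t):
    """Return exp(-i H t) for Hermitian H via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

def sequence_unitary(A, B, params):
    """Alternating sequence exp(-iA p0) exp(-iB p1) exp(-iA p2) ... as a matrix product."""
    U = np.eye(A.shape[0], dtype=complex)
    for j, theta in enumerate(params):
        U = U @ evolve(A if j % 2 == 0 else B, theta)
    return U

def loss_and_grad(A, B, params, target):
    """Squared Frobenius loss ||U - target||_F^2 and its exact gradient in the times."""
    d, n = A.shape[0], len(params)
    gens = [A if j % 2 == 0 else B for j in range(n)]
    gates = [evolve(H, th) for H, th in zip(gens, params)]
    # prefix[j] = G_0 ... G_{j-1}
    prefix = [np.eye(d, dtype=complex)]
    for G in gates:
        prefix.append(prefix[-1] @ G)
    # suffix[j] = G_j ... G_{n-1}, with suffix[n] = identity
    suffix = [np.eye(d, dtype=complex)]
    for G in reversed(gates):
        suffix.append(G @ suffix[-1])
    suffix.reverse()
    U = prefix[-1]
    diff = U - target
    loss = np.linalg.norm(diff) ** 2
    grad = np.empty(n)
    for j in range(n):
        # dU/dtheta_j = G_0 ... G_{j-1} (-i H_j) G_j ... G_{n-1}
        dU = prefix[j] @ (-1j * gens[j]) @ suffix[j]
        grad[j] = 2 * np.real(np.trace(diff.conj().T @ dU))
    return loss, grad

def gradient_descent(A, B, target, params, lr=0.01, steps=500):
    """Plain fixed-step gradient descent; returns final times and the loss history."""
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(A, B, params, target)
        history.append(loss)
        params = params - lr * grad
    return params, history

rng = np.random.default_rng(0)
d = 2
n_params = 2 * d**2  # illustrative choice above the d^2 threshold
A, B = random_hermitian(d, rng), random_hermitian(d, rng)
target = sequence_unitary(A, B, rng.normal(size=n_params))  # reachable target (assumption)
init = rng.normal(size=n_params)
params, history = gradient_descent(A, B, target, init)
final_loss, _ = loss_and_grad(A, B, params, target)
print("initial loss:", history[0], "final loss:", final_loss)
```

Varying `n_params` below, at, and above `d**2` and plotting `history` on a logarithmic scale is one way to explore the three regimes qualitatively, though a faithful comparison would need the sampling scheme and hyperparameters actually used in the study.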