Federated learning (FL) based on over-the-air computation (AirComp) can achieve fast model aggregation by exploiting the waveform superposition property of multiple-access channels. However, the model aggregation performance is severely limited by unfavorable wireless propagation channels. In this paper, we propose to leverage an intelligent reflecting surface (IRS) to achieve fast yet reliable model aggregation for AirComp-based FL. To optimize the learning performance, we formulate a problem that jointly optimizes the device selection, the aggregation beamformer at the base station (BS), and the phase shifts at the IRS, so as to maximize the number of devices participating in the model aggregation of each communication round subject to mean-squared-error (MSE) requirements. To tackle this highly intractable problem, we propose a two-step optimization framework. Specifically, we induce sparsity in the device selection in the first step, and then solve a series of MSE minimization problems to find the maximum feasible device set in the second step. Within this framework, we develop an alternating optimization method, supported by a difference-of-convex-functions programming algorithm for low-rank optimization, to efficiently design the aggregation beamformer at the BS and the phase shifts at the IRS. Simulation results demonstrate that the proposed algorithm, together with the deployment of an IRS, achieves a lower training loss and a higher FL prediction accuracy than the baseline algorithms.
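To make the quantities above concrete, the following is a minimal sketch of how an IRS-assisted effective channel, an aggregation MSE, and an MSE-constrained device set might be evaluated numerically. It rests on assumptions not stated in the abstract: the effective channel is modeled as the direct path plus the IRS-reflected path, and the MSE expression is the worst-device form commonly used in AirComp models with uniform-forcing transmit scaling. The function names (`effective_channels`, `aircomp_mse`, `greedy_device_selection`) are hypothetical, and the greedy selection for a fixed beamformer is only a naive stand-in, not the two-step sparsity-inducing and difference-of-convex-functions framework proposed here.

```python
import numpy as np


def effective_channels(G, theta, H_r, H_d):
    """Combine the direct and IRS-reflected paths (assumed channel model).

    G:     (N, L) IRS-to-BS channel
    theta: (L,)   IRS reflection coefficients (unit modulus in practice)
    H_r:   (L, K) device-to-IRS channels, one column per device
    H_d:   (N, K) device-to-BS direct channels
    Returns the (N, K) effective channels G diag(theta) H_r + H_d.
    """
    return G @ (theta[:, None] * H_r) + H_d


def aircomp_mse(m, H_eff, noise_var=1.0, tx_power=1.0):
    """Assumed aggregation MSE: (sigma^2 / P) * ||m||^2 / min_k |m^H h_k|^2."""
    gains = np.abs(m.conj() @ H_eff) ** 2  # |m^H h_k|^2 for each device
    return (noise_var / tx_power) * np.linalg.norm(m) ** 2 / gains.min()


def greedy_device_selection(m, H_eff, mse_max, noise_var=1.0, tx_power=1.0):
    """Largest device set meeting the MSE requirement for a FIXED beamformer m.

    Devices are added from strongest to weakest effective gain. Under the
    assumed MSE model the MSE is set by the weakest selected device, so the
    loop can stop at the first device that violates the requirement.
    """
    gains = np.abs(m.conj() @ H_eff) ** 2
    order = np.argsort(gains)[::-1]  # strongest device first
    selected = []
    for k in order:
        candidate = selected + [int(k)]
        mse = aircomp_mse(m, H_eff[:, candidate], noise_var, tx_power)
        if mse > mse_max:
            break
        selected = candidate
    return selected
```

In this sketch the beamformer `m` and the phase shifts `theta` are given inputs. In the proposed approach they are optimized jointly with the device selection, which is what makes the original problem hard and motivates the alternating optimization described above.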