Small-group learning is a widely used methodology in human learning and has proven effective at improving learning outcomes: a small group of students works together toward a shared learning objective, with members explaining their understanding of a topic to their peers, comparing ideas, and helping each other troubleshoot problems. We investigate whether this learning technique can be borrowed from humans to improve the learning ability of machines. We propose a novel learning approach called small-group learning (SGL). In our approach, each learner uses its intermediately trained model to generate a pseudo-labeled dataset and then retrains its model on the pseudo-labeled datasets generated by the other learners. We formulate SGL as a multi-level optimization framework involving three learning stages: (1) learners train their network weights independently; (2) learners train their network weights collaboratively via mutual pseudo-labeling; (3) learners improve their architectures by minimizing validation losses. We develop an efficient algorithm to solve the SGL problem. We apply our approach to neural architecture search and achieve significant improvements on CIFAR-100, CIFAR-10, and ImageNet.
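The three stages described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes linear softmax classifiers in place of searched neural architectures, bootstrap resampling to make the learners differ, soft pseudo-labels on a separate unlabeled pool, and plain gradient descent. The names `train`, `cross_entropy`, and `small_group_round` are hypothetical. Stage 3 only computes the validation losses that an architecture search would minimize; the architecture update itself is omitted.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, Y, lr=0.1, steps=200):
    # Fit a linear softmax classifier by gradient descent on cross-entropy.
    # Y may hold one-hot labels or soft (pseudo-label) targets.
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    for _ in range(steps):
        W = W - lr * X.T @ (softmax(X @ W) - Y) / n
    return W

def cross_entropy(W, X, Y):
    P = softmax(X @ W)
    return float(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)))

def small_group_round(X_train, Y_train, X_unlabeled, X_val, Y_val,
                      num_learners=3, seed=0):
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]

    # Stage 1: each learner trains its weights independently.
    # Assumption: bootstrap samples stand in for differing architectures.
    stage1 = []
    for _ in range(num_learners):
        idx = rng.integers(0, n, size=n)
        stage1.append(train(X_train[idx], Y_train[idx]))

    # Each learner pseudo-labels the unlabeled pool with its stage-1 model.
    pseudo = [softmax(X_unlabeled @ W) for W in stage1]

    # Stage 2: each learner retrains on the labeled data plus the
    # pseudo-labeled datasets produced by the OTHER learners.
    stage2 = []
    for k in range(num_learners):
        peers = [pseudo[j] for j in range(num_learners) if j != k]
        X_aug = np.vstack([X_train] + [X_unlabeled] * len(peers))
        Y_aug = np.vstack([Y_train] + peers)
        stage2.append(train(X_aug, Y_aug))

    # Stage 3 (placeholder): validation losses of the stage-2 models.
    # In SGL these would be minimized with respect to the architectures.
    val_losses = [cross_entropy(W, X_val, Y_val) for W in stage2]
    return stage2, val_losses
```

Calling `small_group_round` with one-hot label matrices returns one weight matrix and one validation loss per learner. In a full architecture search, those losses would feed a gradient step on each learner's architecture parameters.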