A key step to driver safety is to observe the driver's activities with the face being a key step in this process to extracting information such as head pose, blink rate, yawns, talking to passenger which can then help derive higher level information such as distraction, drowsiness, intent, and where they are looking. In the context of driving safety, it is important for the system perform robust estimation under harsh lighting and occlusion but also be able to detect when the occlusion occurs so that information predicted from occluded parts of the face can be taken into account properly. This paper introduces the Occluded Stacked Hourglass, based on the work of original Stacked Hourglass network for body pose joint estimation, which is retrained to process a detected face window and output 68 occlusion heat maps, each corresponding to a facial landmark. Landmark location, occlusion levels and a refined face detection score, to reject false positives, are extracted from these heat maps. Using the facial landmark locations, features such as head pose and eye/mouth openness can be extracted to derive driver attention and activity. The system is evaluated for face detection, head pose, and occlusion estimation on various datasets in the wild, both quantitatively and qualitatively, and shows state-of-the-art results.