The last several years have seen significant progress in using depth cameras to track articulated objects such as human bodies, hands, and robotic manipulators. Most approaches focus on tracking the skeletal parameters of a fixed shape model, which makes them insufficient for applications that require accurate estimates of deformable object surfaces. To overcome this limitation, we present a 3D model-based tracking system for articulated deformable objects. Our system tracks human body pose and high-resolution surface contours in real time using a commodity depth sensor and GPU hardware. We formulate tracking as a joint optimization over a skeleton, to account for changes in pose, and over the vertices of a high-resolution mesh, to track the subject's shape. Our experimental results show that the system captures dynamic sub-centimeter surface detail such as folds and wrinkles in clothing. We also show that this shape estimation aids kinematic pose estimation by providing a more accurate target to match against the point cloud. The end result is highly accurate spatiotemporal and semantic information that is well suited to physical human-robot interaction as well as virtual and augmented reality systems.