Real-time music tracking systems follow a musical performance and at any time report the current position in a corresponding score. Most existing methods approach this problem exclusively in the audio domain, typically using online time warping (OLTW) techniques on incoming audio and an audio representation of the score. Audio OLTW techniques have seen incremental improvements both in features and model heuristics which reached a performance plateau in the past ten years. We argue that converting and representing the performance in the symbolic domain -- thereby transforming music tracking into a symbolic task -- can be a more effective approach, even when the domain transformation is imperfect. Our music tracking system combines two real-time components: one handling audio-to-note transcription and the other a novel symbol-level tracker between transcribed input and score. We compare the performance of this mixed audio-symbolic approach with its equivalent audio-only counterpart, and demonstrate that our method outperforms the latter in terms of both precision, i.e., absolute tracking error, and robustness, i.e., tracking success.