In this research paper, I will elaborate on a method to evaluate machine translation models based on their performance on underlying syntactical phenomena between English and Arabic languages. This method is especially important as such "neural" and "machine learning" are hard to fine-tune and change. Thus, finding a way to evaluate them easily and diversely would greatly help the task of bettering them.