End-to-end networks trained for task-oriented dialog, such as for recommending restaurants to a user, suffer from out-of-vocabulary (OOV) problem -- the entities in the Knowledge Base (KB) may not be seen by the network at training time, making it hard to use them in dialog. We propose a novel Hierarchical Pointer Generator Memory Network (HyP-MN), in which the next word may be generated from the decode vocabulary or copied from a hierarchical memory maintaining KB results and previous utterances. This hierarchical memory layout along with a novel KB dropout helps to alleviate the OOV problem. Evaluating over the dialog bAbI tasks, we find that HyP-MN outperforms state-of-the-art results, with considerable improvements (10% on OOV test set). HyP-MN also achieves competitive performances on various real-world datasets such as CamRest676 and In-car assistant dataset.