Deep Neural Networks (DNNs) are deployed in many real-time and safety-critical applications, such as autonomous vehicles and medical diagnosis. In such applications, quantization is used to compress the model, reducing both storage and computation. However, recent research has shown that memory faults can cause a significant drop in DNN accuracy, while conventional quantization methods focus only on model compression. This paper proposes a novel method that performs model quantization while substantially improving the fault tolerance of the model. It can be combined with hardware approaches such as Error Correcting Codes (ECC) to further improve fault tolerance. The proposed method reduces the set of error patterns that degrade classification accuracy by modifying the weight distribution and applying a novel masking-based clipping function. Experimental results show that the proposed method enhances the fault tolerance of the quantized DNN, which tolerates bit error rates up to 1803× higher than the conventional method.
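The paper's exact masking-based clipping function is not reproduced here; the sketch below is a minimal illustration of the general idea under stated assumptions: weights are clipped so their integer codes occupy only the low-order magnitude bits of the quantized representation, and the same bit mask that enforces the constraint can be re-applied after loading from faulty memory to clear flips in the unused bits. The function name mask_clip_quantize and the used_bits parameter are hypothetical, not from the paper.

```python
import numpy as np

def mask_clip_quantize(weights: np.ndarray, num_bits: int = 8, used_bits: int = 6):
    """Hypothetical masking-based clipping + symmetric uniform quantization.

    Weights are clipped so their codes occupy only the low `used_bits`
    magnitude bits of an otherwise `num_bits`-wide code. The unused
    high-order bits are forced to zero by the mask, so no single bit
    flip can turn a small weight into a very large one, and flips that
    land in the masked bits can be repaired by re-applying the mask.
    """
    assert used_bits < num_bits
    clip_max = 2 ** used_bits - 1                 # largest allowed magnitude
    scale = np.max(np.abs(weights)) / clip_max    # map weights onto that range

    # Clip, then quantize into the restricted code range.
    q = np.clip(np.round(weights / scale), -clip_max, clip_max).astype(np.int64)

    # Masking step: zero any magnitude bits outside the allowed range.
    # Re-applying this after reading weights from (possibly faulty)
    # memory clears bit flips in the unused high-order bits.
    mask = clip_max
    q = np.sign(q) * (np.abs(q) & mask)

    return q.astype(np.int8), scale

# Usage: quantize a random weight tensor, then dequantize it back.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = mask_clip_quantize(w)
w_hat = q.astype(np.float32) * scale
```

The design intuition is that restricting codes to a sub-range trades a small amount of quantization precision for a guarantee on the worst-case magnitude error a single bit flip can introduce, which is what allows the method to complement, rather than replace, ECC-style hardware protection.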