The objective of a community detection algorithm is to group similar nodes in a network into communities while maximizing the dissimilarity between communities. Several methods have been proposed, but many of them are unsuitable for large-scale networks because they have high computational complexity and rely on global knowledge of the network. The Label Propagation Algorithm (LPA) assigns a unique label to every node and propagates the labels locally, applying the majority rule to reach a consensus. Nodes that share the same label are then grouped into communities. Although LPA excels with its near-linear execution time, it easily gets stuck in local optima and often returns a single giant community. To overcome these problems, we propose MemLPA, a novel LPA variant in which each node implements memory and the decision rule takes past states of the network into account. We demonstrate through extensive experiments on the Lancichinetti-Fortunato-Radicchi benchmark and a set of real-world networks that MemLPA outperforms most state-of-the-art community detection algorithms.
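To make the baseline concrete, the following is a minimal sketch of plain LPA as summarized above: unique initial labels, local propagation, and a majority rule. It is not MemLPA, and it carries no memory of past states. The function name `label_propagation`, the adjacency-dictionary input, the `max_sweeps` cap, the asynchronous random update order, and the uniform random tie-breaking are all assumptions made for illustration, following a common formulation of LPA rather than anything specified in this text.

```python
import random
from collections import Counter


def label_propagation(adjacency, max_sweeps=100, seed=None):
    """Baseline LPA sketch (hypothetical helper, not the MemLPA method).

    adjacency: dict mapping each node to a list of its neighbours
               (undirected graph, so edges appear in both directions).
    Returns a list of communities, each a set of nodes.
    """
    rng = random.Random(seed)
    # Step 1: every node starts with its own unique label.
    labels = {node: node for node in adjacency}
    nodes = list(adjacency)

    def top_labels(node):
        # Labels held by the largest number of neighbours (majority rule).
        counts = Counter(labels[nbr] for nbr in adjacency[node])
        if not counts:  # isolated node keeps its own label
            return [labels[node]]
        best = max(counts.values())
        return [lab for lab, count in counts.items() if count == best]

    for _ in range(max_sweeps):
        # Step 2: visit nodes in random order and update labels in place
        # (asynchronous updates; an assumption of this sketch).
        rng.shuffle(nodes)
        for node in nodes:
            # Ties are broken uniformly at random (also an assumption).
            labels[node] = rng.choice(top_labels(node))
        # Stop once every node holds one of its neighbours' majority labels.
        if all(labels[n] in top_labels(n) for n in nodes):
            break

    # Step 3: nodes sharing a label form a community.
    communities = {}
    for node, lab in labels.items():
        communities.setdefault(lab, set()).add(node)
    return list(communities.values())


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (3-4); illustrative only.
    graph = {
        1: [2, 3], 2: [1, 3], 3: [1, 2, 4],
        4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
    }
    print(label_propagation(graph, seed=0))
```

Because updates depend only on a node's immediate neighbours, each sweep touches every edge a constant number of times, which is where the near-linear cost comes from. The random visiting order and tie-breaking also mean that different runs can return different partitions, including the single giant community mentioned above.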