
Journal of Information Science and Engineering, Vol. 33 No. 5, pp. 1103-1120

Memory Deduplication: An Effective Approach to Improve the Memory System

¹Department of Computer Science
Jinan University
Guangzhou, 510632 P.R. China
E-mail: tyhdeng@jnu.edu.cn; huangxinyu@tisson.cn; 710260037@qq.com; y.t.zhou@foxmail.com

²Key Laboratory of Computer System and Architecture
Chinese Academy of Sciences
Beijing, 100190 P.R. China

³School of Computing
University of Kent
Canterbury, CT2 7NZ, UK
E-mail: frankwang@ieee.org

Programs now place heavier demands on memory to hold their data than ever before. This paper analyzes the characteristics of memory data using seven real memory traces. It observes that the traces contain a large volume of memory pages with identical contents. Furthermore, the amount of unique memory content accessed is much smaller than the number of unique memory addresses accessed. This is caused by traditional address-based cache replacement algorithms, which replace memory pages by checking the addresses rather than the contents of those pages, so that many identical memory contents are stored in memory under different addresses. For example, within the same file system, opening two identical files stored in different directories, or opening two similar files in the same directory that share a certain amount of content, results in identical data blocks being stored in the cache. Based on these observations, this paper evaluates memory compression and memory deduplication. As expected, memory deduplication greatly outperforms memory compression. For example, the best deduplication ratio is 4.6 times higher than the best compression ratio, and the deduplication time and restore time are 121 times and 427 times faster than the compression time and decompression time, respectively. The experimental results in this paper should offer useful insights for designing systems that require abundant memory to improve system performance.
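The contrast between address-based and content-based handling of pages can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the 4096-byte page size, the use of SHA-256 as the content fingerprint, the function names, and the ratio definition (pages before deduplication divided by unique pages kept) are all assumptions made for illustration, since the abstract does not specify them.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes; the abstract does not state one


def deduplicate(pages):
    """Keep each distinct page content once, keyed by its SHA-256 digest.

    Returns (store, refs): store maps digest -> page bytes, and refs holds
    one digest per input page, in order, so the input can be restored.
    """
    store = {}
    refs = []
    for page in pages:
        digest = hashlib.sha256(page).digest()
        store.setdefault(digest, page)  # first copy wins; later copies are dropped
        refs.append(digest)
    return store, refs


def restore(store, refs):
    """Rebuild the original page sequence from the digest references."""
    return [store[digest] for digest in refs]


def dedup_ratio(pages, store):
    """Assumed metric: number of input pages / number of unique pages kept."""
    return len(pages) / len(store)


# Toy input: four pages at different addresses, only two distinct contents.
pages = [b"A" * PAGE_SIZE, b"B" * PAGE_SIZE, b"A" * PAGE_SIZE, b"A" * PAGE_SIZE]
store, refs = deduplicate(pages)
```

In this toy input, an address-based view sees four pages, while the content-based store keeps two; `restore(store, refs)` returns the original sequence. A production design would also need to handle writes to shared pages (for example by copy-on-write), which this sketch omits.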

Keywords: memory deduplication, address-based cache, content-based cache, memory compression, data characteristics
