Comparison of Memory Chip Organizations vs Reliability in Virtual Memories
Abstract
Random access memory organizations are typically chosen for maximum reliability based on the operation of the memory box itself, without concern for the remainder of the computing system. This has led to widespread use of the 1-bit-per-chip or a related organization, which uses error-correcting codes to minimize the effects of failures occurring in some basic unit such as a word or double word (32 to 64 bits). Such memory boxes are commonly used in paged virtual memory systems, where the unit for protection is really a page (4K bytes), or in a cache, where the unit for protection is a block (32 to 128 bytes), not a double word. With typical high-density memory chips and typical ranges of failure rates, the 1-bit-per-chip organization can often maximize page failures in a virtual memory system. For typical cases, a paged virtual memory using a page-per-chip organization can substantially improve reliability and is potentially far superior to other organizations. This paper first describes the fundamental considerations of organization for memory systems and demonstrates the underlying problems with a simplified case. The reliability, in terms of lost pages per megabyte due to hard failures over any time period, is then analyzed for a paged virtual memory organized in both ways. Normalized curves give the lost pages per Mbyte as a function of failure rate and accumulated time. Assuming reasonable failure rates can be achieved, the page-per-chip organization can be 10 to 20 times more reliable than a 1-bit-per-chip scheme. One specific cache system is also analyzed and shows similar advantages when organized as 1-block-per-island.
Conclusions
The improvement in terms of lost pages or blocks due to chip failure mechanisms is so appreciable in most cases that one can consider eliminating field repair for such failures.
For instance, a 1-Mbyte main memory using 72K-bit chips manufactured at the beginning of the learning curve, in a 1-page-per-island organization, could lose as little as 2 percent of its pages over the entire life of the machine (100K hours). Even fewer pages would be lost if chips manufactured toward the end of the learning curve were used. In either case, such small losses would have a negligible effect on the performance of a virtual memory system. The 1-bit-per-chip organization would have lost all or nearly all of its pages or blocks in this time. Similar comments apply to a cache organized as a block-per-island. While additional work is needed to analyze the error rate versus repair strategy, the present results indicate that this new organization deserves serious consideration.
Copyright © 1983 by the Institute of Electrical and Electronics Engineers, Inc.
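The contrast between the two organizations can be illustrated with a deliberately simplified model. The sketch below is not the paper's analysis. It assumes that only whole-chip hard failures occur, that chips fail independently at a constant rate, that the 1-bit-per-chip bank uses a 72-bit single-error-correcting word (an assumed width), and that every page in that bank is lost once two or more of its chips have failed. The function names and the failure rate of 2e-7 per hour are hypothetical, chosen only for illustration.

```python
import math


def chip_fail_prob(rate_per_hour: float, hours: float) -> float:
    """Probability that one chip has failed by `hours`,
    assuming a constant failure rate (exponential lifetime)."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def lost_fraction_page_per_chip(p: float) -> float:
    """Page-per-chip: a page is lost only when its own chip fails,
    so the expected fraction of lost pages equals p."""
    return p


def lost_fraction_bit_per_chip(p: float, chips_per_word: int = 72) -> float:
    """1-bit-per-chip with single-error correction (assumed model):
    every page spans all chips of the bank, so the pages are lost
    once two or more of those chips have failed."""
    none_failed = (1.0 - p) ** chips_per_word
    one_failed = chips_per_word * p * (1.0 - p) ** (chips_per_word - 1)
    return 1.0 - none_failed - one_failed


if __name__ == "__main__":
    rate = 2e-7  # hypothetical chip failure rate per hour
    for hours in (10_000, 50_000, 100_000):
        p = chip_fail_prob(rate, hours)
        print(
            f"{hours:>7} h  page-per-chip: {lost_fraction_page_per_chip(p):.4f}"
            f"  1-bit-per-chip: {lost_fraction_bit_per_chip(p):.4f}"
        )
```

Under these assumptions the page-per-chip loss grows roughly linearly with the chip failure probability, while the 1-bit-per-chip loss grows with the chance that any two of the chips in a word have failed. A model that also covered partial-chip failures (single bits, rows, columns) would give different numbers.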