Performance Analysis of a Directory-based Cache Coherence Scheme Using a Simulation Model
Title | Performance Analysis of a Directory-based Cache Coherence Scheme Using a Simulation Model PDF eBook |
Author | Sundararajah Muruganujan |
Publisher | |
Pages | 74 |
Release | 1993 |
Genre | |
ISBN |
Performance Evaluation of Directory-Based Cache Coherence Protocols
Title | Performance Evaluation of Directory-Based Cache Coherence Protocols PDF eBook |
Author | Ipek Abasikeles |
Publisher | LAP Lambert Academic Publishing |
Pages | 72 |
Release | 2011-04 |
Genre | |
ISBN | 9783844391879 |
The performance of three directory-based cache-coherence protocols (strict request-response, intervention forwarding and reply forwarding) is evaluated via simulation on the SOME-Bus, a fiber-optic interconnection network supporting distributed shared memory (DSM). The simulated system contains 64 nodes, each of which has a processor, cache controller, directory controller and output channel. Simulations have been conducted for each protocol to measure average processor utilization and average network latency for varying values of DSM parameters, such as the ratio of the mean channel service time to the mean thread run time (T/R), the probability of a cache block being in the modified state, P(M), and the fraction of write misses, P(W), and under different traffic patterns. The results reveal that the performance of all protocols decreases under all traffic patterns as P(W), P(M) or T/R increases. The effect of P(W) on the performance of the protocols diminishes as P(M) increases. Reply forwarding performs best for high P(M) values, intervention forwarding yields the best performance for low P(M) and high P(W) values, and strict request-response is the best protocol under hot-region (HR) traffic.
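The three protocols named in this abstract differ mainly in who talks to the current owner of a modified block. The sketch below is not taken from the book: it encodes the textbook message sequences for a read miss to a block held modified at a remote node, and the node labels (L, H, R), message names and step numbers are illustrative assumptions.

```python
# Illustrative sketch (assumed textbook message flows, not the book's simulator).
# L = local requester, H = home node holding the directory, R = remote owner.
# Each tuple is (step, source, destination, message); messages that share a
# step number are sent in parallel.
PROTOCOLS = {
    "strict request-response": [
        (1, "L", "H", "request"),
        (2, "H", "L", "reply naming the owner"),
        (3, "L", "R", "intervention"),
        (4, "R", "L", "data response"),
        (4, "R", "H", "revision"),
    ],
    "intervention forwarding": [
        (1, "L", "H", "request"),
        (2, "H", "R", "intervention"),
        (3, "R", "H", "data response"),
        (4, "H", "L", "data reply"),
    ],
    "reply forwarding": [
        (1, "L", "H", "request"),
        (2, "H", "R", "intervention"),
        (3, "R", "L", "data response"),
        (3, "R", "H", "revision"),
    ],
}


def serialized_steps(messages):
    """Number of network steps that must happen one after another."""
    return max(step for step, _src, _dst, _msg in messages)


def total_messages(messages):
    """Total number of messages placed on the network."""
    return len(messages)


for name, messages in PROTOCOLS.items():
    print(f"{name}: {serialized_steps(messages)} serialized steps, "
          f"{total_messages(messages)} messages")
```

Under these assumed flows, reply forwarding shortens the critical path by one step, which is consistent with the abstract's finding that it performs best when blocks are often in the modified state.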
Performance and Scalability Aspects of Directory-based Cache Coherence in Shared-memory Multiprocessors
Title | Performance and Scalability Aspects of Directory-based Cache Coherence in Shared-memory Multiprocessors PDF eBook |
Author | |
Publisher | |
Pages | 6 |
Release | 1993 |
Genre | |
ISBN |
We present a study of the performance and scalability aspects of directory-based cache coherence in multiprocessor systems. On a multiprocessor with a software-based coherence scheme, efficient implementations rely heavily on the programmer's ability to explicitly manage the memory system, a task that is typically handled by hardware support on other bus-based, shared-memory multiprocessors. We describe a scalable, shared-memory, cache-coherent multiprocessor and present simulation results obtained on three parallel programs. This multiprocessor configuration exhibits high performance at no additional parallel programming cost.
Comparison and Analysis of Software and Directory
Title | Comparison and Analysis of Software and Directory PDF eBook |
Author | Yung-Chin Chen |
Publisher | |
Pages | 29 |
Release | 1990 |
Genre | Cache memory |
ISBN |
Abstract: "Directory schemes and software schemes have been proposed to solve the cache coherence problem for the MIN-based large-scale multiprocessor system. We compare the performance of the two schemes using trace-driven simulation including the effect of false sharing caused by a nontrivial cache line size. It shows that the simplest software scheme can have a hit ratio and shared memory traffic comparable to those of the directory scheme. The invalidations and the sharing behavior of the directory scheme are classified and analyzed. The influence of the scheduling algorithm, the directory scheme protocol, and the line size on the invalidation distribution is discussed."
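The false-sharing effect mentioned in this abstract can be illustrated with a toy model. The sketch below is not the report's trace-driven simulator: it assumes a simple write-invalidate scheme, and the function name, trace format and addresses are hypothetical.

```python
# Toy write-invalidate model (an assumption, not the report's simulator).
# A trace entry is (processor, byte_address, is_write).
def count_invalidations(trace, line_size):
    """Count copies invalidated when processors share cache lines."""
    sharers = {}  # line number -> set of processors holding a copy
    invalidations = 0
    for proc, addr, is_write in trace:
        line = addr // line_size
        holders = sharers.setdefault(line, set())
        if is_write:
            # Every other holder of the line loses its copy.
            invalidations += len(holders - {proc})
            holders.clear()
        holders.add(proc)
    return invalidations


# Processor 0 writes address 0 and processor 1 writes address 4, four times
# each. The two processors never touch the same word.
trace = [(0, 0, True), (1, 4, True)] * 4

print(count_invalidations(trace, line_size=4))  # words fall in separate lines
print(count_invalidations(trace, line_size=8))  # both words share one line
```

With 4-byte lines the two words live in different lines and no copy is ever invalidated; with 8-byte lines every write after the first invalidates the other processor's copy, even though no data is actually shared.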
LC-Sim: a Simulation Framework for Evaluating Location Consistency Based Cache Protocols
Title | LC-Sim: a Simulation Framework for Evaluating Location Consistency Based Cache Protocols PDF eBook |
Author | Pouya Fotouhi |
Publisher | |
Pages | 60 |
Release | 2017 |
Genre | |
ISBN | 9780355251784 |
New high-performance processors tend to shift from multi to many cores. Moreover, shared memory has become the dominant paradigm for mainstream multicore processors. As the memory wall issue loomed over architecture design, most modern computer systems came to have several layers in their memory hierarchy. Among these, caches have become lasting components of memory hierarchies, as they significantly reduce access time by taking advantage of locality.
Major processor vendors usually rely on cache coherence and implement a variant of MESI, e.g., MOESI for AMD, to help reduce inter-chip traffic on the fast interconnection network. Supposedly, maintaining coherence should help keep parallel and concurrent programmers happy, all the while providing them with a well-known cache behavior for shared memory. This thesis challenges the assumption that coherence is well suited for large-scale many-core processors. Seeking an alternative to coherence, the LC cache protocol is extensively investigated.
LC-Cache is a cache protocol weaker than coherence, but one that preserves causality. It relies on the Location Consistency (LC) model. The basic philosophy behind LC is to maintain a unique view of memory only if there is a reason to. Other ordinary memory accesses may be observed in any order by the other processors of the computer system.
The motivation to stand against cache coherence rests on the underestimated limitations that coherence imposes on system design. Observations presented in this thesis demonstrate that coherence eliminates the possibility of having a directory-based protocol in practice, since the size of such a directory grows linearly with the number of cores. In addition, coherence adds implicit latency to the protocol in many cases.
This thesis presents LCCSim, a simulation framework to compare cache protocols based on location consistency against cache coherence protocols.
A comparative analysis of the MESI and MOESI coherence protocols is provided, and both are pitted against LC-Cache. Both MESI and MOESI consistently generate more on-chip traffic than LC cache, since transitions in LC cache are done locally. However, LC cache degrades the total latency of accesses, as it does not take advantage of cache-to-cache forwarding. Additionally, LC cache cannot be considered a true implementation based on LC, since it does not behave according to the memory model. The following summarizes the contributions of this thesis:
1. Detailed specification of the LC cache protocol, covering the aspects missing from the original paper.
2. A simulation framework to compare cache protocols based on LC against cache coherence protocols.
3. Extensive analysis of the LC cache protocol, leading to the discovery of several weaknesses.
4. Demonstration of features for an efficient cache protocol truly based on location consistency.
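The thesis's remark that directory size grows linearly with the number of cores can be made concrete with a little arithmetic. The sketch below assumes a full-map (bit-vector) directory with one presence bit per core plus a single state bit per memory block, and a 64-byte block; the abstract does not specify a directory organization, so these parameters and function names are illustrative only.

```python
# Back-of-the-envelope sketch, assuming a full-map directory (not stated in
# the thesis abstract): one presence bit per core plus state bits per block.
def full_map_entry_bits(cores, state_bits=1):
    """Directory bits needed for one memory block."""
    return cores + state_bits


def directory_overhead(cores, block_bytes=64, state_bits=1):
    """Directory bits as a fraction of the data bits they track."""
    return full_map_entry_bits(cores, state_bits) / (block_bytes * 8)


for cores in (16, 64, 256, 1024):
    print(f"{cores:5d} cores: {full_map_entry_bits(cores)} bits per block, "
          f"{directory_overhead(cores):.1%} of data storage")
```

Under these assumptions the directory entry for a block becomes as large as the block itself at 511 cores, which is the kind of scaling limit the abstract alludes to.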
Analysis of directory based cache coherence schemes with multistage networks
Title | Analysis of directory based cache coherence schemes with multistage networks PDF eBook |
Author | A. K. Nanda |
Publisher | |
Pages | 50 |
Release | 1991 |
Genre | Cache memory |
ISBN |
Hardware and Compiler-directed Cache Coherence in Large-scale Multiprocessors
Title | Hardware and Compiler-directed Cache Coherence in Large-scale Multiprocessors PDF eBook |
Author | Lynn Choi |
Publisher | |
Pages | 40 |
Release | 1996 |
Genre | Cache memory |
ISBN |
Abstract: "In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. The scheme can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system related issues, including critical sections, inter-thread communication, and task migration have also been addressed. The cost of the required hardware support is minimal and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data flow analysis, have been implemented on the Polaris parallelizing compiler [33]. From our simulation study using the Perfect Club benchmarks [5], we found that in spite of the conservative analysis made by the compiler, the performance of the proposed HSCD scheme can be comparable to that of a full-map hardware directory scheme. Given its comparable performance and reduced hardware cost, the proposed scheme can be a viable alternative for large-scale multiprocessors such as the Cray T3D, which rely on users to maintain data coherence."