Directory Based Cache Coherency, Organization, Operations and Challenges in Implementation - Study

Title Directory Based Cache Coherency, Organization, Operations and Challenges in Implementation - Study PDF eBook
Author Subrahmanya Bhat
Publisher
Pages 0
Release 2017
Genre
ISBN

Today's systems are built on multicore architectures in order to achieve high system throughput. Once processor clock speeds reached saturation, designers turned to multiple cores, each equipped with its own private cache memory. But in a chip multiprocessor, where all processors have access to shared memory, private caches give rise to the cache coherency problem. In a directory protocol, each block of data has a directory entry containing a number of pointers that record the locations of the block's copies. The main advantage of directory-based protocols is that they scale much better than snoopy protocols. They can also exploit arbitrary point-to-point interconnects. Their cost is the overhead of storing and manipulating directory state. This paper discusses different directory-based implementations and their operations, along with the difficulties of implementing them.
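To make the idea of a directory entry with sharer pointers concrete, here is a minimal Python sketch. It is not taken from the paper: the class names, the message names (`invalidate`, `writeback`, `writeback_invalidate`) and the simple clean/dirty state are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Hypothetical per-block state: which cores hold a copy, and whether it is dirty."""
    sharers: set = field(default_factory=set)  # core ids (the "pointers")
    dirty: bool = False                        # True => exactly one sharer owns a modified copy


class Directory:
    def __init__(self):
        self.entries = {}   # block address -> DirectoryEntry
        self.messages = []  # point-to-point messages the directory would send

    def _entry(self, block):
        return self.entries.setdefault(block, DirectoryEntry())

    def read(self, core, block):
        e = self._entry(block)
        if e.dirty and core not in e.sharers:
            # Another core holds a modified copy: ask it to write the data back.
            (owner,) = e.sharers
            self.messages.append(("writeback", owner, block))
            e.dirty = False
        e.sharers.add(core)

    def write(self, core, block):
        e = self._entry(block)
        # Only the recorded sharers are contacted; no broadcast is needed.
        for other in sorted(e.sharers - {core}):
            kind = "writeback_invalidate" if e.dirty else "invalidate"
            self.messages.append((kind, other, block))
        e.sharers = {core}
        e.dirty = True


d = Directory()
d.read(0, 0x40)
d.read(1, 0x40)
d.write(2, 0x40)  # invalidates cores 0 and 1 only
```

Because the directory knows exactly which caches hold a copy, a write sends messages only to those caches, which is what lets such a scheme run over a point-to-point interconnect instead of a snooped bus.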

Design Issues and Their Performance Impact in Systems with Directory-based Caches

Title Design Issues and Their Performance Impact in Systems with Directory-based Caches PDF eBook
Author University of Illinois at Urbana-Champaign. Center for Supercomputing Research and Development
Publisher
Pages 33
Release 1992
Genre Cache memory
ISBN

Abstract: "Directory schemes have been proposed to solve the cache coherence problem for large-scale multiprocessor systems. Most of previous studies concentrated on cost reduction for the design of directory schemes. With scalable directory design, there are various design parameters that affect its performance. Their impact is impossible to predict. In this paper, we evaluate the effect of these parameters on the performance of directory schemes concentrating on shared data, including cache organization, directory protocols, scalability and memory latency. We also analyze the resource contention and coherence delays of directory schemes and discuss possible improvements."

Locality-aware Cache Hierarchy Management for Multicore Processors

Title Locality-aware Cache Hierarchy Management for Multicore Processors PDF eBook
Author
Publisher
Pages 194
Release 2015
Genre
ISBN

Next-generation multicore processors and applications will operate on massive data with significant sharing. A major challenge in their implementation is the storage requirement for tracking the sharers of data. The bit overhead for such storage scales quadratically with the number of cores in conventional directory-based cache coherence protocols. Another major challenge is limited cache capacity and the data movement incurred by conventional cache hierarchy organizations when dealing with massive data scales. These two factors impact memory access latency and energy consumption adversely. This thesis proposes scalable, efficient mechanisms that improve effective cache capacity (i.e., by improving utilization) and reduce data movement by exploiting locality and controlling replication. First, a limited directory-based protocol, ACKwise, is proposed to track the sharers of data in a cost-effective manner. ACKwise leverages broadcasts to implement scalable cache coherence. Broadcast support can be implemented in a 2-D mesh network by making simple changes to its routing policy without requiring any additional virtual channels. Second, a locality-aware replication scheme that better manages the private caches is proposed. This scheme controls replication based on data reuse information and seamlessly adapts between private and logically shared caching of on-chip data at the fine granularity of cache lines. A low-overhead runtime profiling capability to measure the locality of each cache line is built into hardware. Private caching is only allowed for data blocks with high spatio-temporal locality. Third, a timestamp-based memory ordering validation scheme is proposed that enables the locality-aware private cache replication scheme to be implementable in processors with out-of-order memory that employ popular memory consistency models.
This method does not rely on cache coherence messages to detect speculation violations, and hence is applicable to the locality-aware protocol. The timestamp mechanism is efficient due to the observation that consistency violations only occur due to conflicting accesses that have temporal proximity (i.e., within a few cycles of each other), thus requiring timestamps to be stored only for a small time window. Fourth, a locality-aware last-level cache (LLC) replication scheme that better manages the LLC is proposed. This scheme adapts replication at runtime based on fine-grained cache line reuse information and thereby balances data locality and off-chip miss rate for optimized execution. Finally, all the above schemes are combined to obtain a cache hierarchy replication scheme that provides optimal data locality and miss rates at all levels of the cache hierarchy. The design of this scheme is motivated by the experimental observation that both locality-aware private cache and LLC replication enable varying performance improvements across benchmarks. These techniques enable optimal use of the on-chip cache capacity, and provide low-latency, low-energy memory access, while retaining the convenience of shared memory and preserving the same memory consistency model. On a 64-core multicore processor with out-of-order cores, Locality-aware Cache Hierarchy Replication improves completion time by 15% and energy by 22% over a state-of-the-art baseline while incurring a storage overhead of 30.7 KB per core (i.e., 10% of the aggregate cache capacity of each core).
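The quadratic growth in sharer-tracking storage, and the way a limited directory with a broadcast fallback avoids it, can be illustrated with a short Python sketch. This is not ACKwise as published: the function names, the single overflow bit, and the choice to keep only a sharer count after overflow are assumptions made for the illustration.

```python
def full_map_bits(cores, lines_per_core):
    """Full bit-vector directory: one bit per core for every tracked line.
    Tracked lines grow with the core count, so the total grows as cores**2."""
    return cores * lines_per_core * cores


def limited_pointer_bits(cores, lines_per_core, k):
    """k pointers of ceil(log2(cores)) bits each, plus one assumed overflow bit."""
    pointer_width = max(1, (cores - 1).bit_length())
    return cores * lines_per_core * (k * pointer_width + 1)


class LimitedEntry:
    """Hypothetical limited-directory entry: tracks at most k sharers exactly."""

    def __init__(self, k):
        self.k = k
        self.pointers = set()
        self.overflow = False
        self.count = 0  # number of sharers once their identities are lost

    def add_sharer(self, core):
        if self.overflow:
            # Identities are no longer tracked; assume a core only requests
            # the block when it does not already hold a copy.
            self.count += 1
        elif core in self.pointers:
            return
        elif len(self.pointers) < self.k:
            self.pointers.add(core)
        else:
            self.overflow = True
            self.count = len(self.pointers) + 1
            self.pointers.clear()

    def invalidation_plan(self):
        """Return (mode, targets, acknowledgements to wait for)."""
        if self.overflow:
            return ("broadcast", None, self.count)
        return ("unicast", sorted(self.pointers), len(self.pointers))


entry = LimitedEntry(k=2)
entry.add_sharer(3)
entry.add_sharer(5)
plan_before = entry.invalidation_plan()  # exact sharers are known
entry.add_sharer(9)
plan_after = entry.invalidation_plan()   # falls back to broadcast
```

Under these assumptions, doubling the core count quadruples the full-map storage, while the limited entry grows only with the pointer width. The price is a broadcast whenever more than `k` cores share a line.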

A Primer on Memory Consistency and Cache Coherence

Title A Primer on Memory Consistency and Cache Coherence PDF eBook
Author Vijay Nagarajan
Publisher Morgan & Claypool Publishers
Pages 296
Release 2020-02-04
Genre Computers
ISBN 1681737108

Many modern computer systems, including homogeneous and heterogeneous architectures, support shared memory in hardware. In a shared memory system, each of the processor cores may read and write to a single shared address space. For a shared memory machine, the memory consistency model defines the architecturally visible behavior of its memory system. Consistency definitions provide rules about loads and stores (or memory reads and writes) and how they act upon memory. As part of supporting a memory consistency model, many machines also provide cache coherence protocols that ensure that multiple cached copies of data are kept up-to-date. The goal of this primer is to provide readers with a basic understanding of consistency and coherence. This understanding includes both the issues that must be solved as well as a variety of solutions. We present both high-level concepts as well as specific, concrete examples from real-world systems. This second edition reflects a decade of advancements since the first edition and includes, among other more modest changes, two new chapters: one on consistency and coherence for non-CPU accelerators (with a focus on GPUs) and one that points to formal work and tools on consistency and coherence.
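As a small illustration of how a consistency model constrains what loads may return, the Python sketch below enumerates every interleaving of a two-thread "store buffering" test under sequential consistency. Both the test and the model are picked here as examples, since the blurb does not name them, and the tuple encoding of operations is an assumption of this sketch.

```python
from itertools import permutations

# Each thread is a list of (operation, variable, destination register) steps.
THREAD_0 = [("store", "x", None), ("load", "y", "r0")]
THREAD_1 = [("store", "y", None), ("load", "x", "r1")]


def sc_outcomes(threads):
    """Run every interleaving that preserves each thread's program order
    against a single shared memory, and collect the final (r0, r1) pairs."""
    thread_ids = [tid for tid, ops in enumerate(threads) for _ in ops]
    outcomes = set()
    for order in set(permutations(thread_ids)):
        memory = {"x": 0, "y": 0}
        registers = {}
        next_op = [0] * len(threads)
        for tid in order:
            op, var, reg = threads[tid][next_op[tid]]
            next_op[tid] += 1
            if op == "store":
                memory[var] = 1
            else:
                registers[reg] = memory[var]
        outcomes.add((registers["r0"], registers["r1"]))
    return outcomes


results = sc_outcomes([THREAD_0, THREAD_1])
```

In this enumeration the outcome `(0, 0)` never appears: under sequential consistency at least one store precedes both loads. Models that let a store wait in a buffer while a later load proceeds can allow it, which is the kind of architecturally visible difference a consistency definition pins down.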

Parallel Computer Architecture

Title Parallel Computer Architecture PDF eBook
Author David Culler
Publisher Gulf Professional Publishing
Pages 1056
Release 1999
Genre Computers
ISBN 1558603433

This book outlines a set of issues that are critical to all of parallel architecture: communication latency, communication bandwidth, and coordination of cooperative work (across modern designs). It describes the set of techniques available in hardware and in software to address each issue and explores how the various techniques interact.

Implementing a Directory-based Cache Consistency Protocol

Title Implementing a Directory-based Cache Consistency Protocol PDF eBook
Author Stanford University. Computer Systems Laboratory
Publisher
Pages 40
Release 1990
Genre Computer architecture
ISBN

Directory-based cache consistency protocols have the potential to allow shared-memory multiprocessors to scale to a large number of processors. While many variations of these coherence schemes exist in the literature, they have typically been described at a rather high level, making adequate evaluation difficult. This paper explores the implementation issues of directory-based coherency strategies by developing a design at the level of detail needed to write a memory system functional simulator with an accurate timing model. The paper presents the design of both an invalidation coherency protocol and the associated directory/memory hardware. Support is added to prevent deadlock, handle subtle consistency situations, and implement a proper programming model of multiprocess execution. Extensions are delineated for realizing a multiple-threaded directory that can continue to process commands while waiting for a reply from a cache. The final hardware design is evaluated in the context of the number of parts required for implementation.
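The idea of a directory that keeps processing commands while it waits for a cache to reply can be sketched in a few lines of Python. This is an illustration only, not the design from the paper: the class name, the per-block acknowledgement counter, and the policy of deferring requests that hit a busy block are all assumptions.

```python
from collections import deque


class NonBlockingDirectory:
    """Hypothetical directory that stalls only the block awaiting replies."""

    def __init__(self):
        self.waiting = {}        # block -> [requesting core, acks still outstanding]
        self.deferred = deque()  # requests that arrived for a busy block
        self.log = []

    def request(self, core, block, acks_needed):
        if block in self.waiting:
            # This block is mid-transaction: park the request for later.
            self.deferred.append((core, block, acks_needed))
            self.log.append(("deferred", core, block))
        elif acks_needed > 0:
            # Invalidations were sent; remember how many replies to expect.
            self.waiting[block] = [core, acks_needed]
            self.log.append(("waiting", core, block))
        else:
            self.log.append(("granted", core, block))

    def ack(self, block):
        entry = self.waiting[block]
        entry[1] -= 1
        if entry[1] == 0:
            del self.waiting[block]
            self.log.append(("granted", entry[0], block))
            # Replay parked requests now that a block has been released.
            retry, self.deferred = self.deferred, deque()
            for parked in retry:
                self.request(*parked)


directory = NonBlockingDirectory()
directory.request(0, "A", acks_needed=2)  # block A now awaits two replies
directory.request(1, "B", acks_needed=0)  # block B is served immediately
directory.request(2, "A", acks_needed=0)  # deferred until A completes
directory.ack("A")
directory.ack("A")
```

The request for block B is granted while block A is still waiting, which is the behaviour the abstract describes. A real design would also need the deadlock-avoidance support the paper mentions, which this sketch omits.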

A Class of Directory-based Cache Coherence Protocols

Title A Class of Directory-based Cache Coherence Protocols PDF eBook
Author
Publisher
Pages
Release 1993
Genre
ISBN
