Application-directed Cache Coherence Design

Title: Application-directed Cache Coherence Design
Author: Hongzhou Zhao
Pages: 153
Release: 2013

"Chip multiprocessors continue to provide programmers with a coherent view of shared memory in hardware across all cores. At large core counts, maintaining coherence in hardware across cached copies of data is a challenge due to bandwidth and metadata storage consumption. A cache block is the basic unit for data storage and communication, chosen at design time to match average locality across a range of applications. Conventional hardware implements the coherence protocol using a fixed granularity (of a cache block) for all coherence operations. Coherence metadata is recorded for every cache block, and coherence permissions are also granted in cache block units. Metadata is typically proportional both to the number of cores and the amount of data cached. Empirical analysis shows that applications typically exhibit a small number of sharing patterns, resulting in redundant information in the metadata. Similarly, considerable bandwidth is wasted due to a mismatch between application access granularity and the fixed granularity data and coherence communication. This dissertation leverages the inherent patterns of data access and sharing behavior in applications to design protocols that eliminate the bandwidth and metadata storage waste in conventional coherence protocols. The sharing pattern-aware directory designs, which we call SPACE and SPATL, recognize and represent only one copy of the subset of sharing patterns exhibited at any given instant in an application. The resulting protocols eliminate the linear proportionality of metadata storage to the number of cores. The adaptive coherence granularity designs, which we call Protozoa, match data movement to an application's spatial locality and access behavior, supporting fine granularity sharing without increasing metadata storage needs. The application-directed approach allows bandwidth needs to track inherent application access and sharing behavior"--Page vii-viii.

Hardware and Compiler-directed Cache Coherence in Large-scale Multiprocessors

Title: Hardware and Compiler-directed Cache Coherence in Large-scale Multiprocessors
Author: Lynn Choi
Pages: 40
Release: 1996
Genre: Cache memory

Abstract: "In this paper, we study a hardware-supported, compiler-directed (HSCD) cache coherence scheme, which can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors, such as the Cray T3D. The scheme can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system related issues, including critical sections, inter-thread communication, and task migration have also been addressed. The cost of the required hardware support is minimal and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data flow analysis, have been implemented on the Polaris parallelizing compiler [33]. From our simulation study using the Perfect Club benchmarks [5], we found that in spite of the conservative analysis made by the compiler, the performance of the proposed HSCD scheme can be comparable to that of a full-map hardware directory scheme. Given its comparable performance and reduced hardware cost, the proposed scheme can be a viable alternative for large-scale multiprocessors such as the Cray T3D, which rely on users to maintain data coherence."

Hardware and Compiler-directed Cache Coherence in Large-scale Multiprocessors: Design Considerations and Performance Study

Title: Hardware and Compiler-directed Cache Coherence in Large-scale Multiprocessors: Design Considerations and Performance Study
Author: L. Choi
Pages: 37
Release: 1996

Cache and Memory Hierarchy Design

Title: Cache and Memory Hierarchy Design
Author: Steven A. Przybylski
Publisher: Elsevier
Pages: 238
Release: 2014-06-28
Genre: Computers
ISBN: 0080500595

An authoritative book for hardware and software designers. Caches are by far the simplest and most effective mechanism for improving computer performance. This innovative book exposes the characteristics of performance-optimal single and multi-level cache hierarchies by approaching the cache design process through the novel perspective of minimizing execution times. It presents useful data on the relative performance of a wide spectrum of machines and offers empirical and analytical evaluations of the underlying phenomena. This book will help computer professionals appreciate the impact of caches and enable designers to maximize performance given particular implementation constraints.
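
As a back-of-the-envelope illustration of why hierarchy design is usually argued in terms of execution time, the snippet below compares a one-level and a two-level cache by average memory access time (AMAT). Every figure in it is assumed for the example and does not come from the book.

    /* Illustrative arithmetic only; all parameters are assumptions.
     * AMAT is the usual first-order proxy for a cache design's effect
     * on execution time. */
    #include <stdio.h>

    int main(void)
    {
        /* Assumed parameters, in processor cycles. */
        double l1_hit = 1.0,  l1_miss_rate = 0.05;
        double l2_hit = 10.0, l2_local_miss_rate = 0.40;
        double mem    = 100.0;

        /* One level: every L1 miss goes to memory. */
        double amat_l1_only = l1_hit + l1_miss_rate * mem;

        /* Two levels: L1 misses are filtered by the L2. */
        double amat_l1_l2 = l1_hit +
            l1_miss_rate * (l2_hit + l2_local_miss_rate * mem);

        printf("AMAT, L1 only : %.2f cycles\n", amat_l1_only);  /* 6.00 */
        printf("AMAT, L1 + L2 : %.2f cycles\n", amat_l1_l2);    /* 3.50 */
        return 0;
    }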

A Primer on Memory Consistency and Cache Coherence

Title: A Primer on Memory Consistency and Cache Coherence
Author: Daniel Sorin
Publisher: Morgan & Claypool Publishers
Pages: 214
Release: 2011-03-02
Genre: Technology & Engineering
ISBN: 1608455653

Many modern computer systems and most multicore chips (chip multiprocessors) support shared memory in hardware. In a shared memory system, each of the processor cores may read and write to a single shared address space. For a shared memory machine, the memory consistency model defines the architecturally visible behavior of its memory system. Consistency definitions provide rules about loads and stores (or memory reads and writes) and how they act upon memory. As part of supporting a memory consistency model, many machines also provide cache coherence protocols that ensure that multiple cached copies of data are kept up-to-date. The goal of this primer is to provide readers with a basic understanding of consistency and coherence. This understanding includes both the issues that must be solved and a variety of solutions. We present both high-level concepts and specific, concrete examples from real-world systems. Table of Contents: Preface / Introduction to Consistency and Coherence / Coherence Basics / Memory Consistency Motivation and Sequential Consistency / Total Store Order and the x86 Memory Model / Relaxed Memory Consistency / Coherence Protocols / Snooping Coherence Protocols / Directory Coherence Protocols / Advanced Topics in Coherence / Author Biographies
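
A concrete instance of the consistency questions the primer addresses is the classic store-buffering litmus test sketched below (not an example taken from the book): two threads each store to one shared variable and then load the other. Sequential consistency forbids both loads returning 0, while TSO (the x86 model) and weaker models allow it.

    /* Classic "store buffering" litmus test: each thread stores to one
     * shared variable and then loads the other.  Under TSO and weaker
     * models, a store can sit in a store buffer while the later load
     * completes, so r1 == 0 && r2 == 0 is allowed. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int x = 0, y = 0;
    static int r1, r2;

    static void *thread0(void *arg)
    {
        (void)arg;
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        r1 = atomic_load_explicit(&y, memory_order_relaxed);
        return NULL;
    }

    static void *thread1(void *arg)
    {
        (void)arg;
        atomic_store_explicit(&y, 1, memory_order_relaxed);
        r2 = atomic_load_explicit(&x, memory_order_relaxed);
        return NULL;
    }

    int main(void)   /* build with -pthread; run many times to see both outcomes */
    {
        pthread_t t0, t1;
        pthread_create(&t0, NULL, thread0, NULL);
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        printf("r1=%d r2=%d\n", r1, r2);   /* r1=0 r2=0 is possible here */
        return 0;
    }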

Memory Consistency Directed Cache Coherence Protocols for Scalable Multiprocessors

Title: Memory Consistency Directed Cache Coherence Protocols for Scalable Multiprocessors
Author: Marco Iskender Elver
Release: 2016

A Primer on Memory Consistency and Cache Coherence, Second Edition

Title: A Primer on Memory Consistency and Cache Coherence, Second Edition
Author: Vijay Nagarajan
Publisher: Springer Nature
Pages: 276
Release: 2022-05-31
Genre: Technology & Engineering
ISBN: 3031017641

Many modern computer systems, including homogeneous and heterogeneous architectures, support shared memory in hardware. In a shared memory system, each of the processor cores may read and write to a single shared address space. For a shared memory machine, the memory consistency model defines the architecturally visible behavior of its memory system. Consistency definitions provide rules about loads and stores (or memory reads and writes) and how they act upon memory. As part of supporting a memory consistency model, many machines also provide cache coherence protocols that ensure that multiple cached copies of data are kept up-to-date. The goal of this primer is to provide readers with a basic understanding of consistency and coherence. This understanding includes both the issues that must be solved as well as a variety of solutions. We present both high-level concepts as well as specific, concrete examples from real-world systems. This second edition reflects a decade of advancements since the first edition and includes, among other more modest changes, two new chapters: one on consistency and coherence for non-CPU accelerators (with a focus on GPUs) and one that points to formal work and tools on consistency and coherence.
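
As a companion to the earlier store-buffering sketch (again not taken from the book), the variant below uses C11's default sequentially consistent atomics, under which the r1 == 0 && r2 == 0 outcome is forbidden on any conforming implementation; constraining the set of allowed outcomes in exactly this way is what a consistency model does.

    /* Store-buffering test with sequentially consistent atomics (the
     * default for atomic_store/atomic_load): r1 == 0 && r2 == 0 is
     * forbidden, unlike in the relaxed version above. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int x = 0, y = 0;
    static int r1, r2;

    static void *thread0(void *arg)
    {
        (void)arg;
        atomic_store(&x, 1);      /* seq_cst by default */
        r1 = atomic_load(&y);
        return NULL;
    }

    static void *thread1(void *arg)
    {
        (void)arg;
        atomic_store(&y, 1);
        r2 = atomic_load(&x);
        return NULL;
    }

    int main(void)   /* build with -pthread */
    {
        pthread_t t0, t1;
        pthread_create(&t0, NULL, thread0, NULL);
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        printf("r1=%d r2=%d\n", r1, r2);   /* never prints r1=0 r2=0 */
        return 0;
    }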