Energy Efficient High Performance Processors

Title Energy Efficient High Performance Processors
Author Jawad Haj-Yahya
Publisher Springer
Pages 176
Release 2018-03-22
Genre Technology & Engineering
ISBN 9811085544

This book explores energy efficiency techniques for high-performance computing (HPC) systems using power-management methods. Adopting a step-by-step approach, it describes the power-management flows, algorithms and mechanisms that are employed in modern processors such as Intel Sandy Bridge, Haswell, Skylake and other architectures (e.g. ARM). Further, it includes practical examples and recent studies demonstrating how modern processors dynamically manage wide power ranges, from a few milliwatts in the lowest idle power state to tens of watts in turbo state. Moreover, the book explains how thermal and power delivery are managed in the context of this huge power range. The book also discusses the different metrics for energy efficiency, presents several methods and applications of power and energy estimation, and shows how, by using innovative power estimation methods and new algorithms, modern processors are able to optimize metrics such as power, energy, and performance. Different power estimation tools are presented, including tools that break down the power consumption of modern processors at sub-processor core/thread granularity. The book also investigates software, firmware and hardware coordination methods for reducing power consumption, for example a compiler-assisted power management method to overcome power excursions. Lastly, it examines firmware algorithms for dynamic cache resizing and dynamic voltage and frequency scaling (DVFS) for memory sub-systems.
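
As a rough illustration of why the choice of energy-efficiency metric matters, the sketch below compares two operating points using energy (power times runtime) and the energy-delay product (EDP). The function names and the sample power and runtime figures are assumptions made up for this example; they are not taken from the book.

```python
def energy_joules(avg_power_w: float, runtime_s: float) -> float:
    """Energy consumed by a run: average power multiplied by runtime."""
    return avg_power_w * runtime_s


def energy_delay_product(avg_power_w: float, runtime_s: float,
                         delay_exponent: int = 1) -> float:
    """Energy multiplied by runtime**delay_exponent (EDP for 1, ED2P for 2)."""
    return energy_joules(avg_power_w, runtime_s) * runtime_s ** delay_exponent


# Hypothetical operating points: (average power in watts, runtime in seconds).
operating_points = {
    "low_frequency": (10.0, 2.0),
    "turbo": (30.0, 1.0),
}

# The point that minimizes energy is not necessarily the one that minimizes EDP.
lowest_energy = min(operating_points,
                    key=lambda name: energy_joules(*operating_points[name]))
lowest_edp = min(operating_points,
                 key=lambda name: energy_delay_product(*operating_points[name]))
```

With these made-up numbers, the low-frequency point uses less energy, while the turbo point has the lower EDP because the metric also penalizes the longer runtime.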

Energy-Efficient High Performance Computing

Title Energy-Efficient High Performance Computing
Author James H. Laros III
Publisher Springer
Pages 0
Release 2012-09-04
Genre Computers
ISBN 9781447144915

In this work, the unique power measurement capabilities of the Cray XT architecture were exploited to gain an understanding of power and energy use, and the effects of tuning both CPU and network bandwidth. Modifications were made to deterministically halt cores when idle. Additionally, capabilities were added to alter the operating P-state. At the application level, an understanding of the power requirements of a range of important DOE/NNSA production scientific computing applications running at large scale was gained by simultaneously collecting current and voltage measurements on the hosting nodes. The effects of both CPU and network bandwidth tuning are examined, and energy savings opportunities without impact on run-time performance are demonstrated. This research suggests that next-generation large-scale platforms should not only approach CPU frequency scaling differently, but could also benefit from the capability to tune other platform components to achieve more energy-efficient performance.
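
To illustrate how simultaneously collected current and voltage samples translate into an energy figure, the following is a minimal sketch that integrates instantaneous power over time with a simple rectangle rule. It is a generic calculation, not the Cray XT measurement interface; the function name, the fixed sample period and the sample values are assumptions for this example.

```python
from typing import Sequence


def energy_from_samples(voltage_v: Sequence[float],
                        current_a: Sequence[float],
                        sample_period_s: float) -> float:
    """Approximate energy in joules from paired voltage and current samples.

    Each sample pair gives instantaneous power (V * I), which is assumed to
    hold for one sample period (rectangle-rule integration).
    """
    if len(voltage_v) != len(current_a):
        raise ValueError("voltage and current traces must have equal length")
    total_power_w = sum(v * i for v, i in zip(voltage_v, current_a))
    return total_power_w * sample_period_s


# Hypothetical one-sample-per-second trace for a single node.
node_voltage = [12.0, 12.0, 12.0, 12.0]
node_current = [2.0, 3.0, 3.0, 2.0]
node_energy = energy_from_samples(node_voltage, node_current, sample_period_s=1.0)
```

Comparing this energy figure across runs at different P-states or bandwidth settings, together with run time, is one way to look for savings that do not cost performance.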

Energy Efficient Microprocessor Design

Title Energy Efficient Microprocessor Design
Author Thomas D. Burd
Publisher Springer Science & Business Media
Pages 384
Release 2002
Genre Computers
ISBN 9780792375869

This volume starts with a description of the metrics and benchmarks used to design energy-efficient microprocessor systems, followed by energy-efficient methodologies for the architecture and circuit design, DC-DC conversion, energy-efficient software and system integration.

High-Performance Energy-Efficient Microprocessor Design

Title High-Performance Energy-Efficient Microprocessor Design
Author Vojin G. Oklobdzija
Publisher Springer Science & Business Media
Pages 342
Release 2007-04-27
Genre Technology & Engineering
ISBN 0387340475

Written by the world’s most prominent microprocessor design leaders from industry and academia, this book provides complete coverage of all aspects of complex microprocessor design: technology, power management, clocking, high-performance architecture, design methodologies, memory and I/O design, computer-aided design, testing and design for testability. The chapters provide state-of-the-art knowledge while including sufficient tutorial material to bring non-experts up to speed. It is a useful companion for design engineers working in related areas.

High Energy Efficiency Neural Network Processor with Combined Digital and Computing-in-Memory Architecture

Title High Energy Efficiency Neural Network Processor with Combined Digital and Computing-in-Memory Architecture
Author Jinshan Yue
Publisher Springer Nature
Pages 128
Release
Genre
ISBN 9819734770

Energy-efficient Computing with Fine-grained Many-core Systems

Title Energy-efficient Computing with Fine-grained Many-core Systems
Author Bin Liu
Publisher
Pages
Release 2016
Genre
ISBN 9781369615579

For the past half century, Moore's Law has been the fundamental driver of high-performance computing. Continued CMOS technology scaling doubles the transistor density of VLSI systems and has provided a predictable 40% performance improvement in single-core processors every 18 to 24 months. However, as Dennard scaling ends, the era of scaling frequency and performance without increasing power density is over. Since 2005, the semiconductor industry has shifted to multi-core and many-core processors in order to sustain the proportional scaling of performance with transistor count. One of the critical challenges for many-core system design is to reduce the power dissipation and improve the energy efficiency of the chip. Researchers are eager to find innovative low-power architectures and techniques to relieve the "dark silicon" problem and effectively convert transistors into performance. To demonstrate that many-core processors with network-on-chip interconnects are a promising architecture for high-performance energy-efficient computing, 16 Advanced Encryption Standard (AES) engines are proposed on a fine-grained many-core system by exploring different granularities of data-level and task-level parallelism. The smallest design utilizes only six cores for offline key expansion and eight cores for online key expansion, while the largest requires 107 cores and 137 cores, respectively. In comparison with published AES cipher implementations on general-purpose processors, the designs have 3.5 to 15.6 times higher throughput per unit of chip area and 8.2 to 18.1 times higher energy efficiency. Moreover, the design shows 2.0 times higher throughput than the TI DSP C6201, and 3.3 times higher throughput per unit of chip area and 2.9 times higher energy efficiency than the GeForce 8800 GTX.
Next, a scalable joint local and global dynamic voltage and frequency scaling (DVFS) scheme is proposed to further improve the energy efficiency of many-core systems by monitoring on-line workload variations. The local algorithm selects the voltage and frequency pair for each individual core based on its FIFO occupancy and stall information, while the global algorithm tunes the global voltage supplies based on the workload of all active processors. To demonstrate the effectiveness of the proposed solution, a suite of benchmarks is tested on a many-core globally asynchronous locally synchronous (GALS) platform. The experimental results show that the proposed approach can achieve near-optimal power saving under performance constraints. Different local algorithms are compared in terms of power saving, voltage switching frequency and response delay to workload variation. The impact of the number of voltage supplies and global voltage tuning resolution on the global algorithm is also investigated. To further improve the energy efficiency beyond traditional DVFS, core scaling is proposed by introducing an extra dimension beyond supply voltage and clock frequency scaling. This dissertation addresses the problem of minimizing the power dissipation of many-core systems under performance constraints by choosing an appropriate number of active cores and per-core voltage/frequency levels. A genetic-algorithm-based solution is proposed to solve the problem.
Experiments with real applications show that (1) dynamically scaling the number of active cores can improve the energy efficiency by 5% to 42% compared with per-core DVFS for different performance requirements; (2) core scaling favors systems with more global voltage supplies and high-performance leaky process when the performance requirement is loose, while it favors systems with fewer global voltage supplies and low-power less-leaky process when the performance requirement is tight; (3) increasing the number of global voltage supplies or leakage ratio can reduce the optimal core count by 22% and 50%, respectively.
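
The core-scaling problem described above can be sketched on a toy scale as follows. This is not the dissertation's method: it replaces the genetic algorithm with an exhaustive search, applies one voltage/frequency level to all active cores, and assumes perfectly parallel work, a dynamic power term proportional to V²f and a leakage term proportional to V. The operating points, constants and function names are all hypothetical.

```python
# Hypothetical operating points: (supply voltage in V, clock frequency in GHz).
LEVELS = [(0.7, 0.5), (0.9, 1.0), (1.1, 1.5)]


def core_power(voltage, freq, c_eff=1.0, i_leak=0.1):
    """Toy per-core power: dynamic (c_eff * V^2 * f) plus leakage (V * i_leak)."""
    return c_eff * voltage * voltage * freq + voltage * i_leak


def best_config(required_throughput, max_cores, i_leak=0.1, levels=LEVELS):
    """Exhaustively search core count and V/f level for minimum total power.

    Throughput is modeled as cores * frequency (assumes perfectly parallel
    work). Returns (power, cores, voltage, freq), or None if infeasible.
    """
    best = None
    for cores in range(1, max_cores + 1):
        for voltage, freq in levels:
            if cores * freq < required_throughput:
                continue  # this configuration misses the performance target
            power = cores * core_power(voltage, freq, i_leak=i_leak)
            if best is None or power < best[0]:
                best = (power, cores, voltage, freq)
    return best


low_leakage = best_config(required_throughput=3.0, max_cores=8, i_leak=0.1)
high_leakage = best_config(required_throughput=3.0, max_cores=8, i_leak=1.0)
```

In this toy model the low-leakage case favors many slow cores (six cores at 0.7 V), while raising the leakage term moves the optimum to fewer, faster cores (three cores at 0.9 V). An exhaustive search is only practical here because the model uses a single shared level; choosing a level per core makes the search space grow quickly, which is where a heuristic such as a genetic algorithm becomes attractive.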

Low-power System-on-chip Processors for Energy Efficient High Performance Computing

Title Low-power System-on-chip Processors for Energy Efficient High Performance Computing
Author Gaurav Mitra
Publisher
Pages 0
Release 2017
Genre
ISBN

The High-Performance Computing (HPC) community recognizes energy consumption as a major problem. Extensive research is underway to identify energy-efficient building blocks for future HPC systems. This thesis considers one such system, the Texas Instruments Keystone II, a heterogeneous Low-Power System-on-Chip (LPSoC) processor that combines a quad-core ARM CPU with an octa-core Digital Signal Processor (DSP). It was first released in 2012. Four issues are considered: i) maximizing the Keystone II ARM CPU performance; ii) implementation of the OpenMP programming model for the Keystone II; iii) simultaneous use of ARM and DSP cores across multiple Keystone SoCs; and iv) an energy model for applications running on LPSoCs like the Keystone II and heterogeneous systems in general. Maximizing the performance of the ARM CPU on the Keystone II system is fundamental to its adoption by the HPC community. Key to achieving good performance is exploitation of the ARM vector instructions. This thesis presents the first detailed comparison of the use of ARM compiler intrinsic functions with automatic compiler vectorization across four generations of ARM processors. Comparisons are also made with x86-based platforms and the use of equivalent Intel vector instructions. Implementation of the OpenMP programming model on the Keystone II presents both challenges and opportunities. The challenges arise because the OpenMP model was originally developed for a homogeneous environment, and in 2012 work had only just begun to consider its use with accelerators. The opportunities arise because shared memory is accessible to all processing elements on the LPSoC. An implementation for the Keystone II that maps OpenMP 4.0 accelerator directives to OpenCL runtime library operations is presented and evaluated. Exploitation of some of the underlying hardware features of the Keystone II is also discussed.
Simultaneous use of the ARM and DSP cores across multiple Keystone II boards is fundamental to the creation of a commercially viable HPC offering. This thesis presents a proof-of-concept implementation of matrix multiplication (GEMM) on such a commercial system, the nCore BrownDwarf. The BrownDwarf utilizes both Keystone II and Keystone I SoCs through a point-to-point interconnect called Hyperlink. Details of how a novel message passing communication framework across Hyperlink was implemented to support this complex environment are provided. An energy model that can be used to predict energy usage as a function of what fraction of a computation is performed on each of the available compute devices offers the opportunity for making runtime decisions on how best to minimize energy usage. This thesis presents such an energy usage model. Using this model shows that only under certain conditions does there exist an energy-optimal work partition that uses multiple compute devices. To validate the model, a high-resolution energy measurement environment is developed and used to gather energy measurements for a matrix multiplication running on a variety of systems. The results presented support the model. Drawing on the four issues noted above and other developments that have occurred since the Keystone II system was first announced, the thesis concludes by making comments regarding the future of LPSoCs as building blocks for HPC systems.
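
To make the idea of an energy-optimal work partition concrete, here is a minimal sketch of a two-device energy model. It is an assumed model for illustration only, not the one developed in the thesis: each device has a fixed processing rate, an active power and an idle power, both devices stay powered until the slower one finishes, and the best split is found by scanning a grid of work fractions. The `Device` class, the function names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    rate: float      # work units processed per second
    active_w: float  # power draw while computing, in watts
    idle_w: float    # power draw while waiting for the other device, in watts


def total_energy(fraction_a, work, dev_a, dev_b):
    """Energy in joules when dev_a gets fraction_a of the work and dev_b the rest."""
    t_a = fraction_a * work / dev_a.rate
    t_b = (1.0 - fraction_a) * work / dev_b.rate
    makespan = max(t_a, t_b)  # both devices stay powered until the job ends
    return (dev_a.active_w * t_a + dev_a.idle_w * (makespan - t_a)
            + dev_b.active_w * t_b + dev_b.idle_w * (makespan - t_b))


def best_fraction(work, dev_a, dev_b, steps=100):
    """Scan fractions 0, 1/steps, ..., 1 and return the lowest-energy split."""
    candidates = (k / steps for k in range(steps + 1))
    return min(candidates, key=lambda x: total_energy(x, work, dev_a, dev_b))


# Same devices, without and with idle power (all numbers are made up).
split_no_idle = best_fraction(300.0, Device(15.0, 30.0, 0.0), Device(5.0, 8.0, 0.0))
split_with_idle = best_fraction(300.0, Device(15.0, 30.0, 10.0), Device(5.0, 8.0, 10.0))
```

In this toy model, with zero idle power the cheapest option is to run everything on the device with the lower energy per unit of work, so no multi-device partition is optimal. Once idle power is significant, finishing sooner pays off and a split across both devices (here 75% on the faster device) becomes the energy-optimal choice.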