Energy Efficient High Performance Processors

Title	Energy Efficient High Performance Processors
Author	Jawad Haj-Yahya
Publisher	Springer
Pages	176
Release	2018-03-22
Genre	Technology & Engineering
ISBN	9811085544

This book explores energy efficiency techniques for high-performance computing (HPC) systems using power-management methods. Adopting a step-by-step approach, it describes the power-management flows, algorithms and mechanisms employed in modern processors such as Intel Sandy Bridge, Haswell, Skylake and other architectures (e.g. ARM). Further, it includes practical examples and recent studies demonstrating how modern processors dynamically manage wide power ranges, from a few milliwatts in the lowest idle power state to tens of watts in turbo state. Moreover, the book explains how thermal and power delivery are managed in the context of this huge power range. The book also discusses the different metrics for energy efficiency, presents several methods and applications of power and energy estimation, and shows how modern processors use innovative power estimation methods and new algorithms to optimize metrics such as power, energy, and performance. Different power estimation tools are presented, including tools that break down the power consumption of modern processors at sub-processor core/thread granularity. The book also investigates software, firmware and hardware coordination methods for reducing power consumption, for example a compiler-assisted power management method to overcome power excursions. Lastly, it examines firmware algorithms for dynamic cache resizing and dynamic voltage and frequency scaling (DVFS) for memory sub-systems.

Energy-Efficient High Performance Computing

Title	Energy-Efficient High Performance Computing
Author	James H. Laros III
Publisher	Springer
Pages	0
Release	2012-09-04
Genre	Computers
ISBN	9781447144915

In this work, the unique power measurement capabilities of the Cray XT architecture were exploited to gain an understanding of power and energy use, and the effects of tuning both CPU and network bandwidth. Modifications were made to deterministically halt cores when idle. Additionally, capabilities were added to alter operating P-state. At the application level, an understanding of the power requirements of a range of important DOE/NNSA production scientific computing applications running at large scale is gained by simultaneously collecting current and voltage measurements on the hosting nodes. The effects of both CPU and network bandwidth tuning are examined, and energy savings opportunities without impact on run-time performance are demonstrated. This research suggests that next-generation large-scale platforms should not only approach CPU frequency scaling differently, but could also benefit from the capability to tune other platform components to achieve more energy-efficient performance.

Energy Efficient Microprocessor Design

Title	Energy Efficient Microprocessor Design
Author	Thomas D. Burd
Publisher	Springer Science & Business Media
Pages	384
Release	2002
Genre	Computers
ISBN	9780792375869

This volume starts with a description of the metrics and benchmarks used to design energy-efficient microprocessor systems, followed by energy-efficient methodologies for the architecture and circuit design, DC-DC conversion, energy-efficient software and system integration.

High-Performance Energy-Efficient Microprocessor Design

Title	High-Performance Energy-Efficient Microprocessor Design
Author	Vojin G. Oklobdzija
Publisher	Springer Science & Business Media
Pages	342
Release	2007-04-27
Genre	Technology & Engineering
ISBN	0387340475

Written by the world’s most prominent microprocessor design leaders from industry and academia, this book provides complete coverage of all aspects of complex microprocessor design: technology, power management, clocking, high-performance architecture, design methodologies, memory and I/O design, computer aided design, testing and design for testability. The chapters provide state-of-the-art knowledge while including sufficient tutorial material to bring non-experts up to speed. A useful companion to design engineers working in related areas.

High Energy Efficiency Neural Network Processor with Combined Digital and Computing-in-Memory Architecture

Title	High Energy Efficiency Neural Network Processor with Combined Digital and Computing-in-Memory Architecture
Author	Jinshan Yue
Publisher	Springer Nature
Pages	128
Release
Genre
ISBN	9819734770

Energy-efficient Computing with Fine-grained Many-core Systems

Title	Energy-efficient Computing with Fine-grained Many-core Systems
Author	Bin Liu
Publisher
Pages
Release	2016
Genre
ISBN	9781369615579

For the past half century, Moore's Law has been the fundamental driver of high-performance computing. Continued CMOS technology scaling doubles the transistor density of VLSI systems and has provided a predictable 40% performance improvement for single-core processors every 18 to 24 months. However, as Dennard scaling ends, the era of scaling frequency and performance without increasing power density is over. Since 2005, the semiconductor industry has shifted to multi-core and many-core processors in order to sustain the proportional scaling of performance along with transistor count increases. One of the critical challenges for many-core system design is to reduce the power dissipation and improve the energy efficiency of the chip. Researchers are eager to find innovative low-power architectures and techniques to relieve the "dark silicon" problem and effectively convert transistors to performance. To demonstrate that many-core processors with network-on-chip interconnects are a promising architecture for high-performance energy-efficient computing, 16 Advanced Encryption Standard (AES) engines are proposed on a fine-grained many-core system by exploring different granularities of data-level and task-level parallelism. The smallest design utilizes only six cores for offline key expansion and eight cores for online key expansion, while the largest requires 107 cores and 137 cores, respectively. In comparison with published AES cipher implementations on general-purpose processors, the designs have 3.5 to 15.6 times higher throughput per unit of chip area and 8.2 to 18.1 times higher energy efficiency. Moreover, the design shows 2.0 times higher throughput than the TI DSP C6201, and 3.3 times higher throughput per unit of chip area and 2.9 times higher energy efficiency than the GeForce 8800 GTX.
Next, a scalable joint local and global dynamic voltage and frequency scaling (DVFS) scheme is proposed to further improve the energy efficiency of many-core systems by monitoring on-line workload variations. The local algorithm selects the voltage and frequency pair for each individual core based on its FIFO occupancy and stall information, while the global algorithm tunes the global voltage supplies based on the workload of all active processors. To demonstrate the effectiveness of the proposed solution, a suite of benchmarks is tested on a many-core globally asynchronous locally synchronous (GALS) platform. The experimental results show that the proposed approach can achieve near-optimal power saving under performance constraints. Different local algorithms are compared in terms of power saving, voltage switching frequency and response delay to workload variation. The impact of the number of voltage supplies and global voltage tuning resolution on the global algorithm is also investigated. To further improve the energy efficiency beyond traditional DVFS, core scaling is proposed by introducing an extra dimension beyond supply voltage and clock frequency scaling. This dissertation addresses the problem of minimizing the power dissipation of many-core systems under performance constraints by choosing an appropriate number of active cores and per-core voltage/frequency levels. A genetic-algorithm-based solution is proposed to solve the problem.
Experiments with real applications show that (1) dynamically scaling the number of active cores can improve the energy efficiency by 5% to 42% compared with per-core DVFS for different performance requirements; (2) core scaling favors systems with more global voltage supplies and high-performance leaky process when the performance requirement is loose, while it favors systems with fewer global voltage supplies and low-power less-leaky process when the performance requirement is tight; (3) increasing the number of global voltage supplies or leakage ratio can reduce the optimal core count by 22% and 50%, respectively.
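
The local DVFS rule described above, which picks a voltage/frequency pair per core from its FIFO occupancy and stall information, can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the operating points in `VF_LEVELS`, the `CoreState` fields, the thresholds `high` and `low`, and the step-up/step-down policy are all assumptions chosen only to show the general shape of such a controller.

```python
from dataclasses import dataclass

# Hypothetical operating points, lowest first: (supply volts, clock MHz).
VF_LEVELS = [(0.7, 400), (0.9, 800), (1.1, 1200)]


@dataclass
class CoreState:
    level: int             # current index into VF_LEVELS
    fifo_occupancy: float  # fraction of the input FIFO in use, 0.0 to 1.0
    stalled: bool          # True if the core spent the last interval waiting


def next_level(core: CoreState, high: float = 0.75, low: float = 0.25) -> int:
    """Return the V/F level index this core should use next.

    Assumed rule: step up when the input FIFO is filling and the core is
    busy (it is a bottleneck), step down when the core is stalled or its
    FIFO is nearly empty, otherwise hold the current level.
    """
    top = len(VF_LEVELS) - 1
    if core.fifo_occupancy >= high and not core.stalled:
        return min(core.level + 1, top)
    if core.stalled or core.fifo_occupancy <= low:
        return max(core.level - 1, 0)
    return core.level


if __name__ == "__main__":
    busy = CoreState(level=1, fifo_occupancy=0.9, stalled=False)
    volts, mhz = VF_LEVELS[next_level(busy)]
    print(f"busy core moves to {volts} V at {mhz} MHz")
```

A global controller, as in the abstract, would sit above this per-core rule and adjust the shared supply voltages from the aggregate workload; that part is omitted here.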

Low-power System-on-chip Processors for Energy Efficient High Performance Computing

Title	Low-power System-on-chip Processors for Energy Efficient High Performance Computing
Author	Gaurav Mitra
Publisher
Pages	0
Release	2017
Genre
ISBN

The High-Performance Computing (HPC) community recognizes energy consumption as a major problem. Extensive research is underway to identify energy-efficient building blocks for future HPC systems. This thesis considers one such system, the Texas Instruments Keystone II, a heterogeneous Low-Power System-on-Chip (LPSoC) processor that combines a quad-core ARM CPU with an octa-core Digital Signal Processor (DSP). It was first released in 2012. Four issues are considered: i) maximizing the Keystone II ARM CPU performance; ii) implementation of the OpenMP programming model for the Keystone II; iii) simultaneous use of ARM and DSP cores across multiple Keystone SoCs; and iv) an energy model for applications running on LPSoCs like the Keystone II and heterogeneous systems in general. Maximizing the performance of the ARM CPU on the Keystone II system is fundamental to its adoption by the HPC community. Key to achieving good performance is exploitation of the ARM vector instructions. This thesis presents the first detailed comparison of the use of ARM compiler intrinsic functions with automatic compiler vectorization across four generations of ARM processors. Comparisons are also made with x86-based platforms and the use of equivalent Intel vector instructions. Implementation of the OpenMP programming model on the Keystone II presents both challenges and opportunities: challenges in that the OpenMP model was originally developed for a homogeneous environment, and in 2012 work had only just begun to consider its use with accelerators; opportunities in that shared memory is accessible to all processing elements on the LPSoC. An implementation for the Keystone II that maps OpenMP 4.0 accelerator directives to OpenCL runtime library operations is presented and evaluated. Exploitation of some of the underlying hardware features of the Keystone II is also discussed.
Simultaneous use of the ARM and DSP cores across multiple Keystone II boards is fundamental to the creation of a commercially viable HPC offering. This thesis presents a proof-of-concept implementation of matrix multiplication (GEMM) on such a commercial system, the nCore BrownDwarf. The BrownDwarf utilizes both Keystone II and Keystone I SoCs through a point-to-point interconnect called Hyperlink. Details of how a novel message-passing communication framework across Hyperlink was implemented to support this complex environment are provided. An energy model that can be used to predict energy usage as a function of what fraction of a computation is performed on each of the available compute devices offers the opportunity for making runtime decisions on how best to minimize energy usage. This thesis presents such an energy usage model. Using this model shows that only under certain conditions does there exist an energy-optimal work partition that uses multiple compute devices. To validate the model, a high-resolution energy measurement environment is developed and used to gather energy measurements for a matrix multiplication running on a variety of systems. Results presented support the model. Drawing on the four issues noted above and other developments that have occurred since the Keystone II system was first announced, the thesis concludes by making comments regarding the future of LPSoCs as building blocks for HPC systems.
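
The idea of predicting energy as a function of how work is split between devices can be made concrete with a small sketch. This is not the thesis's model: the `Device` fields, the constant-rate and constant-power assumptions, and the grid search in `best_fraction` are all hypothetical simplifications for a two-device system running both devices concurrently.

```python
from dataclasses import dataclass


@dataclass
class Device:
    rate: float          # work units per second (assumed constant)
    active_power: float  # watts while computing (assumed constant)
    idle_power: float    # watts while waiting for the other device


def energy(fraction: float, work: float, a: Device, b: Device) -> float:
    """Joules used when `fraction` of `work` runs on a and the rest on b."""
    t_a = fraction * work / a.rate
    t_b = (1.0 - fraction) * work / b.rate
    total = max(t_a, t_b)  # devices run concurrently; the slower one sets the runtime
    e_a = a.active_power * t_a + a.idle_power * (total - t_a)
    e_b = b.active_power * t_b + b.idle_power * (total - t_b)
    return e_a + e_b


def best_fraction(work: float, a: Device, b: Device, steps: int = 100) -> float:
    """Grid-search the share of work on device a that minimizes energy."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda f: energy(f, work, a, b))


if __name__ == "__main__":
    # Two identical devices with nonzero idle power: splitting the work wins.
    twin = Device(rate=10.0, active_power=5.0, idle_power=1.0)
    print(best_fraction(100.0, twin, twin))

    # Zero idle power and one far more efficient device: use that device alone.
    frugal = Device(rate=10.0, active_power=5.0, idle_power=0.0)
    hungry = Device(rate=10.0, active_power=20.0, idle_power=0.0)
    print(best_fraction(100.0, frugal, hungry))
```

Under these assumptions, a split across both devices only pays off when idle power is significant relative to the difference in per-unit-work energy, which is consistent with the abstract's remark that a multi-device optimum exists only under certain conditions.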
