Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design
Title | Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design PDF eBook |
Author | Xiaowei Li |
Publisher | Springer Nature |
Pages | 318 |
Release | 2023-03-01 |
Genre | Computers |
ISBN | 9811985510 |
With the end of Dennard scaling and Moore’s law, IC chips, especially large-scale ones, now face more reliability challenges, and reliability has become one of the mainstay merits of VLSI designs. In this context, this book presents a built-in on-chip fault-tolerant computing paradigm that seeks to combine fault detection, fault diagnosis, and error recovery in large-scale VLSI design in a unified manner so as to minimize resource overhead and performance penalties. Following this computing paradigm, we propose a holistic solution based on three key components: self-test, self-diagnosis and self-repair, or “3S” for short. We then explore the use of 3S for general IC designs, general-purpose processors, network-on-chip (NoC) and deep learning accelerators, and present prototypes to demonstrate how 3S responds to in-field silicon degradation and recovery under various runtime faults caused by aging, process variations, or radiation particles. Moreover, we demonstrate that 3S not only offers a powerful backbone for various on-chip fault-tolerant designs and implementations, but also has farther-reaching implications such as maintaining graceful performance degradation, mitigating the impact of verification blind spots, and improving chip yield. This book is the outcome of extensive fault-tolerant computing research pursued at the State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences over the past decade. The proposed built-in on-chip fault-tolerant computing paradigm has been verified in a broad range of scenarios, from small processors in satellite computers to large processors in HPCs. Hopefully, it will provide an alternative yet effective solution to the growing reliability challenges for large-scale VLSI designs.
Fault Tolerant Computer Architecture
Title | Fault Tolerant Computer Architecture PDF eBook |
Author | Daniel Sorin |
Publisher | Morgan & Claypool Publishers |
Pages | 116 |
Release | 2009-07-08 |
Genre | Technology & Engineering |
ISBN | 1598299549 |
For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes of this book are to explore the key ideas in fault-tolerant computer architecture and to present the current state of the art (over approximately the past 10 years) in academia and industry. Table of Contents: Introduction / Error Detection / Error Recovery / Diagnosis / Self-Repair / The Future
Software-Implemented Hardware Fault Tolerance
Title | Software-Implemented Hardware Fault Tolerance PDF eBook |
Author | Olga Goloubeva |
Publisher | Springer Science & Business Media |
Pages | 238 |
Release | 2006-09-19 |
Genre | Technology & Engineering |
ISBN | 0387329374 |
This book presents the theory behind software-implemented hardware fault tolerance, as well as the practical aspects needed to put it to work on real examples. By accurately evaluating the advantages and disadvantages of the available approaches, the book provides a guide for developers willing to adopt software-implemented hardware fault tolerance in their applications. Moreover, the book identifies open issues for researchers willing to improve the available techniques.
Cities and Their Vital Systems
Title | Cities and Their Vital Systems PDF eBook |
Author | Advisory Committee on Technology and Society |
Publisher | National Academies Press |
Pages | 1298 |
Release | 1989 |
Genre | Social Science |
ISBN | 9780309037860 |
Cities and Their Vital Systems asks basic questions about the longevity, utility, and nature of urban infrastructures; analyzes how they grow, interact, and change; and asks how, when, and at what cost they should be replaced. Among the topics discussed are problems arising from increasing air travel and airport congestion; the adequacy of water supplies and waste treatment; the impact of new technologies on construction; urban real estate values; and the field of "telematics," the combination of computers and telecommunications that makes money machines and national newspapers possible.
Networks on Chip
Title | Networks on Chip PDF eBook |
Author | Axel Jantsch |
Publisher | Springer Science & Business Media |
Pages | 304 |
Release | 2007-05-08 |
Genre | Computers |
ISBN | 0306487276 |
As the number of processor cores and IP blocks integrated on a single chip is steadily growing, a systematic approach to designing the communication infrastructure becomes necessary. Different variants of packet-switched on-chip networks have been proposed by several groups during the past two years. This book summarizes the state of the art of these efforts and discusses the major issues from the physical integration to architecture to operating systems and application interfaces. It also provides a guideline and vision about the direction in which this field is moving. Moreover, the book outlines the consequences of adopting design platforms based on packet-switched networks. The consequences may in fact be far reaching because many of the topics of distributed systems, distributed real-time systems, fault-tolerant systems, parallel computer architecture, parallel programming as well as traditional system-on-chip issues will appear relevant but within the constraints of a single-chip VLSI implementation.
Government Reports Announcements & Index
Title | Government Reports Announcements & Index PDF eBook |
Author | |
Publisher | |
Pages | 1002 |
Release | 1989 |
Genre | Government publications |
ISBN |
Fault-Tolerant Design
Title | Fault-Tolerant Design PDF eBook |
Author | Elena Dubrova |
Publisher | Springer Science & Business Media |
Pages | 195 |
Release | 2013-03-15 |
Genre | Technology & Engineering |
ISBN | 1461421136 |
This textbook serves as an introduction to fault tolerance, intended for upper-division undergraduate students, graduate-level students and practicing engineers in need of an overview of the field. Readers will develop skills in modeling and evaluating fault-tolerant architectures in terms of reliability, availability and safety. They will gain a thorough understanding of fault-tolerant computers, including both the theory of how to design and evaluate them and the practical knowledge of achieving fault tolerance in electronic, communication and software systems. Coverage includes fault-tolerance techniques through hardware, software, information and time redundancy. The content is designed to be highly accessible, including numerous examples and exercises. Solutions and PowerPoint slides are available for instructors.