An Open-Source Research Platform for Heterogeneous Systems on Chip

Title An Open-Source Research Platform for Heterogeneous Systems on Chip
Author Andreas Dominic Kurth
Publisher BoD – Books on Demand
Pages 282
Release 2022-10-05
Genre Science
ISBN 3866287747

Heterogeneous systems on chip (HeSoCs) combine general-purpose, feature-rich multi-core host processors with domain-specific programmable many-core accelerators (PMCAs) to unite versatility with energy efficiency and peak performance. By virtue of their heterogeneity, HeSoCs hold the promise of increasing performance and energy efficiency compared to homogeneous multiprocessors, because applications can be executed on hardware that is designed for them. However, this heterogeneity also increases system complexity substantially. This thesis presents the first research platform for HeSoCs where all components, from accelerator cores to application programming interface, are available under permissive open-source licenses. We begin by identifying the hardware and software components that are required in HeSoCs and by designing a representative hardware and software architecture. We then design, implement, and evaluate four critical HeSoC components that have not been discussed in research at the level required for an open-source implementation: First, we present a modular, topology-agnostic, high-performance on-chip communication platform, which adheres to a state-of-the-art industry-standard protocol. We show that the platform can be used to build high-bandwidth (e.g., 2.5 GHz and 1024 bit data width) end-to-end communication fabrics with high degrees of concurrency (e.g., up to 256 independent concurrent transactions). Second, we present a modular and efficient solution for implementing atomic memory operations in highly-scalable many-core processors, which demonstrates near-optimal linear throughput scaling for various synthetic and real-world workloads and requires only 0.5 kGE per core. 
Third, we present a hardware-software solution for shared virtual memory that avoids the majority of translation lookaside buffer misses with prefetching, supports parallel burst transfers without additional buffers, and can be scaled with the workload and number of parallel processors. Our work improves accelerator performance for memory-intensive kernels by up to 4×. Fourth, we present a software toolchain for mixed-data-model heterogeneous compilation and OpenMP offloading. Our work enables transparent memory sharing between a 64-bit host processor and a 32-bit accelerator at overheads below 0.7 % compared to 32-bit-only execution. Finally, we combine our contributions into a research platform for state-of-the-art HeSoCs and demonstrate its performance and flexibility.

Fighting Back the Von Neumann Bottleneck with Small- and Large-Scale Vector Microprocessors

Title Fighting Back the Von Neumann Bottleneck with Small- and Large-Scale Vector Microprocessors
Author Matheus Cavalcante
Publisher BoD – Books on Demand
Pages 224
Release 2023-08-24
Genre
ISBN 3866288018

In his seminal Turing Award Lecture, Backus discussed the issues stemming from the word-at-a-time style of programming inherited from the von Neumann computer. More than forty years later, computer architects must be creative to amortize the von Neumann Bottleneck (VNB) associated with fetching and decoding instructions which only keep the datapath busy for a very short period of time. In particular, vector processors promise to be one of the most efficient architectures to tackle the VNB, by amortizing the energy overhead of instruction fetching and decoding over several chunks of data. This work explores vector processing as an option to build small and efficient processing elements for large-scale clusters of cores sharing access to tightly-coupled L1 memory.

An Event-Driven Parallel-Processing Subsystem for Energy-Efficient Mobile Medical Instrumentation

Title An Event-Driven Parallel-Processing Subsystem for Energy-Efficient Mobile Medical Instrumentation
Author Florian Stefan Glaser
Publisher BoD – Books on Demand
Pages 216
Release 2022-12-02
Genre Technology & Engineering
ISBN 3866287771

An aging population and the resulting ever-rising cost of health services call for novel and innovative solutions for providing medical care and services. So far, medical care is primarily provided in the form of time-consuming in-person appointments with trained personnel and expensive, stationary instrumentation equipment. As for many current and past challenges, the advances in microelectronics are a crucial enabler and offer a plethora of opportunities. With key building blocks such as sensing, processing, and communication systems and circuits getting smaller, cheaper, and more energy-efficient, personal and wearable or even implantable point-of-care devices with medical-grade instrumentation capabilities become feasible. Device size and battery lifetime are paramount for the realization of such devices. Besides integrating the required functionality into as few individual microelectronic components as possible, the energy efficiency of these components is crucial to reduce battery size, which is usually the dominant contributor to overall device size. In this thesis, we present two major contributions to achieve the discussed goals in the context of miniaturized medical instrumentation: First, we present a synchronization solution for embedded, parallel near-threshold computing (NTC), a promising concept for enabling the required processing capabilities with an energy efficiency that is suitable for highly mobile devices with very limited battery capacity. Our proposed solution aims at increasing energy efficiency and performance for parallel NTC clusters by maximizing the effective utilization of the available cores under parallel workloads. We describe a hardware unit that enables fine-grain parallelization by greatly optimizing and accelerating core-to-core synchronization and communication and analyze the impact of those mechanisms on the overall performance and energy efficiency of an eight-core cluster.
With a range of digital signal processing (DSP) applications typical for the targeted systems, the proposed hardware unit improves performance by up to 92% (23% on average) and energy efficiency by up to 98% (39% on average). In the second part, we present an MCU processing and control subsystem (MPCS) for integration into VivoSoC, a highly versatile single-chip solution for mobile medical instrumentation. In addition to the MPCS, VivoSoC includes a multitude of analog front-ends (AFEs) and a multi-channel power management IC (PMIC) for voltage conversion. ...

Heterogeneous System Architecture

Title Heterogeneous System Architecture
Author Wen-mei W. Hwu
Publisher Morgan Kaufmann
Pages 207
Release 2015-11-20
Genre Computers
ISBN 0128008016

Heterogeneous System Architecture: A New Compute Platform Infrastructure presents a next-generation hardware platform, and associated software, that allows processors of different types to work efficiently and cooperatively in shared memory from a single source program. HSA also defines a virtual ISA for parallel routines or kernels, which is vendor- and ISA-independent, thus enabling single-source programs to execute across any HSA-compliant heterogeneous processor, from those used in smartphones to supercomputers. The book begins with an overview of the evolution of heterogeneous parallel processing, associated problems, and how they are overcome with HSA. Later chapters provide a deeper perspective on topics such as the runtime, memory model, queuing, context switching, the architected queuing language, simulators, and tool chains. Finally, three real-world examples are presented, which provide an early demonstration of how HSA can deliver significantly higher performance through C++-based applications. Contributing authors are HSA Foundation members who are experts from both academia and industry. Some of these distinguished authors are listed here in alphabetical order: Yeh-Ching Chung, Benedict R. Gaster, Juan Gómez-Luna, Derek Hower, Lee Howes, Shih-Hao Hung, Thomas B. Jablin, David Kaeli, Phil Rogers, Ben Sander, I-Jui (Ray) Sung.
- Provides clear and concise explanations of key HSA concepts and fundamentals by expert HSA Specification contributors
- Explains how performance-bound programming algorithms and application types can be significantly optimized by utilizing HSA hardware and software features
- Presents HSA simply, clearly, and concisely, so that readers do not have to work through the detailed HSA Specification documents
- Demonstrates ideal mapping of processing resources from CPUs to many other heterogeneous processors that comply with HSA Specifications

Third Many-core Applications Research Community (MARC) Symposium

Title Third Many-core Applications Research Community (MARC) Symposium
Author Diana Göhringer
Publisher KIT Scientific Publishing
Pages 122
Release 2011
Genre
ISBN 3866447175

Silicon Systems For Wireless Lan

Title Silicon Systems For Wireless Lan
Author Zoran Stamenkovic
Publisher World Scientific
Pages 430
Release 2020-11-27
Genre Computers
ISBN 981121073X

Today's integrated silicon circuits and systems for wireless communications are of a huge complexity. This unique compendium covers all the steps (from the system-level to the transistor-level) necessary to design, model, verify, implement, and test a silicon system. It bridges the gap between the system-world and the transistor-world (between communication, system, circuit, device, and test engineers). It is extremely important nowadays (and will be more important in the future) for communication, system, and circuit engineers to understand the physical implications of system and circuit solutions based on hardware/software co-design as well as for device and test engineers to cope with the system and circuit requirements in terms of power, speed, and data throughput.

Design and Architectures for Signal and Image Processing

Title Design and Architectures for Signal and Image Processing
Author Tiago Dias
Publisher Springer Nature
Pages 131
Release
Genre
ISBN 3031628748
