Designing a Modern Skeleton Programming Framework for Parallel and Heterogeneous Systems

Title Designing a Modern Skeleton Programming Framework for Parallel and Heterogeneous Systems
Author August Ernstsson
Publisher Linköping University Electronic Press
Pages 155
Release 2020-10-21
Genre
ISBN 9179297722

Today's society is increasingly software-driven and dependent on powerful computer technology. Therefore, it is important that advancements in the low-level processor hardware are made available for exploitation by a growing number of programmers of differing skill levels. However, as we are approaching the end of Moore's law, hardware designers are finding new and increasingly complex ways to increase the accessible processor performance. It is getting more and more difficult to effectively target these processing resources without expert knowledge in parallelization, heterogeneous computation, communication, synchronization, and so on. To ensure that the software side can keep up, advanced programming environments and frameworks are needed to bridge the widening gap between hardware and software. One such example is the pattern-centric skeleton programming model, and in particular the SkePU project. The work presented in this thesis first redesigns the SkePU framework based on modern C++ variadic template metaprogramming and state-of-the-art compiler technology. It then explores new ways to improve performance: by providing new patterns, improving the data access locality of existing ones, and using both static and dynamic knowledge about program flow. The work combines novel ideas with practical evaluation of the approach on several applications. The advancements also include the first skeleton API that allows variadic skeletons, new data containers, and finally an approach to make skeleton programming more customizable without compromising universal portability.
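To give a rough feel for the skeleton programming idea described in this abstract, the following is a minimal Python sketch of a "map" skeleton that accepts any number of input containers and hides the choice of execution backend from the user function. It is not the SkePU API (SkePU is a C++ framework); the names `MapSkeleton`, `backend`, `workers`, and `scaled_sum` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class MapSkeleton:
    """Hypothetical element-wise 'map' pattern over any number of inputs."""

    def __init__(self, user_function, backend="sequential", workers=4):
        if backend not in ("sequential", "threads"):
            raise ValueError(f"unknown backend: {backend}")
        self.user_function = user_function
        self.backend = backend
        self.workers = workers

    def __call__(self, *containers):
        # Variadic interface: one element from each container per call.
        if not containers:
            raise ValueError("at least one input container is required")
        if len({len(c) for c in containers}) != 1:
            raise ValueError("all input containers must have the same length")
        if self.backend == "threads":
            # Same user function, different execution strategy.
            with ThreadPoolExecutor(max_workers=self.workers) as pool:
                return list(pool.map(self.user_function, *containers))
        return [self.user_function(*elems) for elems in zip(*containers)]


def scaled_sum(x, y, z):
    """User function written once, with no knowledge of the backend."""
    return 2.0 * x + y - z


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
c = [0.5, 0.5, 0.5]

sequential_map = MapSkeleton(scaled_sum)
threaded_map = MapSkeleton(scaled_sum, backend="threads")

print(sequential_map(a, b, c))
print(threaded_map(a, b, c))
```

The point of the sketch is only that the user function stays unchanged while the skeleton instance decides how it is executed; a real framework would add further backends and data containers.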

Formal Verification of Tree Ensembles in Safety-Critical Applications

Title Formal Verification of Tree Ensembles in Safety-Critical Applications
Author John Törnblom
Publisher Linköping University Electronic Press
Pages 22
Release 2020-10-28
Genre
ISBN 917929748X

In the presence of data and computational resources, machine learning can be used to synthesize software automatically. For example, machines are now capable of learning complicated pattern recognition tasks and sophisticated decision policies, two key capabilities in autonomous cyber-physical systems. Unfortunately, humans find software synthesized by machine learning algorithms difficult to interpret, which currently limits their use in safety-critical applications such as medical diagnosis and avionic systems. In particular, successful deployments of safety-critical systems mandate the execution of rigorous verification activities, which often rely on human insights, e.g., to identify scenarios in which the system shall be tested. A natural pathway towards a viable verification strategy for such systems is to leverage formal verification techniques, which, in the presence of a formal specification, can provide definitive guarantees with little human intervention. However, formal verification suffers from scalability issues with respect to system complexity. In this thesis, we investigate the limits of current formal verification techniques when applied to a class of machine learning models called tree ensembles, and identify model-specific characteristics that can be exploited to improve the performance of verification algorithms when applied specifically to tree ensembles. To this end, we develop two formal verification techniques specifically for tree ensembles, one fast and conservative technique, and one exact but more computationally demanding. We then combine these two techniques into an abstraction-refinement approach, that we implement in a tool called VoTE (Verifier of Tree Ensembles). Using a couple of case studies, we recognize that sets of inputs that lead to the same system behavior can be captured precisely as hyperrectangles, which enables tractable enumeration of input-output mappings when the input dimension is low. 
Tree ensembles with a high-dimensional input domain, however, seem generally difficult to verify. In some cases, though, conservative approximations of input-output mappings can greatly improve performance. This is demonstrated in a digit recognition case study, where we assess the robustness of classifiers when confronted with additive noise.
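The idea of a fast, conservative approximation over hyperrectangles can be illustrated with a small Python sketch. It is not the VoTE implementation; it assumes a hypothetical representation in which each tree is a nested dict with `feature`, `threshold`, `left`, and `right` keys (or a `value` key at a leaf), the ensemble output is the sum of the tree outputs, and an input set is a box given as a list of `(low, high)` intervals.

```python
def tree_bounds(node, box):
    """Return (min, max) over the leaf values reachable from the box.

    The box is narrowed with closed intervals on both branches, which may
    slightly over-approximate the right branch but never misses a leaf.
    """
    if "value" in node:
        return node["value"], node["value"]
    f, t = node["feature"], node["threshold"]
    lo, hi = box[f]
    results = []
    if lo <= t:  # some point in the box takes the left branch (x[f] <= t)
        left_box = box[:f] + [(lo, min(hi, t))] + box[f + 1:]
        results.append(tree_bounds(node["left"], left_box))
    if hi > t:  # some point in the box takes the right branch (x[f] > t)
        right_box = box[:f] + [(max(lo, t), hi)] + box[f + 1:]
        results.append(tree_bounds(node["right"], right_box))
    return min(r[0] for r in results), max(r[1] for r in results)


def ensemble_bounds(trees, box):
    """Conservative output interval: sum the per-tree bounds independently."""
    per_tree = [tree_bounds(tree, box) for tree in trees]
    return sum(b[0] for b in per_tree), sum(b[1] for b in per_tree)


def predict(trees, x):
    """Exact ensemble output for a single input point."""
    total = 0.0
    for node in trees:
        while "value" not in node:
            branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
            node = node[branch]
        total += node["value"]
    return total


# Two hypothetical one-split trees over a single input feature.
trees = [
    {"feature": 0, "threshold": 0.5, "left": {"value": 1.0}, "right": {"value": 3.0}},
    {"feature": 0, "threshold": 0.5, "left": {"value": 2.0}, "right": {"value": -1.0}},
]

wide = ensemble_bounds(trees, [(0.0, 1.0)])    # box spans both sides of the split
narrow = ensemble_bounds(trees, [(0.0, 0.4)])  # box lies on one side only
print(wide, narrow, predict(trees, [0.2]), predict(trees, [0.9]))
```

Summing per-tree bounds ignores the fact that both trees split on the same feature, so the wide box yields a looser interval than the outputs that can actually occur. This is the kind of imprecision that a refinement step (splitting the box and re-evaluating) would remove, at the cost of more computation.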

Programming Massively Parallel Processors

Title Programming Massively Parallel Processors
Author David B. Kirk
Publisher Newnes
Pages 519
Release 2012-12-31
Genre Computers
ISBN 0123914183

Programming Massively Parallel Processors: A Hands-on Approach, Second Edition, teaches students how to program massively parallel processors. It offers a detailed discussion of various techniques for constructing parallel programs. Case studies are used to demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This guide shows both student and professional alike the basic concepts of parallel programming and GPU architecture. Topics of performance, floating-point format, parallel patterns, and dynamic parallelism are covered in depth. This revised edition contains more parallel programming examples, commonly used libraries such as Thrust, and explanations of the latest tools. It also provides new coverage of CUDA 5.0, improved performance, enhanced development tools, increased hardware support, and more; increased coverage of related technology (OpenCL) and new material on algorithm patterns, GPU clusters, host programming, and data parallelism; and two new case studies (on MRI reconstruction and molecular visualization) that explore the latest applications of CUDA and GPUs for scientific research and high-performance computing. This book should be a valuable resource for advanced students, software engineers, programmers, and hardware engineers.

Advanced Parallel Processing Technologies

Title Advanced Parallel Processing Technologies
Author Ming Xu
Publisher Springer
Pages 782
Release 2007-11-07
Genre Computers
ISBN 3540768378

This book constitutes the refereed proceedings of the 7th International Workshop on Advanced Parallel Processing Technologies, APPT 2007, held in Guangzhou, China, in November 2007. The 78 revised full papers presented were carefully reviewed and selected from 346 submissions. All current aspects in parallel and distributed computing are addressed ranging from hardware and software issues to algorithmic aspects and advanced applications. The papers are organized in topical sections.

并行程序设计 (Parallel Program Design)

Title 并行程序设计 (Parallel Program Design)
Author Foster
Publisher
Pages 381
Release 2002
Genre Computer programming
ISBN 9787115103475

Outstanding information science and technology textbooks from renowned universities abroad

Applications, Tools and Techniques on the Road to Exascale Computing

Title Applications, Tools and Techniques on the Road to Exascale Computing
Author Koen de Bosschere
Publisher IOS Press
Pages 688
Release 2012
Genre Computers
ISBN 1614990409

Single processing units have now reached a point where further major improvements in their performance are restricted by their physical limitations. This is slowing advances at the same time as new scientific challenges are demanding exascale speed. As a result, parallel processing has become key to High Performance Computing (HPC). This book contains the proceedings of the 14th biennial ParCo conference, ParCo2011, held in Ghent, Belgium. The ParCo conferences have traditionally concentrated on three main themes: Algorithms, Architectures and Applications. Nowadays, though, the focus has shifted from traditional multiprocessor topologies to heterogeneous and manycore platforms, incorporating standard CPUs, GPUs (Graphics Processing Units) and FPGAs (Field Programmable Gate Arrays). These platforms are, at a higher abstraction level, integrated in clusters, grids and clouds. The papers presented here reflect this change of focus. New architectures, programming tools and techniques are also explored, and the need for exascale hardware and software was discussed in the industrial session of the conference. This book will be of interest to all those working in parallel computing today and following progress towards the exascale computing of tomorrow.

Scaling High Performance Domain-specific Language Implementation with Delite

Title Scaling High Performance Domain-specific Language Implementation with Delite
Author Hassan Chafi
Publisher
Pages
Release 2014
Genre
ISBN

The multicore era is now in full swing: single-threaded processors with deep, complex pipelines have been replaced with an increasing number of simpler processors. The shift to these multicore designs is motivated by the need for energy efficiency in addition to high performance. There is now mounting evidence that further improvements in energy efficiency and performance will come from heterogeneous hardware. Programming heterogeneous hardware systems is difficult, which limits their utility. Each heterogeneous computing element has its own performance characteristics and pitfalls, and usually comes with its own programming model. This means that applications cannot take advantage of the additional compute power available in these new and emerging systems without a significant parallel programming effort. To reduce the complexity of programming heterogeneous hardware, one viable approach is the use of Domain-Specific Languages (DSLs) to develop algorithms at very high levels of abstraction. A corresponding DSL compiler can then reason about high-level domain knowledge now explicitly encoded in the application and generate efficient implementations of the algorithm for the different heterogeneous computing elements. This shifts most of the burden of parallelization to DSL authors, requiring them to combine domain, programming language implementation, and parallelization expertise. In this thesis, we start by discussing the benefits of using such DSLs for parallel heterogeneous programming. We motivate the need for an infrastructure to simplify the effort required to author these high-performance DSLs. We then present the Delite Compiler Framework and Runtime, the result of our effort in designing and implementing such an infrastructure.
The framework lifts embedded DSL applications to an intermediate representation (IR), performs generic, parallel, and domain-specific optimizations, and generates an execution graph that targets multiple heterogeneous hardware devices. One key component of this framework is a set of IR nodes, called Delite Ops, which simplify DSL parallelization by providing a set of re-usable and extensible parallel execution patterns. We illustrate the usefulness of Delite by showing examples of DSLs that have been implemented using this framework.