Deciphering Data Architectures
Title | Deciphering Data Architectures |
Author | James Serra |
Publisher | "O'Reilly Media, Inc." |
Pages | 262 |
Release | 2024-02-06 |
Genre | Computers |
ISBN | 1098150724 |
Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each. James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs.
With this book, you'll:
- Gain a working understanding of several data architectures
- Learn the strengths and weaknesses of each approach
- Distinguish data architecture theory from reality
- Pick the best architecture for your use case
- Understand the differences between data warehouses and data lakes
- Learn common data architecture concepts to help you build better solutions
- Explore the historical evolution and characteristics of data architectures
- Learn essentials of running an architecture design session, team organization, and project success factors
Free from product discussions, this book will serve as a timeless resource for years to come.
Deciphering Data Architectures
Title | Deciphering Data Architectures |
Author | James Serra |
Publisher | "O'Reilly Media, Inc." |
Pages | 278 |
Release | 2024-02-06 |
Genre | Computers |
ISBN | 1098150732 |
Architecting Modern Data Platforms
Title | Architecting Modern Data Platforms |
Author | Jan Kunigk |
Publisher | "O'Reilly Media, Inc." |
Pages | 688 |
Release | 2018-12-05 |
Genre | Computers |
ISBN | 1491969229 |
There’s a lot of information about big data technologies, but splicing these technologies into an end-to-end enterprise data platform is a daunting task not widely covered. With this practical book, you’ll learn how to build big data infrastructure both on-premises and in the cloud and successfully architect a modern data platform. Ideal for enterprise architects, IT managers, application architects, and data engineers, this book shows you how to overcome the many challenges that emerge during Hadoop projects. You’ll explore the vast landscape of tools available in the Hadoop and big data realm in a thorough technical primer before diving into:
- Infrastructure: Look at all component layers in a modern data platform, from the server to the data center, to establish a solid foundation for data in your enterprise
- Platform: Understand aspects of deployment, operation, security, high availability, and disaster recovery, along with everything you need to know to integrate your platform with the rest of your enterprise IT
- Taking Hadoop to the cloud: Learn the important architectural aspects of running a big data platform in the cloud while maintaining enterprise security and high availability
Data Management at Scale
Title | Data Management at Scale |
Author | Piethein Strengholt |
Publisher | "O'Reilly Media, Inc." |
Pages | 404 |
Release | 2020-07-29 |
Genre | Computers |
ISBN | 1492054739 |
As data management and integration continue to evolve rapidly, storing all your data in one place, such as a data warehouse, is no longer scalable. In the very near future, data will need to be distributed and available for several technological solutions. With this practical book, you’ll learn how to migrate your enterprise from a complex and tightly coupled data landscape to a more flexible architecture ready for the modern world of data consumption. Executives, data architects, analytics teams, and compliance and governance staff will learn how to build a modern scalable data landscape using the Scaled Architecture, which you can introduce incrementally without a large upfront investment. Author Piethein Strengholt provides blueprints, principles, observations, best practices, and patterns to get you up to speed.
- Examine data management trends, including technological developments, regulatory requirements, and privacy concerns
- Go deep into the Scaled Architecture and learn how the pieces fit together
- Explore data governance and data security, master data management, self-service data marketplaces, and the importance of metadata
Data Architecture: A Primer for the Data Scientist
Title | Data Architecture: A Primer for the Data Scientist |
Author | W.H. Inmon |
Publisher | Academic Press |
Pages | 434 |
Release | 2019-04-30 |
Genre | Computers |
ISBN | 0128169176 |
Over the past 5 years, the concept of big data has matured, data science has grown exponentially, and data architecture has become a standard part of organizational decision-making. Throughout all this change, the basic principles that shape the architecture of data have remained the same. There remains a need for people to take a look at the "bigger picture" and to understand where their data fit into the grand scheme of things. Data Architecture: A Primer for the Data Scientist, Second Edition addresses the larger architectural picture of how big data fits within the existing information infrastructure or data warehousing systems. This is an essential topic not only for data scientists, analysts, and managers but also for researchers and engineers who increasingly need to deal with large and complex sets of data. Until data are gathered and can be placed into an existing framework or architecture, they cannot be used to their full potential. Drawing upon years of practical experience and using numerous examples and case studies from across various industries, the authors seek to explain this larger picture into which big data fits, giving data scientists the necessary context for how pieces of the puzzle should fit together.
- New case studies include expanded coverage of textual management and analytics
- New chapters on visualization and big data
- Discussion of new visualizations of the end-state architecture
Data Mesh
Title | Data Mesh |
Author | Zhamak Dehghani |
Publisher | "O'Reilly Media, Inc." |
Pages | 387 |
Release | 2022-03-08 |
Genre | Computers |
ISBN | 1492092363 |
Many enterprises are investing in a next-generation data lake, hoping to democratize data at scale to provide business insights and ultimately make automated intelligent decisions. In this practical book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today's organizations. A distributed data mesh is a better choice. Dehghani guides architects, technical leaders, and decision makers on their journey from monolithic big data architecture to a sociotechnical paradigm that draws from modern distributed architecture. A data mesh considers domains as a first-class concern, applies platform thinking to create self-serve data infrastructure, treats data as a product, and introduces a federated and computational model of data governance. This book shows you why and how.
- Examine the current data landscape from the perspective of business and organizational needs, environmental challenges, and existing architectures
- Analyze the landscape's underlying characteristics and failure modes
- Get a complete introduction to data mesh principles and its constituents
- Learn how to design a data mesh architecture
- Move beyond a monolithic data lake to a distributed data mesh
Foundations for Architecting Data Solutions
Title | Foundations for Architecting Data Solutions |
Author | Ted Malaska |
Publisher | "O'Reilly Media, Inc." |
Pages | 196 |
Release | 2018-08-29 |
Genre | Computers |
ISBN | 1492038695 |
While many companies ponder implementation details such as distributed processing engines and algorithms for data analysis, this practical book takes a much wider view of big data development, starting with initial planning and moving diligently toward execution. Authors Ted Malaska and Jonathan Seidman guide you through the major components necessary to start, architect, and develop successful big data projects. Everyone from CIOs and COOs to lead architects and developers will explore a variety of big data architectures and applications, from massive data pipelines to web-scale applications. Each chapter addresses a piece of the software development life cycle and identifies patterns to maximize long-term success throughout the life of your project.
- Start the planning process by considering the key data project types
- Use guidelines to evaluate and select data management solutions
- Reduce risk related to technology, your team, and vague requirements
- Explore system interface design using APIs, REST, and pub/sub systems
- Choose the right distributed storage system for your big data system
- Plan and implement metadata collections for your data architecture
- Use data pipelines to ensure data integrity from source to final storage
- Evaluate the attributes of various engines for processing the data you collect