A Hands-on Introduction to Big Data Analytics
Title | A Hands-on Introduction to Big Data Analytics PDF eBook |
Author | Funmi Obembe |
Publisher | SAGE Publications Limited |
Pages | 393 |
Release | 2024-02-24 |
Genre | Business & Economics |
ISBN | 1529615925 |
This practical textbook offers a hands-on introduction to big data analytics, helping you to develop the skills required to hit the ground running as a data professional. It complements theoretical foundations with an emphasis on the application of big data analytics, illustrated by real-life examples and datasets. Containing comprehensive coverage of all the key topics in this area, this book uses open-source technologies and examples in Python and Apache Spark.
Learning features include:
- Ethics by Design encourages you to consider data ethics at every stage.
- Industry Insights facilitate a deeper understanding of the link between what you are studying and how it is applied in industry.
- Datasets, questions, and exercises give you the opportunity to apply your learning.
Dr Funmi Obembe is the Head of Technology at the Faculty of Arts, Science and Technology, University of Northampton. Dr Ofer Engel is a Data Scientist at the University of Groningen.
Big Data Science & Analytics
Title | Big Data Science & Analytics PDF eBook |
Author | Arshdeep Bahga |
Publisher | Vpt |
Pages | 544 |
Release | 2016-04-15 |
Genre | Computers |
ISBN | 9780996025546 |
Big data is defined as collections of datasets whose volume, velocity or variety is so large that it is difficult to store, manage, process and analyze the data using traditional databases and data processing tools. We have written this textbook to meet this need at colleges and universities, and also for big data service providers.
A Hands-On Introduction to Data Science
Title | A Hands-On Introduction to Data Science PDF eBook |
Author | Chirag Shah |
Publisher | Cambridge University Press |
Pages | 459 |
Release | 2020-04-02 |
Genre | Business & Economics |
ISBN | 1108472443 |
An introductory textbook offering a low-barrier entry to data science; the hands-on approach will appeal to students from a range of disciplines.
Hands-On Big Data Analytics with PySpark
Title | Hands-On Big Data Analytics with PySpark PDF eBook |
Author | Rudy Lai |
Publisher | Packt Publishing Ltd |
Pages | 172 |
Release | 2019-03-29 |
Genre | Computers |
ISBN | 1838648836 |
Use PySpark to easily crush messy data at scale and discover proven techniques to create testable, immutable, and easily parallelizable Spark jobs.
Key Features
- Work with large amounts of agile data using distributed datasets and in-memory caching
- Source data from all popular data hosting platforms, such as HDFS, Hive, JSON, and S3
- Employ the easy-to-use PySpark API to deploy big data analytics for production
Book Description
Apache Spark is an open source parallel-processing framework that has been around for quite some time now. One of the many uses of Apache Spark is for data analytics applications across clustered computers. In this book, you will not only learn how to use Spark and the Python API to create high-performance analytics with big data, but also discover techniques for testing, immunizing, and parallelizing Spark jobs. You will learn how to source data from all popular data hosting platforms, including HDFS, Hive, JSON, and S3, and deal with large datasets with PySpark to gain practical big data experience. This book will help you work on prototypes on local machines and subsequently go on to handle messy data in production and at scale. This book covers installing and setting up PySpark, RDD operations, big data cleaning and wrangling, and aggregating and summarizing data into useful reports. You will also learn how to implement some practical and proven techniques to improve certain aspects of programming and administration in Apache Spark. By the end of the book, you will be able to build big data analytical solutions using the various PySpark offerings and also optimize them effectively.
What you will learn
- Get practical big data experience while working on messy datasets
- Analyze patterns with Spark SQL to improve your business intelligence
- Use PySpark's interactive shell to speed up development time
- Create highly concurrent Spark programs by leveraging immutability
- Discover ways to avoid the most expensive operation in the Spark API: the shuffle operation
- Re-design your jobs to use reduceByKey instead of groupBy
- Create robust processing pipelines by testing Apache Spark jobs
Who this book is for
This book is for developers, data scientists, business analysts, or anyone who needs to reliably analyze large-scale, real-world data. Whether you're tasked with creating your company's business intelligence function or creating great data platforms for your machine learning models, or are looking to use code to magnify the impact of your business, this book is for you.
A Hands-on Introduction to Big Data Analytics
Title | A Hands-on Introduction to Big Data Analytics PDF eBook |
Author | Funmi Obembe |
Publisher | SAGE Publications Limited |
Pages | 415 |
Release | 2024-02-23 |
Genre | Business & Economics |
ISBN | 1529615909 |
This practical textbook offers a hands-on introduction to big data analytics, helping you to develop the skills required to hit the ground running as a data professional. It complements theoretical foundations with an emphasis on the application of big data analytics, illustrated by real-life examples and datasets. Containing comprehensive coverage of all the key topics in this area, this book uses open-source technologies and examples in Python and Apache Spark.
Learning features include:
- Ethics by Design encourages you to consider data ethics at every stage.
- Industry Insights facilitate a deeper understanding of the link between what you are studying and how it is applied in industry.
- Datasets, questions, and exercises give you the opportunity to apply your learning.
Dr Funmi Obembe is the Head of Technology at the Faculty of Arts, Science and Technology, University of Northampton. Dr Ofer Engel is a Data Scientist at the University of Groningen.
Python Data Science
Title | Python Data Science PDF eBook |
Author | Computer Programming Academy |
Publisher | |
Pages | 202 |
Release | 2020-11-10 |
Genre | |
ISBN | 9781914185106 |
Inside this book you will find all the basic notions to start with Python and all the programming concepts to implement predictive analytics. With our proven strategies you will write efficient Python code in less than a week!
Hands-On Big Data Modeling
Title | Hands-On Big Data Modeling PDF eBook |
Author | James Lee |
Publisher | Packt Publishing Ltd |
Pages | 293 |
Release | 2018-11-30 |
Genre | Computers |
ISBN | 1788626087 |
Solve all big data problems by learning how to create efficient data models
Key Features
- Create effective models that get the most out of big data
- Apply your knowledge to datasets from Twitter and weather data to learn big data
- Tackle different data modeling challenges with expert techniques presented in this book
Book Description
Modeling and managing data is a central focus of all big data projects. In fact, a database is considered to be effective only if you have a logical and sophisticated data model. This book will help you develop practical skills in modeling your own big data projects and improve the performance of analytical queries for your specific business requirements. To start with, you’ll get a quick introduction to big data and understand the different data modeling and data management platforms for big data. Then you’ll work with structured and semi-structured data with the help of real-life examples. Once you’ve got to grips with the basics, you’ll use the SQL Developer Data Modeler to create your own data models containing different file types such as CSV, XML, and JSON. You’ll also learn to create graph data models and explore data modeling with streaming data using real-world datasets. By the end of this book, you’ll be able to design and develop efficient data models for varying data sizes easily and efficiently.
What you will learn
- Get insights into big data and discover various data models
- Explore conceptual, logical, and big data models
- Understand how to model data containing different file types
- Run through data modeling with examples of Twitter, Bitcoin, IMDB and weather data modeling
- Create data models such as Graph Data and Vector Space
- Model structured and unstructured data using Python and R
Who this book is for
This book is great for programmers, geologists, biologists, and every professional who deals with spatial data. If you want to learn how to handle GIS, GPS, and remote sensing data, then this book is for you. Basic knowledge of R and QGIS would be helpful.