R for Data Science
Title | R for Data Science PDF eBook |
Author | Hadley Wickham |
Publisher | "O'Reilly Media, Inc." |
Pages | 521 |
Release | 2016-12-12 |
Genre | Computers |
ISBN | 1491910364 |
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to:
Wrangle: transform your datasets into a form convenient for analysis
Program: learn powerful R tools for solving data problems with greater clarity and ease
Explore: examine your data, generate hypotheses, and quickly test them
Model: provide a low-dimensional summary that captures true "signals" in your dataset
Communicate: learn R Markdown for integrating prose, code, and results
Best Practices in Data Cleaning
Title | Best Practices in Data Cleaning PDF eBook |
Author | Jason W. Osborne |
Publisher | SAGE |
Pages | 297 |
Release | 2013 |
Genre | Mathematics |
ISBN | 1412988012 |
Many researchers jump straight from data collection to data analysis without realizing how analyses and hypothesis tests can go profoundly wrong without clean data. This book provides a clear, step-by-step process of examining and cleaning data in order to decrease error rates and increase both the power and replicability of results. Jason W. Osborne, author of Best Practices in Quantitative Methods (SAGE, 2008), provides easily implemented suggestions that are research-based and will motivate change in practice by empirically demonstrating, for each topic, the benefits of following best practices and the potential consequences of not following these guidelines. If your goal is to do the best research you can do, draw conclusions that are most likely to be accurate representations of the population(s) you wish to speak about, and report results that are most likely to be replicated by other researchers, then this basic guidebook will be indispensable.
A Beginner’s Guide to Using Open Access Data
Title | A Beginner’s Guide to Using Open Access Data PDF eBook |
Author | Saif Aldeen Saleh AlRyalat |
Publisher | CRC Press |
Pages | 110 |
Release | 2019-02-12 |
Genre | Computers |
ISBN | 0429667671 |
Open Access Data is emerging as a source for cutting-edge scholarship. This concise book provides guidance from generating a research idea to publishing results. Both young researchers and well-established scholars can use this book to upgrade their skills with respect to emerging data sources, analysis, and even post-publishing promotion. At the end of each chapter, a tutorial simulates a real example, allowing readers to apply what they learned about accessing open data and analyzing it to reach results. This book can be of use to established researchers analyzing data, publishing, and actively promoting ongoing research. Key selling features:
Describes the steps, from A to Z, for doing open data research
Includes interactive tutorials following each chapter
Provides guidelines for readers so that they can use their own accessed open data
Reviews recent software and websites promoting and enabling open data research
Supplements websites which update recent open data sources
Mining of Massive Datasets
Title | Mining of Massive Datasets PDF eBook |
Author | Jure Leskovec |
Publisher | Cambridge University Press |
Pages | 480 |
Release | 2014-11-13 |
Genre | Computers |
ISBN | 1107077230 |
Now in its second edition, this book focuses on practical algorithms for mining data from even the largest datasets.
OpenIntro Statistics
Title | OpenIntro Statistics PDF eBook |
Author | David Diez |
Publisher | |
Pages | |
Release | 2015-07-02 |
Genre | |
ISBN | 9781943450046 |
The OpenIntro project was founded in 2009 to improve the quality and availability of education by producing exceptional books and teaching tools that are free to use and easy to modify. We feature real data whenever possible, and files for the entire textbook are freely available at openintro.org. On the website we also provide free videos, statistical software labs, lecture slides, course management tools, and many other helpful resources.
Handbook on Using Administrative Data for Research and Evidence-based Policy
Title | Handbook on Using Administrative Data for Research and Evidence-based Policy PDF eBook |
Author | Shawn Cole |
Publisher | Abdul Latif Jameel Poverty Action Lab |
Pages | 618 |
Release | 2021 |
Genre | |
ISBN | 9781736021606 |
This Handbook intends to inform Data Providers and researchers on how to provide privacy-protected access to, handle, and analyze administrative data, and to link them with existing resources, such as a database of data use agreements (DUA) and templates. Available publicly, the Handbook will provide guidance on data access requirements and procedures, data privacy, data security, property rights, regulations for public data use, data architecture, data use and storage, cost structure and recovery, ethics and privacy-protection, making data accessible for research, and dissemination for restricted access use. The knowledge base will serve as a resource for all researchers looking to work with administrative data and for Data Providers looking to make such data available.
The State of Open Data
Title | The State of Open Data PDF eBook |
Author | Davies, Tim |
Publisher | African Minds |
Pages | 592 |
Release | 2019-05-22 |
Genre | Social Science |
ISBN | 1928331955 |
It’s been ten years since open data first broke onto the global stage. Over the past decade, thousands of programmes and projects around the world have worked to open data and use it to address a myriad of social and economic challenges. Meanwhile, issues related to data rights and privacy have moved to the centre of public and political discourse. As the open data movement enters a new phase in its evolution, shifting to target real-world problems and embed open data thinking into other existing or emerging communities of practice, big questions still remain. How will open data initiatives respond to new concerns about privacy, inclusion, and artificial intelligence? And what can we learn from the last decade in order to deliver impact where it is most needed? The State of Open Data brings together over 60 authors from around the world to address these questions and to take stock of the real progress made to date across sectors and around the world, uncovering the issues that will shape the future of open data in the years to come.