Data and Information Quality

Data and Information Quality
Title Data and Information Quality PDF eBook
Author Carlo Batini
Publisher Springer
Pages 520
Release 2016-03-23
Genre Computers
ISBN 3319241060

Download Data and Information Quality Book in PDF, Epub and Kindle

This book provides a systematic and comparative description of the vast number of research issues related to the quality of data and information. It does so by delivering a sound, integrated and comprehensive overview of the state of the art and future development of data and information quality in databases and information systems. To this end, it presents an extensive description of the techniques that constitute the core of data and information quality research, including record linkage (also called object identification), data integration, error localization and correction, and examines the related techniques in a comprehensive and original methodological framework. Quality dimension definitions and adopted models are also analyzed in detail, and differences between the proposed solutions are highlighted and discussed. Furthermore, while systematically describing data and information quality as an autonomous research area, paradigms and influences deriving from other areas, such as probability theory, statistical data analysis, data mining, knowledge representation, and machine learning are also included. Last not least, the book also highlights very practical solutions, such as methodologies, benchmarks for the most effective techniques, case studies, and examples. The book has been written primarily for researchers in the fields of databases and information management or in natural sciences who are interested in investigating properties of data and information that have an impact on the quality of experiments, processes and on real life. The material presented is also sufficiently self-contained for masters or PhD-level courses, and it covers all the fundamentals and topics without the need for other textbooks. Data and information system administrators and practitioners, who deal with systems exposed to data-quality issues and as a result need a systematization of the field and practical methods in the area, will also benefit from the combination of concrete practical approaches with sound theoretical formalisms.

Handbook of Data Quality

Handbook of Data Quality
Title Handbook of Data Quality PDF eBook
Author Shazia Sadiq
Publisher Springer Science & Business Media
Pages 440
Release 2013-08-13
Genre Computers
ISBN 3642362575

Download Handbook of Data Quality Book in PDF, Epub and Kindle

The issue of data quality is as old as data itself. However, the proliferation of diverse, large-scale and often publically available data on the Web has increased the risk of poor data quality and misleading data interpretations. On the other hand, data is now exposed at a much more strategic level e.g. through business intelligence systems, increasing manifold the stakes involved for individuals, corporations as well as government agencies. There, the lack of knowledge about data accuracy, currency or completeness can have erroneous and even catastrophic results. With these changes, traditional approaches to data management in general, and data quality control specifically, are challenged. There is an evident need to incorporate data quality considerations into the whole data cycle, encompassing managerial/governance as well as technical aspects. Data quality experts from research and industry agree that a unified framework for data quality management should bring together organizational, architectural and computational approaches. Accordingly, Sadiq structured this handbook in four parts: Part I is on organizational solutions, i.e. the development of data quality objectives for the organization, and the development of strategies to establish roles, processes, policies, and standards required to manage and ensure data quality. Part II, on architectural solutions, covers the technology landscape required to deploy developed data quality management processes, standards and policies. Part III, on computational solutions, presents effective and efficient tools and techniques related to record linkage, lineage and provenance, data uncertainty, and advanced integrity constraints. Finally, Part IV is devoted to case studies of successful data quality initiatives that highlight the various aspects of data quality in action. The individual chapters present both an overview of the respective topic in terms of historical research and/or practice and state of the art, as well as specific techniques, methodologies and frameworks developed by the individual contributors. Researchers and students of computer science, information systems, or business management as well as data professionals and practitioners will benefit most from this handbook by not only focusing on the various sections relevant to their research area or particular practical work, but by also studying chapters that they may initially consider not to be directly relevant to them, as there they will learn about new perspectives and approaches.

Data Quality

Data Quality
Title Data Quality PDF eBook
Author Jack E. Olson
Publisher Elsevier
Pages 313
Release 2003-01-09
Genre Computers
ISBN 0080503691

Download Data Quality Book in PDF, Epub and Kindle

Data Quality: The Accuracy Dimension is about assessing the quality of corporate data and improving its accuracy using the data profiling method. Corporate data is increasingly important as companies continue to find new ways to use it. Likewise, improving the accuracy of data in information systems is fast becoming a major goal as companies realize how much it affects their bottom line. Data profiling is a new technology that supports and enhances the accuracy of databases throughout major IT shops. Jack Olson explains data profiling and shows how it fits into the larger picture of data quality.* Provides an accessible, enjoyable introduction to the subject of data accuracy, peppered with real-world anecdotes. * Provides a framework for data profiling with a discussion of analytical tools appropriate for assessing data accuracy. * Is written by one of the original developers of data profiling technology. * Is a must-read for any data management staff, IT management staff, and CIOs of companies with data assets.

Data Quality

Data Quality
Title Data Quality PDF eBook
Author Richard Y. Wang
Publisher Springer Science & Business Media
Pages 175
Release 2006-04-11
Genre Computers
ISBN 0306469871

Download Data Quality Book in PDF, Epub and Kindle

Data Quality provides an exposé of research and practice in the data quality field for technically oriented readers. It is based on the research conducted at the MIT Total Data Quality Management (TDQM) program and work from other leading research institutions. This book is intended primarily for researchers, practitioners, educators and graduate students in the fields of Computer Science, Information Technology, and other interdisciplinary areas. It forms a theoretical foundation that is both rigorous and relevant for dealing with advanced issues related to data quality. Written with the goal to provide an overview of the cumulated research results from the MIT TDQM research perspective as it relates to database research, this book is an excellent introduction to Ph.D. who wish to further pursue their research in the data quality area. It is also an excellent theoretical introduction to IT professionals who wish to gain insight into theoretical results in the technically-oriented data quality area, and apply some of the key concepts to their practice.

Information Quality

Information Quality
Title Information Quality PDF eBook
Author Ron S. Kenett
Publisher John Wiley & Sons
Pages 381
Release 2016-12-19
Genre Mathematics
ISBN 1118874447

Download Information Quality Book in PDF, Epub and Kindle

Provides an important framework for data analysts in assessing the quality of data and its potential to provide meaningful insights through analysis Analytics and statistical analysis have become pervasive topics, mainly due to the growing availability of data and analytic tools. Technology, however, fails to deliver insights with added value if the quality of the information it generates is not assured. Information Quality (InfoQ) is a tool developed by the authors to assess the potential of a dataset to achieve a goal of interest, using data analysis. Whether the information quality of a dataset is sufficient is of practical importance at many stages of the data analytics journey, from the pre-data collection stage to the post-data collection and post-analysis stages. It is also critical to various stakeholders: data collection agencies, analysts, data scientists, and management. This book: Explains how to integrate the notions of goal, data, analysis and utility that are the main building blocks of data analysis within any domain. Presents a framework for integrating domain knowledge with data analysis. Provides a combination of both methodological and practical aspects of data analysis. Discusses issues surrounding the implementation and integration of InfoQ in both academic programmes and business / industrial projects. Showcases numerous case studies in a variety of application areas such as education, healthcare, official statistics, risk management and marketing surveys. Presents a review of software tools from the InfoQ perspective along with example datasets on an accompanying website. This book will be beneficial for researchers in academia and in industry, analysts, consultants, and agencies that collect and analyse data as well as undergraduate and postgraduate courses involving data analysis.

Data Quality

Data Quality
Title Data Quality PDF eBook
Author Carlo Batini
Publisher Springer Science & Business Media
Pages 276
Release 2006-09-27
Genre Computers
ISBN 3540331735

Download Data Quality Book in PDF, Epub and Kindle

Poor data quality can seriously hinder or damage the efficiency and effectiveness of organizations and businesses. The growing awareness of such repercussions has led to major public initiatives like the "Data Quality Act" in the USA and the "European 2003/98" directive of the European Parliament. Batini and Scannapieco present a comprehensive and systematic introduction to the wide set of issues related to data quality. They start with a detailed description of different data quality dimensions, like accuracy, completeness, and consistency, and their importance in different types of data, like federated data, web data, or time-dependent data, and in different data categories classified according to frequency of change, like stable, long-term, and frequently changing data. The book's extensive description of techniques and methodologies from core data quality research as well as from related fields like data mining, probability theory, statistical data analysis, and machine learning gives an excellent overview of the current state of the art. The presentation is completed by a short description and critical comparison of tools and practical methodologies, which will help readers to resolve their own quality problems. This book is an ideal combination of the soundness of theoretical foundations and the applicability of practical approaches. It is ideally suited for everyone – researchers, students, or professionals – interested in a comprehensive overview of data quality issues. In addition, it will serve as the basis for an introductory course or for self-study on this topic.

Executing Data Quality Projects

Executing Data Quality Projects
Title Executing Data Quality Projects PDF eBook
Author Danette McGilvray
Publisher Academic Press
Pages 378
Release 2021-05-27
Genre Computers
ISBN 0128180161

Download Executing Data Quality Projects Book in PDF, Epub and Kindle

Executing Data Quality Projects, Second Edition presents a structured yet flexible approach for creating, improving, sustaining and managing the quality of data and information within any organization. Studies show that data quality problems are costing businesses billions of dollars each year, with poor data linked to waste and inefficiency, damaged credibility among customers and suppliers, and an organizational inability to make sound decisions. Help is here! This book describes a proven Ten Step approach that combines a conceptual framework for understanding information quality with techniques, tools, and instructions for practically putting the approach to work – with the end result of high-quality trusted data and information, so critical to today's data-dependent organizations. The Ten Steps approach applies to all types of data and all types of organizations – for-profit in any industry, non-profit, government, education, healthcare, science, research, and medicine. This book includes numerous templates, detailed examples, and practical advice for executing every step. At the same time, readers are advised on how to select relevant steps and apply them in different ways to best address the many situations they will face. The layout allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, best practices, and warnings. The experience of actual clients and users of the Ten Steps provide real examples of outputs for the steps plus highlighted, sidebar case studies called Ten Steps in Action. This book uses projects as the vehicle for data quality work and the word broadly to include: 1) focused data quality improvement projects, such as improving data used in supply chain management, 2) data quality activities in other projects such as building new applications and migrating data from legacy systems, integrating data because of mergers and acquisitions, or untangling data due to organizational breakups, and 3) ad hoc use of data quality steps, techniques, or activities in the course of daily work. The Ten Steps approach can also be used to enrich an organization's standard SDLC (whether sequential or Agile) and it complements general improvement methodologies such as six sigma or lean. No two data quality projects are the same but the flexible nature of the Ten Steps means the methodology can be applied to all. The new Second Edition highlights topics such as artificial intelligence and machine learning, Internet of Things, security and privacy, analytics, legal and regulatory requirements, data science, big data, data lakes, and cloud computing, among others, to show their dependence on data and information and why data quality is more relevant and critical now than ever before. - Includes concrete instructions, numerous templates, and practical advice for executing every step of The Ten Steps approach - Contains real examples from around the world, gleaned from the author's consulting practice and from those who implemented based on her training courses and the earlier edition of the book - Allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, and best practices - A companion Web site includes links to numerous data quality resources, including many of the templates featured in the text, quick summaries of key ideas from the Ten Steps methodology, and other tools and information that are available online