IBM Spectrum Discover: Metadata Management for Deep Insight of Unstructured Storage
Title | IBM Spectrum Discover: Metadata Management for Deep Insight of Unstructured Storage PDF eBook |
Author | Joseph Dain |
Publisher | IBM Redbooks |
Pages | 152 |
Release | 2019-10-01 |
Genre | Computers |
ISBN | 0738457868 |
This IBM® Redpaper publication provides a comprehensive overview of the IBM Spectrum® Discover metadata management software platform. We give a detailed explanation of how the product creates, collects, and analyzes metadata. Several in-depth use cases are used that show examples of analytics, governance, and optimization. We also provide step-by-step information to install and set up the IBM Spectrum Discover trial environment. More than 80% of all data that is collected by organizations is not in a standard relational database. Instead, it is trapped in unstructured documents, social media posts, machine logs, and so on. Many organizations face significant challenges to manage this deluge of unstructured data such as: Pinpointing and activating relevant data for large-scale analytics Lacking the fine-grained visibility that is needed to map data to business priorities Removing redundant, obsolete, and trivial (ROT) data Identifying and classifying sensitive data IBM Spectrum Discover is a modern metadata management software that provides data insight for petabyte-scale file and Object Storage, storage on premises, and in the cloud. This software enables organizations to make better business decisions and gain and maintain a competitive advantage. IBM Spectrum Discover provides a rich metadata layer that enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of unstructured data. It improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research.
IBM Spectrum Discover
Title | IBM Spectrum Discover PDF eBook |
Author | Joe Dain |
Publisher | |
Pages | |
Release | 2019 |
Genre | Database management |
ISBN |
Making Data Smarter with IBM Spectrum Discover: Practical AI Solutions
Title | Making Data Smarter with IBM Spectrum Discover: Practical AI Solutions PDF eBook |
Author | Ivaylo B. Bozhinov |
Publisher | IBM Redbooks |
Pages | 170 |
Release | 2020-10-19 |
Genre | Computers |
ISBN | 0738459135 |
More than 80% of all data that is collected by organizations is not in a standard relational database. Instead, it is trapped in unstructured documents, social media posts, machine logs, and so on. Many organizations face significant challenges to manage this deluge of unstructured data, such as the following examples: Pinpointing and activating relevant data for large-scale analytics Lacking the fine-grained visibility that is needed to map data to business priorities Removing redundant, obsolete, and trivial (ROT) data Identifying and classifying sensitive data IBM® Spectrum Discover is a modern metadata management software that provides data insight for petabyte-scale file and Object Storage, storage on-premises, and in the cloud. This software enables organizations to make better business decisions and gain and maintain a competitive advantage. IBM Spectrum® Discover provides a rich metadata layer that enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of unstructured data. It improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research. This IBM Redbooks® publication presents several use cases that are focused on artificial intelligence (AI) solutions with IBM Spectrum Discover. This book helps storage administrators and technical specialists plan and implement AI solutions by using IBM Spectrum Discover and several other IBM Storage products.
Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover
Title | Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover PDF eBook |
Author | Joseph Dain |
Publisher | IBM Redbooks |
Pages | 108 |
Release | 2020-08-11 |
Genre | Computers |
ISBN | 073845902X |
This IBM® Redpaper publication explains how IBM Spectrum® Discover integrates with the IBM Watson® Knowledge Catalog (WKC) component of IBM Cloud® Pak for Data (IBM CP4D) to make the enriched catalog content in IBM Spectrum Discover along with the associated data available in WKC and IBM CP4D. From an end-to-end IBM solution point of view, IBM CP4D and WKC provide state-of-the-art data governance, collaboration, and artificial intelligence (AI) and analytics tools, and IBM Spectrum Discover complements these features by adding support for unstructured data on large-scale file and object storage systems on premises and in the cloud. Many organizations face challenges to manage unstructured data. Some challenges that companies face include: Pinpointing and activating relevant data for large-scale analytics, machine learning (ML) and deep learning (DL) workloads. Lacking the fine-grained visibility that is needed to map data to business priorities. Removing redundant, obsolete, and trivial (ROT) data and identifying data that can be moved to a lower-cost storage tier. Identifying and classifying sensitive data as it relates to various compliance mandates, such as the General Data Privacy Regulation (GDPR), Payment Card Industry Data Security Standards (PCI-DSS), and the Health Information Portability and Accountability Act (HIPAA). This paper describes how IBM Spectrum Discover provides seamless integration of data in IBM Storage with IBM Watson Knowledge Catalog (WKC). Features include: Event-based cataloging and tagging of unstructured data across the enterprise. Automatically inspecting and classifying over 1000 unstructured data types, including genomics and imaging specific file formats. Automatically registering assets with WKC based on IBM Spectrum Discover search and filter criteria, and by using assets in IBM CP4D. Enforcing data governance policies in WKC in IBM CP4D based on insights from IBM Spectrum Discover, and using assets in IBM CP4D. Several in-depth use cases are used that show examples of healthcare, life sciences, and financial services. IBM Spectrum Discover integration with WKC enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of data. The integration improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research.
Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover
Title | Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover PDF eBook |
Author | Joseph Dain |
Publisher | |
Pages | |
Release | 2020 |
Genre | Database management |
ISBN |
IBM Reference Architecture for High Performance Data and AI in Healthcare and Life Sciences
Title | IBM Reference Architecture for High Performance Data and AI in Healthcare and Life Sciences PDF eBook |
Author | Dino Quintero |
Publisher | IBM Redbooks |
Pages | 88 |
Release | 2019-09-08 |
Genre | Computers |
ISBN | 073845690X |
This IBM® Redpaper publication provides an update to the original description of IBM Reference Architecture for Genomics. This paper expands the reference architecture to cover all of the major vertical areas of healthcare and life sciences industries, such as genomics, imaging, and clinical and translational research. The architecture was renamed IBM Reference Architecture for High Performance Data and AI in Healthcare and Life Sciences to reflect the fact that it incorporates key building blocks for high-performance computing (HPC) and software-defined storage, and that it supports an expanding infrastructure of leading industry partners, platforms, and frameworks. The reference architecture defines a highly flexible, scalable, and cost-effective platform for accessing, managing, storing, sharing, integrating, and analyzing big data, which can be deployed on-premises, in the cloud, or as a hybrid of the two. IT organizations can use the reference architecture as a high-level guide for overcoming data management challenges and processing bottlenecks that are frequently encountered in personalized healthcare initiatives, and in compute-intensive and data-intensive biomedical workloads. This reference architecture also provides a framework and context for modern healthcare and life sciences institutions to adopt cutting-edge technologies, such as cognitive life sciences solutions, machine learning and deep learning, Spark for analytics, and cloud computing. To illustrate these points, this paper includes case studies describing how clients and IBM Business Partners alike used the reference architecture in the deployments of demanding infrastructures for precision medicine. This publication targets technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for providing life sciences solutions and support.
HIPAA Compliance for Healthcare Workloads on IBM Spectrum Scale
Title | HIPAA Compliance for Healthcare Workloads on IBM Spectrum Scale PDF eBook |
Author | Sandeep R. Patil |
Publisher | IBM Redbooks |
Pages | 18 |
Release | 2020-03-16 |
Genre | Computers |
ISBN | 0738458600 |
When technology workloads process healthcare data, it is important to understand Health Insurance Portability and Accountability Act (HIPAA) compliance and what it means for the technology infrastructure in general and storage in particular. HIPAA is US legislation that was signed into law in 1996. HIPAA was enacted to protect health insurance coverage, but was later extended to ensure protection and privacy of electronic health records and transactions. In simple terms, it was instituted to modernize the exchange of healthcare information and how the Personally Identifiable Information (PII) that is maintained by the healthcare and healthcare-related industries are safeguarded. From a technology perspective, one of the core requirements of HIPAA is the protection of Electronic Protected Health Information (ePHIPer through physical, technical, and administrative defenses. From a non-compliance perspective, the Health Information Technology for Economic and Clinical Health Act (HITECH) added protections to HIPAA and increased penalties $100 USD - $50,000 USD per violation. Today, HIPAA-compliant solutions are a norm in the healthcare industry worldwide. This IBM® Redpaper publication describes HIPPA compliance requirements for storage and how security enhanced software-defined storage is designed to help meet those requirements. We correlate how Software Defined IBM Spectrum® Scale security features address the safeguards that are specified by the HIPAA Security Rule.