A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
Title | A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years PDF eBook |
Author | Sergio Flesca |
Publisher | Springer |
Pages | 490 |
Release | 2017-05-29 |
Genre | Technology & Engineering |
ISBN | 3319618938 |
This book offers readers a comprehensive guide to the evolution of the database field from its earliest stages up to the present, and from classical relational database management systems to the current Big Data metaphor. In particular, it gathers the most significant research from the Italian database community that had relevant intersections with international projects. Big Data technology is currently dominating both the market and research. The book provides readers with a broad overview of key research efforts in modelling, querying and analysing data, which, over the last few decades, have become massive and heterogeneous areas.
Data Stream Management
Title | Data Stream Management PDF eBook |
Author | Lukasz Golab |
Publisher | Morgan & Claypool Publishers |
Pages | 65 |
Release | 2010 |
Genre | Computers |
ISBN | 1608452727 |
Many applications process high volumes of streaming data, among them Internet traffic analysis, financial tickers, and transaction log mining. In general, a data stream is an unbounded data set that is produced incrementally over time, rather than being available in full before its processing begins. In this lecture, we give an overview of recent research in stream processing, ranging from answering simple queries on high-speed streams to loading real-time data feeds into a streaming warehouse for off-line analysis. We will discuss two types of systems for end-to-end stream processing: Data Stream Management Systems (DSMSs) and Streaming Data Warehouses (SDWs). A traditional database management system typically processes a stream of ad-hoc queries over relatively static data. In contrast, a DSMS evaluates static (long-running) queries on streaming data, making a single pass over the data and using limited working memory. In the first part of this lecture, we will discuss research problems in DSMSs, such as continuous query languages, non-blocking query operators that continually react to new data, and continuous query optimization. The second part covers SDWs, which combine the real-time response of a DSMS, achieved by loading new data as soon as they arrive, with a data warehouse's ability to manage terabytes of historical data on secondary storage. Table of Contents: Introduction / Data Stream Management Systems / Streaming Data Warehouses / Conclusions
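The single-pass, limited-memory evaluation that this description attributes to a DSMS can be illustrated with a small sketch. It is not taken from the lecture: the class name `SlidingWindowAverage`, the method `on_arrival`, and the choice of a time-based sliding window are assumptions made for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowAverage:
    """Hypothetical continuous query: average of the values that arrived
    within the last `window` time units (window must be positive)."""

    def __init__(self, window):
        self.window = window
        self.items = deque()  # (timestamp, value) pairs still inside the window
        self.total = 0.0      # running sum of the values in self.items

    def on_arrival(self, timestamp, value):
        """Process one stream tuple and return the updated answer."""
        self.items.append((timestamp, value))
        self.total += value
        # Evict tuples that have fallen out of the window, so memory is
        # bounded by the window contents rather than by the stream length.
        while self.items[0][0] <= timestamp - self.window:
            _, old_value = self.items.popleft()
            self.total -= old_value
        return self.total / len(self.items)


query = SlidingWindowAverage(window=10)
for ts, val in [(0, 10.0), (5, 20.0), (12, 30.0)]:
    print(ts, query.on_arrival(ts, val))
```

Each tuple is seen once and the operator emits a fresh answer on every arrival, which is the non-blocking behaviour the description mentions; a real DSMS would add out-of-order handling and query optimization on top of this.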
Kafka: The Definitive Guide
Title | Kafka: The Definitive Guide PDF eBook |
Author | Neha Narkhede |
Publisher | "O'Reilly Media, Inc." |
Pages | 315 |
Release | 2017-08-31 |
Genre | Computers |
ISBN | 1491936118 |
Every enterprise application creates data, whether it’s log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
- Understand publish-subscribe messaging and how it fits in the big data ecosystem
- Explore Kafka producers and consumers for writing and reading messages
- Understand Kafka patterns and use-case requirements to ensure reliable data delivery
- Get best practices for building data pipelines and applications with Kafka
- Manage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasks
- Learn the most critical metrics among Kafka’s operational measurements
- Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems
Data Streams
Title | Data Streams PDF eBook |
Author | S. Muthukrishnan |
Publisher | Now Publishers Inc |
Pages | 136 |
Release | 2005 |
Genre | Computers |
ISBN | 193301914X |
In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or a few passes over the data, using space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time, and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Researchers in Theoretical Computer Science, Databases, IP Networking and Computer Systems are working on the data stream challenges.
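As an illustration of a one-pass algorithm that uses space less than linear in the input size, the following sketch implements the Misra-Gries frequent-items summary. The blurb does not name this algorithm; it is chosen here as a well-known example of the constraints described, and the function name `misra_gries` and parameter `k` are illustrative.

```python
def misra_gries(stream, k):
    """One pass over `stream`, keeping at most k - 1 counters.

    Any item occurring more than n / k times in a stream of length n is
    guaranteed to survive in the returned dict; each stored count
    underestimates the true frequency by at most n / k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all of them and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


summary = misra_gries(["x", "y", "x", "z", "x", "y", "x"], k=3)
print(summary)
```

The summary size depends only on `k`, not on the stream length, so the memory bound holds however long the stream runs. The stored counts are lower bounds; a second pass would be needed to obtain exact frequencies for the surviving candidates.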
Computing Handbook
Title | Computing Handbook PDF eBook |
Author | Heikki Topi |
Publisher | CRC Press |
Pages | 1524 |
Release | 2014-05-14 |
Genre | Computers |
ISBN | 1439898561 |
The second volume of this popular handbook demonstrates the richness and breadth of the IS and IT disciplines. The book explores their close links to the practice of using, managing, and developing IT-based solutions to advance the goals of modern organizational environments. Established leading experts and influential young researchers present introductions to the current status and future directions of research and give in-depth perspectives on the contributions of academic research to the practice of IS and IT development, use, and management.
Data Mining: Concepts and Techniques
Title | Data Mining: Concepts and Techniques PDF eBook |
Author | Jiawei Han |
Publisher | Elsevier |
Pages | 740 |
Release | 2011-06-09 |
Genre | Computers |
ISBN | 0123814804 |
Data Mining: Concepts and Techniques presents the concepts and techniques for processing gathered data or information for use in various applications. Specifically, it explains data mining and the tools used to discover knowledge from collected data, a process also referred to as knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods for understanding, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Next, it describes the methods involved in mining frequent patterns, associations, and correlations for large data sets. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining.
- Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects
- Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields
- Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data
Complete Guide to Open Source Big Data Stack
Title | Complete Guide to Open Source Big Data Stack PDF eBook |
Author | Michael Frampton |
Publisher | Apress |
Pages | 375 |
Release | 2018-01-18 |
Genre | Computers |
ISBN | 1484221494 |
See a Mesos-based big data stack created and the components used. You will use currently available Apache systems, both full projects and incubating ones. The components are introduced by example and you learn how they work together. In the Complete Guide to Open Source Big Data Stack, the author begins by creating a private cloud and then installs and examines Apache Brooklyn. After that, he uses each chapter to introduce one piece of the big data stack, sharing how to source the software and how to install it. You learn by simple example, step by step and chapter by chapter, as a real big data stack is created. The book concentrates on Apache-based systems and shares detailed examples of cloud storage, release management, resource management, processing, queuing, frameworks, data visualization, and more.
What You’ll Learn
- Install a private cloud onto the local cluster using Apache CloudStack
- Source, install, and configure Apache Brooklyn, Mesos, Kafka, and Zeppelin
- See how Brooklyn can be used to install Mule ESB on a cluster and Cassandra in the cloud
- Install and use DCOS for big data processing
- Use Apache Spark for big data stack data processing
Who This Book Is For
Developers, architects, IT project managers, database administrators, and others charged with developing or supporting a big data system. It is also for anyone interested in Hadoop or big data, and those experiencing problems with data size.