Indexing XML Data for Efficient Twig Pattern Matching

Title Indexing XML Data for Efficient Twig Pattern Matching
Author Praveen Rao
Publisher
Pages 316
Release
Genre
ISBN

The Extensible Markup Language (XML) has become the de facto standard for information representation and interchange on the Internet. In this dissertation, I address the problem of indexing and querying XML in two environments: (a) a traditional environment where data is centrally stored and (b) an increasingly popular peer-to-peer (P2P) environment. In a traditional environment, the index built over XML data is typically centralized. In a P2P system, by contrast, the data is distributed, and so is the index. Because the two environments store data differently, I propose two different XML indexing schemes for efficient query processing.
In a traditional environment, a core operation is to find all occurrences of a given query pattern in the database. I propose a new way of indexing XML documents and processing query patterns. Every XML document in the database is transformed into a sequence of labels by Prüfer's method, which constructs a one-to-one correspondence between trees and sequences. During query processing, a query pattern is also transformed into its Prüfer sequence. By performing subsequence matching on the set of sequences in the database, followed by a series of refinement phases that I have developed, all occurrences of a query pattern can be found in the database. Furthermore, I show that all correct answers are found without any false dismissals or false alarms. I present the design, implementation, and experimental evaluation of the PRIX system that I have developed for this purpose.
With the growing popularity of P2P systems, XML is commonly used as an underlying data model for P2P applications to handle the heterogeneity of the data and the limited expressiveness of queries. Locating relevant data sources across a large number of participating peers is an important challenge. In this environment, the challenge is to quickly test for the existence of a query pattern in XML documents published by users rather than to find all of its occurrences. Because PRIX finds all occurrences of a query pattern, it is not the best solution here. Moreover, a P2P environment requires a distributed and decentralized index. Therefore, I propose a distributed indexing scheme for XML documents, based on polynomial signatures, that quickly tests for the existence of query patterns. In this scheme, each XML document is mapped into an algebraic signature that captures the structural summary of the document. The participating peers in the network collectively maintain a distributed and hierarchical index over the signatures. By virtue of the signature index, the signatures of documents with similar structural characteristics tend to be stored together at the same peer, and a search for document sources is resolved quickly. I present the design, implementation, and empirical evaluation of the psiX system that I have developed for this purpose. The signature scheme proposed in psiX can also be applied to querying heterogeneous XML databases.
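
The Prüfer-sequence transformation and subsequence filter that the abstract describes for the centralized setting can be illustrated with a minimal Python sketch. It is not the PRIX implementation. It assumes that nodes are numbered in postorder, that the leaf with the smallest number is deleted at each step, that the tag of its parent is recorded, and that deletion continues until only the root remains. The function names `number_postorder`, `prufer_sequence`, and `is_subsequence` are hypothetical.

```python
import heapq
import xml.etree.ElementTree as ET


def number_postorder(root):
    """Number nodes in postorder; return parallel lists of tags and parent ids."""
    tags, parent = [], []

    def visit(node):
        child_ids = [visit(child) for child in node]
        node_id = len(tags)          # postorder position of this node
        tags.append(node.tag)
        parent.append(None)          # filled in by the parent; stays None for the root
        for child_id in child_ids:
            parent[child_id] = node_id
        return node_id

    visit(root)
    return tags, parent


def prufer_sequence(xml_text):
    """Delete the smallest-numbered leaf repeatedly, recording its parent's tag.

    Assumption of this sketch: deletion stops when only the root is left,
    so a tree with n nodes yields n - 1 labels.
    """
    tags, parent = number_postorder(ET.fromstring(xml_text))
    remaining_children = [0] * len(tags)
    for p in parent:
        if p is not None:
            remaining_children[p] += 1

    leaves = [i for i, count in enumerate(remaining_children) if count == 0]
    heapq.heapify(leaves)
    sequence = []
    while leaves:
        node = heapq.heappop(leaves)
        p = parent[node]
        if p is None:                # only the root remains
            break
        sequence.append(tags[p])
        remaining_children[p] -= 1
        if remaining_children[p] == 0:
            heapq.heappush(leaves, p)  # the parent has become a leaf
    return sequence


def is_subsequence(query, data):
    """True if the labels of `query` occur in `data` in order, gaps allowed."""
    remaining = iter(data)
    return all(label in remaining for label in query)


document = prufer_sequence("<A><B><C/></B><D/></A>")
pattern = prufer_sequence("<B><C/></B>")
print(document, pattern, is_subsequence(pattern, document))
```

A subsequence match only marks a document as a candidate. For example, the pattern `<A><C/></A>` also passes this filter although `C` is not a child of `A` in the document, which is consistent with the abstract's point that refinement phases are needed to eliminate false alarms.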

Database and XML Technologies

Title Database and XML Technologies
Author Mong Li Lee
Publisher Springer Science & Business Media
Pages 163
Release 2010-09
Genre Computers
ISBN 3642156835

This book constitutes the refereed proceedings of the 7th International XML Database Symposium, XSym 2010, held in Singapore, in September 2010. The 11 papers were carefully reviewed and selected from 20 submissions. The papers are organized in topical sections on XML query processing; XML update and applications; and XML modeling.

Database Systems for Advanced Applications

Title Database Systems for Advanced Applications
Author Kian Lee Tan
Publisher Springer
Pages 940
Release 2006-03-11
Genre Computers
ISBN 354033338X

This book constitutes the refereed proceedings of the 11th International Conference on Database Systems for Advanced Applications, DASFAA 2006, held in Singapore in April 2006. The 46 revised full papers and 16 revised short papers presented were carefully reviewed and selected from 188 submissions. Topics include sensor networks, subsequence matching and repeating patterns, spatial-temporal databases, data mining, XML compression and indexing, XPath query evaluation, uncertainty and streams, and peer-to-peer and distributed networks, among others.

Database Systems for Advanced Applications

Title Database Systems for Advanced Applications
Author Masatoshi Yoshikawa
Publisher Springer
Pages 489
Release 2010-08-17
Genre Computers
ISBN 3642145892

This book constitutes the workshop proceedings of the 15th International Conference on Database Systems for Advanced Applications, DASFAA 2010, held in Tsukuba, Japan, in April 2010. The volume contains six workshops, each focusing on specific research issues that contribute to the main themes of the DASFAA conference: the First International Workshop on Graph Data Management: Techniques and Applications (GDM 2010); the Second International Workshop on Benchmarking of Database Management Systems and Data-Oriented Web Technologies (BenchmarkX'10); the Third International Workshop on Managing Data Quality in Collaborative Information Systems (MCIS2010); the Workshop on Social Networks and Social Media Mining on the Web (SNSMW2010); the Data Intensive eScience Workshop (DIEW 2010); and the Second International Workshop on Ubiquitous Data Management (UDM2010).

Managing and Mining Graph Data

Title Managing and Mining Graph Data
Author Charu C. Aggarwal
Publisher Springer Science & Business Media
Pages 623
Release 2010-02-02
Genre Computers
ISBN 1441960457

Managing and Mining Graph Data is a comprehensive survey book on graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, and chemical and biological data. The chapters are written by well-known researchers in the field and provide a broad perspective of the area. This is the first comprehensive survey book on the emerging topic of graph data processing. Managing and Mining Graph Data is designed for a varied audience composed of professors, researchers, and practitioners in industry. This volume is also suitable as a reference book for advanced-level database students in computer science and engineering.

Combinatorial Pattern Matching

Title Combinatorial Pattern Matching
Author Bin Ma
Publisher Springer Science & Business Media
Pages 377
Release 2007-06-22
Genre Computers
ISBN 3540734368

This volume features select refereed proceedings from the 18th Annual Symposium on Combinatorial Pattern Matching. Collectively, the papers provide insight into the most recent advances in combinatorial pattern matching. They are organized into topical sections covering algorithmic techniques, approximate pattern matching, data compression, computational biology, pattern analysis, and suffix arrays and trees.

Journal on Data Semantics XV

Title Journal on Data Semantics XV
Author Stefano Spaccapietra
Publisher Springer Science & Business Media
Pages 205
Release 2011-08-09
Genre Computers
ISBN 3642226299

The LNCS Journal on Data Semantics is devoted to the presentation of notable work that, in one way or another, addresses research and development on issues related to data semantics. The scope of the journal ranges from theories supporting the formal definition of semantic content to innovative domain-specific applications of semantic knowledge. The journal addresses researchers and advanced practitioners working on the semantic web, interoperability, mobile information services, data warehousing, knowledge representation and reasoning, conceptual database modeling, ontologies, and artificial intelligence. Volume XV results from a rigorous selection among 25 full papers received in response to two calls for contributions issued in 2009 and 2010. In addition, this volume contains a special report on the Ontology Alignment Evaluation Initiative, an event that has been held once a year for the last five years and has attracted considerable attention from the ontology community. This is the last LNCS transactions volume of the Journal on Data Semantics; the next issue will appear as a regular Springer journal, published quarterly starting in 2012.