台語文處理技術:以變調及詞性標記為例 Processing Techniques for Written Taiwanese -- Tone Sandhi and POS Tagging

Title 台語文處理技術:以變調及詞性標記為例 Processing Techniques for Written Taiwanese -- Tone Sandhi and POS Tagging
Author Ungian Iunn 楊允言
Publisher
Pages 167
Release
Genre
ISBN

Processing Techniques for Written Taiwanese

Title Processing Techniques for Written Taiwanese
Author 楊允言
Publisher
Pages 139
Release 2009
Genre
ISBN

Chinese Spoken Language Processing

Title Chinese Spoken Language Processing
Author Qiang Huo
Publisher Springer
Pages 825
Release 2006-11-30
Genre Computers
ISBN 3540496661

This book constitutes the thoroughly refereed proceedings of the 5th International Symposium on Chinese Spoken Language Processing, ISCSLP 2006, held in Singapore in December 2006, co-located with ICCPOL 2006, the 21st International Conference on Computer Processing of Oriental Languages. Coverage includes speech science, acoustic modeling for automatic speech recognition, speech data mining, and machine translation of speech.

Weakly Supervised Part-of-speech Tagging for Chinese Using Label Propagation

Title Weakly Supervised Part-of-speech Tagging for Chinese Using Label Propagation
Author Weiwei Ding
Publisher
Pages 118
Release 2011
Genre
ISBN

Part-of-speech (POS) tagging is one of the most fundamental tasks in natural language processing. Chinese POS tagging is challenging because it also involves word segmentation. This report focuses on improving unsupervised POS tagging with Hidden Markov Models trained by Expectation Maximization (EM-HMM). The traditional EM-HMM system uses a tag dictionary to constrain the possible tag sequences and to initialize the model parameters. This initialization is very crude: the emission parameters are set uniformly in accordance with the tag dictionary. One way to improve it is to use word alignments, the word-level translation pairs generated from parallel text in two languages; this report uses Chinese-English word alignments. Performance was expected to improve because the two sources are complementary: the dictionary provides information on word types, while word alignments provide information on word tokens. In practice, however, the benefit proved limited. The report therefore proposes another method. To improve dictionary coverage and obtain better POS distributions, it uses Modified Adsorption, a label propagation algorithm. We construct a graph connecting word tokens to feature types (such as word unigrams and bigrams) and connecting those tokens to information from knowledge sources such as a small tag dictionary, Wiktionary, and word alignments. The core idea is to use a small amount of supervision, in the form of a tag dictionary, to acquire POS distributions for each word (both known and unknown) and to provide these as an improved initialization for EM learning of the HMM. We find that this strategy works very well, especially when the tag dictionary is small. Label propagation provides a better initialization for the EM-HMM method because it greatly increases the coverage of the dictionary. In addition, label propagation is flexible enough to incorporate many kinds of knowledge. However, the results also show that some resources, such as word alignments, are not easily exploited with label propagation.
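
The baseline initialization that the abstract calls crude can be sketched in a few lines. This is only an illustration under stated assumptions, not the report's code: the function name `init_emissions`, the toy vocabulary, and the rule that a word missing from the dictionary may take any tag are all hypothetical choices made for the example.

```python
def init_emissions(vocab, tagset, tag_dict):
    """Uniform P(word | tag) over the words each tag is allowed to emit."""
    # Assumption: a word absent from the tag dictionary may take any tag.
    allowed = {w: set(tag_dict.get(w, tagset)) for w in vocab}
    emissions = {}
    for tag in tagset:
        # Only words whose dictionary entry permits this tag get any mass.
        words = [w for w in vocab if tag in allowed[w]]
        emissions[tag] = {w: 1.0 / len(words) for w in words}
    return emissions


# Toy data (hypothetical): "blip" is not in the dictionary.
vocab = ["dog", "run", "blip"]
tagset = ["N", "V"]
tag_dict = {"dog": ["N"], "run": ["N", "V"]}

emissions = init_emissions(vocab, tagset, tag_dict)
print(emissions["V"])  # {'run': 0.5, 'blip': 0.5}
```

Every allowed word receives the same probability under a tag, so the model starts with no idea which readings are common. Replacing these uniform rows with per-word POS distributions, such as those the abstract obtains from label propagation, is the kind of improved starting point for EM that it describes.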

Developing Linguistic Corpora

Title Developing Linguistic Corpora
Author Martin Wynne
Publisher Oxbow Books Limited
Pages 100
Release 2005
Genre Language Arts & Disciplines
ISBN

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

Handbook of Natural Language Processing

Title Handbook of Natural Language Processing
Author Nitin Indurkhya
Publisher CRC Press
Pages 704
Release 2010-02-22
Genre Business & Economics
ISBN 142008593X

The Handbook of Natural Language Processing, Second Edition presents practical tools and techniques for implementing natural language processing in computer systems. Along with removing outdated material, this edition updates every chapter and expands the content to include emerging areas, such as sentiment analysis.

Introduction to Embedded Systems, Second Edition

Title Introduction to Embedded Systems, Second Edition
Author Edward Ashford Lee
Publisher MIT Press
Pages 562
Release 2017-01-06
Genre Computers
ISBN 0262340526

An introduction to the engineering principles of embedded systems, with a focus on modeling, design, and analysis of cyber-physical systems. The most visible use of computers and software is processing information for human consumption. The vast majority of computers in use, however, are much less visible. They run the engine, brakes, seatbelts, airbag, and audio system in your car. They digitally encode your voice and construct a radio signal to send it from your cell phone to a base station. They command robots on a factory floor, power generation in a power plant, processes in a chemical plant, and traffic lights in a city. These less visible computers are called embedded systems, and the software they run is called embedded software. The principal challenges in designing and analyzing embedded systems stem from their interaction with physical processes. This book takes a cyber-physical approach to embedded systems, introducing the engineering concepts underlying embedded systems as a technology and as a subject of study. The focus is on modeling, design, and analysis of cyber-physical systems, which integrate computation, networking, and physical processes. The second edition offers two new chapters, several new exercises, and other improvements. The book can be used as a textbook at the advanced undergraduate or introductory graduate level and as a professional reference for practicing engineers and computer scientists. Readers should have some familiarity with machine structures, computer programming, basic discrete mathematics and algorithms, and signals and systems.