Machine Translation with Minimal Reliance on Parallel Resources

Title Machine Translation with Minimal Reliance on Parallel Resources
Author George Tambouratzis
Publisher Springer
Pages 92
Release 2017-08-09
Genre Computers
ISBN 3319631071

This book provides a unified view of a new methodology for Machine Translation (MT). The methodology extracts information from widely available resources (extensive monolingual corpora) while assuming only a very limited parallel corpus, which gives it a starting point unlike that of Statistical Machine Translation (SMT). A detailed presentation of the methodology's principles and system architecture is followed by a series of experiments in which the proposed system is compared to other MT systems using a set of established metrics, including BLEU, NIST, Meteor and TER. Free-to-use code is also available that allows the creation of new MT systems. The volume is addressed to both language professionals and researchers. The prerequisites are very limited: a basic understanding of machine translation and of the basic tools of natural language processing.

Using Comparable Corpora for Under-Resourced Areas of Machine Translation

Title Using Comparable Corpora for Under-Resourced Areas of Machine Translation
Author Inguna Skadiņa
Publisher Springer
Pages 323
Release 2019-02-06
Genre Computers
ISBN 3319990047

This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.

Machine Translation and Transliteration involving Related, Low-resource Languages

Title Machine Translation and Transliteration involving Related, Low-resource Languages
Author Anoop Kunchukuttan
Publisher CRC Press
Pages 215
Release 2021-09-08
Genre Computers
ISBN 1000422410

Machine Translation and Transliteration involving Related, Low-resource Languages discusses an important aspect of natural language processing that has received less attention: translation and transliteration involving related languages in a low-resource setting. This is a very relevant real-world scenario for people living in neighbouring states, provinces or countries who speak similar languages and need to communicate with each other, but for whom training data to build supporting MT systems is limited. The book discusses different characteristics of related languages with rich examples and draws connections between two problems: translation for related languages and transliteration. It shows how linguistic similarities can be utilized to learn MT systems for related languages with limited data, and it comprehensively discusses the use of subword-level models and multilinguality to exploit these similarities. The second part of the book explores methods for machine transliteration involving related languages based on multilingual and unsupervised approaches. The efficacy of these methods is established through extensive experiments over a wide variety of languages.

Features: novel methods for machine translation and transliteration between related languages, supported with experiments on a wide variety of languages; an overview of past literature on machine translation for related languages; and a case study about machine translation for related languages between 10 major languages from India, which is one of the most linguistically diverse countries in the world.

The book presents important concepts and methods for machine translation involving related languages. In general, it serves as a good reference to NLP for related languages. It is intended for students, researchers and professionals interested in Machine Translation, Translation Studies, Multilingual Computing Machine and Natural Language Processing. It can be used as reference reading for courses in NLP and machine translation.

Anoop Kunchukuttan is a Senior Applied Researcher at Microsoft India. His research spans various areas of multilingual and low-resource NLP. Pushpak Bhattacharyya is a Professor at the Department of Computer Science, IIT Bombay. His research areas are Natural Language Processing, Machine Learning and AI (NLP-ML-AI). Prof. Bhattacharyya has published more than 350 research papers in various areas of NLP.

Handbook of Natural Language Processing and Machine Translation

Title Handbook of Natural Language Processing and Machine Translation
Author Joseph Olive
Publisher Springer Science & Business Media
Pages 956
Release 2011-03-02
Genre Computers
ISBN 1441977139

This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the GALE (Global Autonomous Language Exploitation) program within the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation. The most fundamental contrast between GALE and its predecessor programs was its holistic integration of previously separate or sequential processes. In earlier language research programs, each of the individual processes was performed separately and sequentially: speech recognition, language recognition, transcription, translation, and content summarization. The GALE program employed a distinctly new approach by executing these processes simultaneously, so that speech and language recognition algorithms aid translation and transcription processes and vice versa. This combination of previously distinct processes has produced significant research and performance breakthroughs and has fundamentally changed the natural language processing and machine translation fields. The handbook provides an exhaustive exploration of these latest technologies in natural language, speech and signal processing, and machine translation, providing researchers, practitioners and students with an authoritative reference on the topic.

Advances in Information Retrieval

Title Advances in Information Retrieval
Author Pavel Serdyukov
Publisher Springer
Pages 919
Release 2013-03-12
Genre Computers
ISBN 3642369731

This book constitutes the proceedings of the 35th European Conference on IR Research, ECIR 2013, held in Moscow, Russia, in March 2013. The 55 full papers, 38 poster papers and 10 demonstrations presented in this volume were carefully reviewed and selected from 287 submissions. The papers are organized in the following topical sections: user aspects; multimedia and cross-media IR; data mining; IR theory and formal models; IR system architectures; classification; Web; event detection; temporal IR, and microblog search. Also included are 4 tutorial and 2 workshop presentations.

Natural Language Processing and Chinese Computing

Title Natural Language Processing and Chinese Computing
Author Xiaodan Zhu
Publisher Springer Nature
Pages 873
Release 2020-10-05
Genre Computers
ISBN 3030604500

This two-volume set of LNAI 12340 and LNAI 12341 constitutes the refereed proceedings of the 9th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2020, held in Zhengzhou, China, in October 2020. The 70 full papers, 30 poster papers and 14 workshop papers presented were carefully reviewed and selected from 320 submissions. They are organized in the following areas: Conversational Bot/QA; Fundamentals of NLP; Knowledge Base, Graphs and Semantic Web; Machine Learning for NLP; Machine Translation and Multilinguality; NLP Applications; Social Media and Network; Text Mining; and Trending Topics.

Machine Translation and the Information Soup

Title Machine Translation and the Information Soup
Author David Farwell
Publisher Springer
Pages 551
Release 2003-06-29
Genre Computers
ISBN 3540494782

Machine Translation and the Information Soup! Over the past fifty years, machine translation has grown from a tantalizing dream to a respectable and stable scientific-linguistic enterprise, with users, commercial systems, university research, and government participation. But until very recently, MT has been performed as a relatively distinct operation, somewhat isolated from other text processing. Today, this situation is changing rapidly. The explosive growth of the Web has brought multilingual text into the reach of nearly everyone with a computer. We live in a soup of information, an increasingly multilingual bouillabaisse. And to partake of this soup, we can use MT systems together with more and more tools and language processing technologies: information retrieval engines, automated text summarizers, and multimodal and multilingual displays. Though some of them may still be rather experimental, and though they may not quite fit together well yet, it is clear that the future will offer text manipulation systems that contain all these functions, seamlessly interconnected in various ways.