Automated Grammatical Error Detection for Language Learners
Title | Automated Grammatical Error Detection for Language Learners |
Author | Claudia Leacock |
Publisher | Morgan & Claypool Publishers |
Pages | 172 |
Release | 2014-02-01 |
Genre | Computers |
ISBN | 1627050140 |
It has been estimated that over a billion people are using or learning English as a second or foreign language, and the numbers are growing not only for English but for other languages as well. These language learners provide a burgeoning market for tools that help identify and correct learners' writing errors. Unfortunately, the errors targeted by typical commercial proofreading tools do not include those aspects of a second language that are hardest to learn. This volume describes the types of constructions English language learners find most difficult: constructions containing prepositions, articles, and collocations. It provides an overview of the automated approaches that have been developed to identify and correct these and other classes of learner errors in a number of languages. Error annotation and system evaluation are particularly important topics in grammatical error detection because there are no commonly accepted standards. Chapters in the book describe the options available to researchers, recommend best practices for reporting results, and present annotation and evaluation schemes. The final chapters explore recent innovative work that opens new directions for research. It is the authors' hope that this volume will continue to contribute to the growing interest in grammatical error detection by encouraging researchers to take a closer look at the field and its many challenging problems.
Automated Grammatical Error Detection for Language Learners, Second Edition
Title | Automated Grammatical Error Detection for Language Learners, Second Edition |
Author | Claudia Leacock |
Publisher | Springer Nature |
Pages | 154 |
Release | 2022-06-01 |
Genre | Computers |
ISBN | 3031021533 |
It has been estimated that over a billion people are using or learning English as a second or foreign language, and the numbers are growing not only for English but for other languages as well. These language learners provide a burgeoning market for tools that help identify and correct learners' writing errors. Unfortunately, the errors targeted by typical commercial proofreading tools do not include those aspects of a second language that are hardest to learn. This volume describes the types of constructions English language learners find most difficult: constructions containing prepositions, articles, and collocations. It provides an overview of the automated approaches that have been developed to identify and correct these and other classes of learner errors in a number of languages. Error annotation and system evaluation are particularly important topics in grammatical error detection because there are no commonly accepted standards. Chapters in the book describe the options available to researchers, recommend best practices for reporting results, and present annotation and evaluation schemes. The final chapters explore recent innovative work that opens new directions for research. It is the authors' hope that this volume will continue to contribute to the growing interest in grammatical error detection by encouraging researchers to take a closer look at the field and its many challenging problems.
Automated Grammatical Error Detection for Language Learners
Title | Automated Grammatical Error Detection for Language Learners |
Author | Claudia Leacock |
Publisher | Springer Nature |
Pages | 127 |
Release | 2010-05-11 |
Genre | Computers |
ISBN | 3031021371 |
It has been estimated that over a billion people are using or learning English as a second or foreign language, and the numbers are growing not only for English but for other languages as well. These language learners provide a burgeoning market for tools that help identify and correct learners' writing errors. Unfortunately, the errors targeted by typical commercial proofreading tools do not include those aspects of a second language that are hardest to learn. This volume describes the types of constructions English language learners find most difficult: constructions containing prepositions, articles, and collocations. It provides an overview of the automated approaches that have been developed to identify and correct these and other classes of learner errors in a number of languages. Error annotation and system evaluation are particularly important topics in grammatical error detection because there are no commonly accepted standards. Chapters in the book describe the options available to researchers, recommend best practices for reporting results, and present annotation and evaluation schemes. The final chapters explore recent innovative work that opens new directions for research. It is the authors' hope that this volume will contribute to the growing interest in grammatical error detection by encouraging researchers to take a closer look at the field and its many challenging problems.
Table of Contents: Introduction / History of Automated Grammatical Error Detection / Special Problems of Language Learners / Language Learner Data / Evaluating Error Detection Systems / Article and Preposition Errors / Collocation Errors / Different Approaches for Different Errors / Annotating Learner Errors / New Directions / Conclusion
Automatic Treatment and Analysis of Learner Corpus Data
Title | Automatic Treatment and Analysis of Learner Corpus Data |
Author | Ana Díaz-Negrillo |
Publisher | John Benjamins Publishing Company |
Pages | 322 |
Release | 2013-12-15 |
Genre | Language Arts & Disciplines |
ISBN | 9027270953 |
This book is a critical appraisal of recent developments in corpus linguistics for the analysis of written and spoken learner data. The twelve papers cover an introductory critical appraisal of learner corpus data compilation and development (section 1); issues in data compilation, annotation and exchangeability (section 2); automatic approaches to data identification and analysis (section 3); and analysis of learner corpus data in the light of recent models of data analysis and interpretation, especially recent automatic approaches for the identification of learner language features (section 4). This collection is aimed at students and researchers of corpus linguistics, second language acquisition studies and quantitative linguistics. It will significantly advance learner corpus research in terms of methodological innovation and will fill an important gap in the development of multidisciplinary approaches for learner corpus studies.
Natural Language Processing for Social Media, Second Edition
Title | Natural Language Processing for Social Media, Second Edition |
Author | Atefeh Farzindar |
Publisher | Springer Nature |
Pages | 188 |
Release | 2017-12-15 |
Genre | Computers |
ISBN | 3031021673 |
In recent years, online social networking has revolutionized interpersonal communication. Recent research on language analysis in social media has increasingly focused on the impact of social media on our daily lives, at both a personal and a professional level. Natural language processing (NLP) is one of the most promising avenues for social media data processing. It is a scientific challenge to develop powerful methods and algorithms which extract relevant information from a large volume of data coming from multiple sources and languages in various formats or in free form. We discuss the challenges in analyzing social media texts in contrast with traditional documents. Research methods in information extraction, automatic categorization and clustering, automatic summarization and indexing, and statistical machine translation need to be adapted to a new kind of data. This book reviews the current research on NLP tools and methods for processing the non-traditional information from social media data that is available in large amounts (big data), and shows how innovative NLP approaches can integrate appropriate linguistic information in various fields such as social media monitoring, healthcare, business intelligence, industry, marketing, and security and defence. We review the existing evaluation metrics for NLP and social media applications, and the new efforts in evaluation campaigns or shared tasks on new datasets collected from social media. Such tasks are organized by the Association for Computational Linguistics (such as SemEval tasks) or by the National Institute of Standards and Technology via the Text REtrieval Conference (TREC) and the Text Analysis Conference (TAC). In the concluding chapter, we discuss the importance of this dynamic discipline and its great potential for NLP in the coming decade, in the context of changes in mobile technology, cloud computing, virtual reality, and social networking.
In this second edition, we have added information about recent progress in the tasks and applications presented in the first edition. We discuss new methods and their results. The number of research projects and publications that use social media data is constantly increasing due to continuously growing amounts of social media data and the need to automatically process them. We have added 85 new references to the more than 300 references from the first edition. Besides updating each section, we have added a new application (digital marketing) to the section on media monitoring and we have augmented the section on healthcare applications with an extended discussion of recent research on detecting signs of mental illness from social media.
Bayesian Analysis in Natural Language Processing, Second Edition
Title | Bayesian Analysis in Natural Language Processing, Second Edition |
Author | Shay Cohen |
Publisher | Springer Nature |
Pages | 311 |
Release | 2022-05-31 |
Genre | Computers |
ISBN | 3031021703 |
Natural language processing (NLP) went through a profound transformation in the mid-1980s when it shifted to make heavy use of corpora and data-driven techniques to analyze language. Since then, the use of statistical techniques in NLP has evolved in several ways. One such example of evolution took place in the late 1990s or early 2000s, when full-fledged Bayesian machinery was introduced to NLP. This Bayesian approach to NLP has come to accommodate various shortcomings in the frequentist approach and to enrich it, especially in the unsupervised setting, where statistical learning is done without target prediction examples. In this book, we cover the methods and algorithms that are needed to fluently read Bayesian learning papers in NLP and to do research in the area. These methods and algorithms are partially borrowed from both machine learning and statistics and are partially developed "in-house" in NLP. We cover inference techniques such as Markov chain Monte Carlo sampling and variational inference, Bayesian estimation, and nonparametric modeling. In response to rapid changes in the field, this second edition of the book includes a new chapter on representation learning and neural networks in the Bayesian context. We also cover fundamental concepts in Bayesian statistics such as prior distributions, conjugacy, and generative modeling. Finally, we review some of the fundamental modeling techniques in NLP, such as grammar modeling, neural networks and representation learning, and their use with Bayesian analysis.
Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition
Title | Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition |
Author | Hang Li |
Publisher | Springer Nature |
Pages | 107 |
Release | 2022-05-31 |
Genre | Computers |
ISBN | 303102155X |
Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. Intensive studies have been conducted on its problems recently, and significant progress has been made. This lecture gives an introduction to the area including the fundamental problems, major approaches, theories, applications, and future work. The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as two basic ranking tasks, namely ranking creation (or simply ranking) and ranking aggregation. In ranking creation, given a request, one wants to generate a ranking list of offerings based on the features derived from the request and the offerings. In ranking aggregation, given a request, as well as a number of ranking lists of offerings, one wants to generate a new ranking list of the offerings. Ranking creation (or ranking) is the major problem in learning to rank. It is usually formalized as a supervised learning task. The author gives detailed explanations on learning for ranking creation and ranking aggregation, including training and testing, evaluation, feature creation, and major approaches. Many methods have been proposed for ranking creation. The methods can be categorized as the pointwise, pairwise, and listwise approaches according to the loss functions they employ. They can also be categorized according to the techniques they employ, such as the SVM-based, Boosting-based, and Neural Network-based approaches. The author also introduces some popular learning to rank methods in detail. These include: PRank, OC SVM, McRank, Ranking SVM, IR SVM, GBRank, RankNet, ListNet & ListMLE, AdaRank, SVM MAP, SoftRank, LambdaRank, LambdaMART, Borda Count, Markov Chain, and CRanking.
The author explains several example applications of learning to rank including web search, collaborative filtering, definition search, keyphrase extraction, query-dependent summarization, and re-ranking in machine translation. A formulation of learning for ranking creation is given in the statistical learning framework. Ongoing and future research directions for learning to rank are also discussed.
Table of Contents: Learning to Rank / Learning for Ranking Creation / Learning for Ranking Aggregation / Methods of Learning to Rank / Applications of Learning to Rank / Theory of Learning to Rank / Ongoing and Future Work