Designing and Evaluating Language Corpora

Title Designing and Evaluating Language Corpora
Author Jesse Egbert
Publisher Cambridge University Press
Pages 299
Release 2022-04-14
Genre Computers
ISBN 1107151384

This volume introduces a new framework for conceptualizing and achieving corpus representativeness in a rigorous, yet practical way.

Designing and Evaluating Language Corpora

Title Designing and Evaluating Language Corpora
Author Jesse Egbert
Publisher Cambridge University Press
Pages 299
Release 2022-04-14
Genre Language Arts & Disciplines
ISBN 1009254758

Corpora are ubiquitous in linguistic research, yet to date, there has been no consensus on how to conceptualize corpus representativeness and collect corpus samples. This pioneering book bridges this gap by introducing a conceptual and methodological framework for corpus design and representativeness. Written by experts in the field, it shows how corpora can be designed and built in a way that is both optimally suited to specific research agendas, and adequately representative of the types of language use in question. It considers questions such as 'what types of texts should be included in the corpus?', and 'how many texts are required?' – highlighting that the degree of representativeness rests on the dual pillars of domain considerations and distribution considerations. The authors introduce, explain, and illustrate all aspects of this corpus representativeness framework in a step-by-step fashion, using examples and activities to help readers develop practical skills in corpus design and evaluation.

Developing Linguistic Corpora

Title Developing Linguistic Corpora
Author Martin Wynne
Publisher Oxbow Books Limited
Pages 100
Release 2005
Genre Language Arts & Disciplines
ISBN

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

Analysing Representation

Title Analysing Representation
Author Frazer Heritage
Publisher Taylor & Francis
Pages 316
Release 2024-05-31
Genre Language Arts & Disciplines
ISBN 104001898X

Analysing Representation: A Corpus and Discourse Textbook guides readers through the process of researching how people and phenomena are represented in discourse and introduces them to key tools they can use from corpus linguistics and (critical) discourse analysis. This book takes a step-by-step approach to introducing each concept and includes exercises and further reading to help readers check their progress and prepare for independent research. It is unique in introducing readers to a range of experts representing the full range of work in this area. This book is aimed at final-year undergraduate, taught postgraduate and doctoral level students. It will also be useful to scholars who are new to combining corpus and discourse methods in investigations of representation.

Doing Linguistics with a Corpus

Title Doing Linguistics with a Corpus
Author Jesse Egbert
Publisher Cambridge University Press
Pages 97
Release 2020-11-12
Genre Language Arts & Disciplines
ISBN 1108897037

Paradoxically, doing corpus linguistics is both easier and harder than it has ever been before. On the one hand, it is easier because we have access to more existing corpora, more corpus analysis software tools, and more statistical methods than ever before. On the other hand, reliance on these existing corpora and corpus linguistic methods can potentially create layers of distance between the researcher and the language in a corpus, making it a challenge to do linguistics with a corpus. The goal of this Element is to explore ways for us to improve how we approach linguistic research questions with quantitative corpus data. We introduce and illustrate the major steps in the research process, including how to: select and evaluate corpora; establish linguistically motivated research questions, observational units and variables; select linguistically interpretable variables; understand and evaluate existing corpus software tools; adopt minimally sufficient statistical methods; and qualitatively interpret quantitative findings.

Multi-Dimensional Analysis

Title Multi-Dimensional Analysis
Author Tony Berber Sardinha
Publisher Bloomsbury Publishing
Pages 304
Release 2019-03-21
Genre Language Arts & Disciplines
ISBN 1350023833

Multi-Dimensional Analysis: Research Methods and Current Issues provides a comprehensive guide both to the statistical methods in Multi-Dimensional Analysis (MDA) and its key elements, such as corpus building, tagging, and tools. The major goal is to explain the steps involved in the method so that readers may better understand this complex research framework and conduct MD research on their own. Multi-Dimensional Analysis is a method that allows the researcher to describe different registers (textual varieties defined by their social use) such as academic settings, regional discourse, social media, movies, and pop songs. Through multivariate statistical techniques, MDA identifies complementary correlation groupings of dozens of variables, including variables which belong both to the grammatical and semantic domains. Such groupings are then associated with situational variables of texts like information density, orality, and narrativity to determine linguistic constructs known as dimensions of variation, which provide a scale for the comparison of a large number of texts and registers. This book is a comprehensive research guide to MDA.

Investigating a Corpus of Historical Oral Testimonies

Title Investigating a Corpus of Historical Oral Testimonies
Author Chris Fitzgerald
Publisher Taylor & Francis
Pages 206
Release 2022-12-30
Genre Language Arts & Disciplines
ISBN 1000823652

Investigating a Corpus of Historical Oral Testimonies guides the reader through the process of sourcing a relevant oral history archive for linguistic analysis, constructing a representative corpus from this archive, and analysing it using corpus tools. Focusing on the oral history archive at the Irish Bureau of Military History, this book shows how corpus linguistics can illuminate themes worthy of investigation that may otherwise remain hidden. This is exemplified through an investigation of how certainty is constructed in this archive through a number of expressions. The investigation serves as a template both for how oral history can aid linguistic understanding and for how corpus linguistics can contribute to oral history investigation. Highlighting why oral history archives are worthy of linguistic analysis and showing what readers can gain from blending linguistic tools and competencies with oral history data, this book is essential reading for all researchers and students working in the areas of corpus linguistics, discourse analysis and oral history.