Compiling and Annotating a Learner Corpus for a Morphologically Rich Language : CzeSL, a corpus of non-native Czech

Author

Rosen, Alexandr, author.

Title

Compiling and Annotating a Learner Corpus for a Morphologically Rich Language : CzeSL, a corpus of non-native Czech / Alexandr Rosen, Jiří Hana, Barbora Hladká, Tomáš Jelínek, Svatava Škodová, Barbora Štindlová.

ISBN

9788024647654

8024647656

9788024647593

8024647591

Published

Prague : Karolinum Press, 2020.

Physical Description

281 pages : illustrations (some color) ; 23 cm

Format

Books

Language

English

Added to Catalog

April 20, 2021

Contents

Cover

Contents

List of abbreviations

Introduction

About this book

Reasons to study non-native Czech

Some properties of non-native Czech

Morphology

Syntax

Word segmentation

Learner corpus

Roadmap

Learner corpora

Terminology

Various types of learner corpora

The choice of texts

Annotation

Textual annotation

Linguistic annotation

Error annotation: correction

Error annotation: categorization

Annotation scheme

Data access

Some learner corpora

ASK

CLC

COPLE2

CroLTeC

Falko

ICLE

MERLIN

RLC

SweLL

Relationships of CzeSL with other learner corpora

Introducing the CzeSL project

Specifications of CzeSL

Intended usage

AKCES: the umbrella project

Procurement of texts

Text collection

Transcription

Anonymization

Metadata

Error annotation

Errors and learner language

More than one way to annotate errors in CzeSL

A wishlist for error annotation

Interference and other types of explanation

Interpretation in terms of TH

Word order

Style

Communication goal

The two-tier annotation scheme

Annotation scheme as a compromise

Why multiple tiers

How many tiers

Multiple tiers in a tabular format

Content of the tiers

A sample text with T1 vs. T2 corrections

Links between tiers

Error tags

Morphosyntactic references

Follow-up corrections

Alternative target hypotheses

Error tagset

Based on linguistic categories

Grammar-based vs. formal errors

Extent of the annotated unit

Grammar-based tags

Errors at T1

Errors at T2

Coarse-grained

An example of complex annotation

Evaluation of the manual tiered error annotation

Inter-annotator agreement (IAA)

A pilot annotation

IAA on all doubly-annotated texts

Error tags depend on target hypothesis

Possible causes of the annotators' disagreements

Formal tags

Automatic extension and modification of error annotation

Automatic detection of formal errors on T1

Formal orthographic errors

Formal errors sometimes influencing pronunciation

Formal errors influencing pronunciation

Other types of errors

Automatic classification of word-boundary errors

Implicit error annotation

Multi-dimensional error annotation (MD)

Focus on morphology

All annotation applied to the source text

Extent of the annotated unit

Alternative error domains

Source text, target hypothesis, annotated strings

Domains and features

Linguistic annotation

Annotation with tools for Standard Czech

Annotation of target hypothesis

Annotation of T1

Annotation of source texts

Annotation of interlanguage in UD

Tokenization

Part-of-speech and morphology

Lemmata

Syntactic Structure

Evaluation

Annotation process

Overview of the annotation process

Transcription and anonymization of manuscripts

Tiered error annotation

Manual error annotation

Subjects

Corpora (Linguistics)

Czech language.

Also listed under

Hana, Jiří, author.

Vidová-Hladká, Barbora, 1971- author.

Jelínek, Tomáš, author.

Boyon Škodová, Svatava, 1974- author.

Štindlová, Barbora, author.

Bookmark As

https://search.library.yale.edu/catalog/15777295
