Linguistics | Seminar in Computational Linguistics
L715 | 27567 | Kuebler

Seminar in Computational Linguistics: CL Resources for Less Commonly
Taught Languages

3 Credits

Research in Computational Linguistics (CL) in the last two decades has
been hampered by the data-sparseness bottleneck. The majority of high-
accuracy applications in CL are data-driven and thus require large
amounts of annotated data for training. Such annotated corpora are
available for a small number of major languages such as English,
German, and French. In this course, we will investigate methods for
producing such resources for other languages. One possible strategy is
to adapt resources from a closely related language. Another is to use
cross-language projection, in which a parallel corpus and resources for
the source language are used to induce annotations for the target
language.

One goal of the seminar is to introduce students to the state of the
art in this research area. Another goal is for students to acquire the
skills necessary to write a successful research paper. To practice the
latter, we will analyze research papers not only for their content but
also for their structure, and we will practice writing parts of papers
in class.

Optional textbook:

A. Feldman, J. Hana (2010). A Resource-light Approach to Morpho-
syntactic Tagging. Rodopi.