
Projects


The Annotated Text Processor (ATP)

ATP Media Player


[Screenshot: the ATP media player interface]

ATP has a media player interface that allows the user to play sound and video files and to link them directly with related text documents. The underlying media player control supports WAV, MPG, AVI, and MOV formats.

The "Select" button allows the user to browse for and open the desired media file, while the "Close" button flushes the media buffer and closes the file. The navigation bar sits just below the path name and provides "Play," "Pause," and "Stop" buttons. When a media file is opened (in this case a WAV sound file), its length is reported in the "Length" box. All durations and positions are reported in milliseconds (thousandths of a second). Thus the sound file in the example is 457.643 seconds (00:07:37.643 h:m:s) in length. In practice, the media player has been tested successfully with sound files of up to an hour in length. Whenever the user clicks the Play, Pause, or Stop button, the current position is reported in the "Position" box. (Both the Length and Position boxes are read-only.)
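The millisecond arithmetic above can be checked with a short sketch. The function name `format_position` is hypothetical and is not part of ATP; it only illustrates how a millisecond count maps to the h:m:s form quoted in the text.

```python
def format_position(ms: int) -> str:
    """Render a millisecond count as hh:mm:ss.mmm (illustrative helper, not ATP code)."""
    if ms < 0:
        raise ValueError("position must be non-negative")
    seconds, millis = divmod(ms, 1000)      # split off the thousandths
    minutes, seconds = divmod(seconds, 60)  # whole minutes and leftover seconds
    hours, minutes = divmod(minutes, 60)    # whole hours and leftover minutes
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# The example file from the text: 457643 ms in the Length box.
print(format_position(457643))  # 00:07:37.643
```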

While the sound is playing, the linguist can click the "Mark" button to learn the position of a segment boundary or interesting phonetic feature without stopping or pausing the sound. The "Play on Mark" button plays a predefined segment of the sound file (three seconds in this case) centered on the Mark itself. If the Mark does record a segment boundary, the linguist can use other buttons to transfer the Mark to the Start or End box and so define the segment exactly. The Play button always plays from the Start position to the End position. (Initially Start is zero and End equals Length.) Start and End can also be incremented or decremented by predefined amounts, or changed by hand. The Start and End positions of each segment can be saved with the text data and reloaded from it, provided the data model includes storage locations for them. This is how the linguist actively links sound and text data.
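The Mark, Start, and End behavior described above might be modeled roughly as follows. This is a sketch under stated assumptions, not ATP's implementation: the names `mark_window` and `Selection` are hypothetical, and clamping at the ends of the file is an assumption, since the text does not say what happens when a Mark lies near the beginning or end.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def mark_window(mark: int, length: int, window_ms: int = 3000) -> Tuple[int, int]:
    """Return the (from, to) span that is window_ms long and centered on the mark.

    Assumption: the span is clamped to [0, length] rather than shifted.
    """
    half = window_ms // 2
    return max(0, mark - half), min(length, mark + half)


@dataclass
class Selection:
    """Start/End positions in milliseconds for one segment (hypothetical model)."""
    length: int
    start: int = 0
    end: Optional[int] = None

    def __post_init__(self) -> None:
        if self.end is None:  # initially Start is zero and End equals Length
            self.end = self.length

    def set_start(self, pos: int) -> None:
        self.start = min(max(0, pos), self.end)          # keep 0 <= start <= end

    def set_end(self, pos: int) -> None:
        self.end = max(min(self.length, pos), self.start)  # keep start <= end <= length

    def nudge_end(self, delta_ms: int) -> None:
        self.set_end(self.end + delta_ms)  # increment or decrement by a fixed amount


sel = Selection(length=457643)
sel.set_start(12000)   # transfer a Mark to Start
sel.set_end(15250)     # transfer a later Mark to End
sel.nudge_end(-250)    # fine adjustment
print(sel.start, sel.end)            # 12000 15000
print(mark_window(120000, 457643))   # (118500, 121500)
```

A record like `Selection` holds only two integers per segment, which is all the text data needs to store in order to reload the link later.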

The media player serves purposes at three stages of linguistic analysis:

  1. The general goal is to produce a transcription of the audio or video recording, but at the earliest stage the linguist must discover how to divide the untranscribed recording into phrases, sentences, and other appropriate segments. The linguist can play and replay the media resource, explore it at will, decide on tentative segment positions, and create a very tentative text structure with minimal contents, all of which can easily be revised on closer examination.
  2. Once the recording has been marked initially, the detailed transcription can proceed in earnest. The linguist listens carefully to each marked segment and fills in and completes the text, adjusting and finalizing the associated media-file positions as he or she proceeds. On completion, the linguist has both an accurate text transcription and a fully analyzed sound or video resource which are linked to one another in detail.
  3. Once the transcription of this text is finished, and other texts are also transcribed and linked to recordings, the media player controls facilitate detailed and extensive searches of texts and recordings in support of further comparative or phonetic work.

In principle, the media player can be extended to directly handle DAT, cassette player, and CD audio, or any other device that supplies a Media Control Interface (MCI) driver. Such an extension would only require the addition of tools to handle tracks and relevant hardware switches.
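To give a rough sense of why an MCI driver is the only requirement, the sketch below builds the kind of MCI command strings that the Windows `mciSendString` function accepts. The helper names and the alias `atp` are hypothetical, and the source does not say that ATP issues commands in this form; the sketch only constructs strings and does not call any Windows API.

```python
def open_cmd(path: str, alias: str) -> str:
    """MCI 'open' command for a media file, giving it a short alias."""
    return f'open "{path}" alias {alias}'


def time_format_ms_cmd(alias: str) -> str:
    """Ask the device to report lengths and positions in milliseconds."""
    return f"set {alias} time format milliseconds"


def length_cmd(alias: str) -> str:
    """Query the total length of the opened media."""
    return f"status {alias} length"


def play_cmd(alias: str, start_ms: int, end_ms: int) -> str:
    """Play from the Start position to the End position."""
    if not 0 <= start_ms <= end_ms:
        raise ValueError("expected 0 <= start_ms <= end_ms")
    return f"play {alias} from {start_ms} to {end_ms}"


def close_cmd(alias: str) -> str:
    """Close the device and release the file."""
    return f"close {alias}"


# Hypothetical session for one segment of a WAV file.
for command in (
    open_cmd("example.wav", "atp"),
    time_format_ms_cmd("atp"),
    length_cmd("atp"),
    play_cmd("atp", 12000, 15000),
    close_cmd("atp"),
):
    print(command)
```

Because the same command vocabulary applies to any device type with an MCI driver, a device such as CD audio would mainly need extra commands for tracks, which is consistent with the extension described above.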


©2001, 2002, 2003, The Trustees of Indiana University