[Philadelphia, PA] : Linguistic Data Consortium, 2018.
Physical Description
1 DVD-ROM ; 4 3/4 in.
Notes
Title from disc label.
"LDC2018S11."
Data type(s): Sound, text.
Data source(s): Broadcast conversation.
Application(s): Speech recognition.
Authors: Carlos Daniel Hernández Mena.
In Spanish.
Access and use
Access restricted by licensing agreement.
Summary
"CIEMPIESS (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) Balance was developed by the Development of Speech Technologies program at the School of Engineering at the National Autonomous University of Mexico (UNAM) and consists of approximately 18 hours of Mexican Spanish broadcast speech with associated transcripts. The goal of this work was to create acoustic models for automatic speech recognition. For more information and documentation, see the CIEMPIESS-UNAM Project website. CIEMPIESS Balance is a companion corpus to CIEMPIESS Light, released by LDC as LDC2017S23. It was developed so that the data sets together constitute a gender-balanced corpus. The gender breakdown in CIEMPIESS Light is approximately 75% male and 25% female. In CIEMPIESS Balance the gender breakdown is approximately 25% male and 75% female." ---LDC online catalog.
Variant and related titles
Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social