Notes
Applications: speech recognition.
Authors: Beijing Magic Data Technology Co., Ltd.
Data source: microphone conversation.
LDC number: LDC2019S23.
In Mandarin Chinese.
Title from resource home page (LDC website, viewed September 28, 2020).
Summary
"Magic Data Chinese Mandarin Conversational Speech was developed by Beijing Magic Data Technology Co., Ltd. and consists of approximately 10 hours of Mandarin conversational speech from 60 speakers. Each conversation was recorded on multiple devices and is presented in multiple forms, resulting in a total of approximately 60 hours of audio with corresponding transcripts. All participants were native speakers of Mandarin in Mainland China from accent regions across the country. Speakers were paired for conversations on a range of topics, including travel, fitness, games, sports and pets. Speech data was recorded on mobile devices and is presented as 16kHz, 16-bit flac compressed pcm wav. Most files are single channel; however, a stereo version of each conversation is also included. Transcript data is contained in UTF-8 encoded plain text TextGrids. Metadata such as topic, collection date, mobile device and speaker demographic information is found in the documentation accompanying this release." --LDC online catalog.