BUT-CZAS – Database of Czech anechoic speech

Authors
Vojtech Hajek
Pavol Harar *administrator
Jiri Schimmel
Radim Burget

Description
BUT-CZAS (Brno University of Technology, Czech Anechoic Speech) is a database of human voice recordings made in an anechoic chamber at Brno University of Technology in Brno, Czech Republic. The database consists of 405 mono recordings (bit depth: 24 bit, sampling rate: 48 kHz) of Czech speech by 18 speakers (9 women, 9 men) reading a script. The speakers were between 16 and 76 years old. The total duration of all recordings is 315 minutes. The recordings contain more than 40 000 spoken instances of 1 711 unique words. During recording, particular emphasis was placed on the recording environment, the quality of the recordings, and a balanced representation of age and sex groups. Each script is available as plain text.

How to cite?
Please cite the following paper:
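BUT-CZAS: Korpus kvalitních nahrávek české řeči pořízených v bezodrazové komoře (in English: BUT-CZAS: A corpus of high-quality recordings of Czech speech made in an anechoic chamber)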

Bibtex:

@article{hajek2018butczas,
title={BUT-CZAS: Korpus kvalitních nahrávek české řeči pořízených v bezodrazové komoře},
author={Hajek, Vojtech and Harar, Pavol and Schimmel, Jiri and Burget, Radim},
journal={Elektrorevue},
pages={48--52},
year={2018},
publisher={International Society for Science and Engineering, o.s.}}

License & Download
The BUT-CZAS database is available as a free download in a zip archive. It may be used for scientific and artistic purposes only; any other use is prohibited. By downloading the database you agree to these license terms and conditions. The download link will be published once the archive is complete.

Extending the database
If you own similar recordings and wish to extend this database, please contact the database administrator or anyone listed in the contacts. Before doing so, please make sure the recordings fulfill these requirements:

  • Recordings were made in an anechoic chamber or in an environment that is described in ISO 3745:2012.
  • Recordings were made using the same or similar equipment as the recordings in the original database.
  • Recordings are saved in .wav format with a sampling rate of at least 48 kHz and a bit depth of 24 bit.
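
The format requirement in the last item can be checked automatically before recordings are submitted. The following is only a minimal sketch, not an official tool of the database: it uses Python's standard `wave` module, and the function name `check_wav` and the constants are hypothetical names chosen for illustration. It checks only the sampling rate and bit depth; the recording environment and equipment cannot be verified from the file itself.

```python
import wave

# Thresholds taken from the requirements listed above.
MIN_SAMPLE_RATE_HZ = 48_000       # "at least 48 kHz"
REQUIRED_SAMPLE_WIDTH_BYTES = 3   # 24 bit = 3 bytes per sample


def check_wav(path):
    """Return a list of problems found in the file; an empty list means it passes.

    Hypothetical helper for illustration only.
    """
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            width = wav.getsampwidth()
    except (wave.Error, EOFError) as exc:
        # The wave module reads uncompressed PCM only; other encodings end up here.
        return [f"not a readable PCM .wav file: {exc}"]

    problems = []
    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sampling rate is {rate} Hz, expected at least {MIN_SAMPLE_RATE_HZ} Hz")
    if width != REQUIRED_SAMPLE_WIDTH_BYTES:
        problems.append(f"bit depth is {width * 8} bit, expected {REQUIRED_SAMPLE_WIDTH_BYTES * 8} bit")
    return problems
```

Calling `check_wav("recording.wav")` on a conforming file returns an empty list. Note that 24-bit files are often stored with the WAVE_FORMAT_EXTENSIBLE header, which the `wave` module accepts only in newer Python versions; on older versions such files are reported as unreadable even though they may be valid.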

More technical details can be found in the article cited above.

References for the texts read by the speakers (ISO 690)
* The order of references follows the article.

[1] ADAMS, Douglas. Stopařův průvodce po Galaxii. Translated by Jana HOLLANOVÁ. Praha: Hynek, 1998. Fascinace. ISBN 80-86202-14-3.
[2] HEMINGWAY, Ernest. Stařec a moře. Translated by Šimon PELLAR. Praha: Odeon, 2015. ISBN 978-80-207-1621-7.
[4] JIROTKA, Zdeněk. Saturnin. 19th ed., 6th ed. by Šulc – Švarc. Praha: Šulc – Švarc, 2005. ISBN 80-7244-169-8.
[5] ORWELL, George. Farma zvířat: pohádkový příběh. Praha: Aurora, 2000. ISBN 80-7299-021-7.
[6] SAINT-EXUPÉRY, Antoine de. Malý princ. 9th ed. by Albatros. Translated by Zdeňka STAVINOHOVÁ. Praha: Albatros, 1998. ISBN 80-00-00586-7.
[7] STEINBECK, John. O myších a lidech. Praha: Československý spisovatel, 1960. Edice ilustrovaných novel.
[8] ŠEDIVÝ, Petr. Anglie a Skotsko se vzepřely vrchnosti, za vlčí máky mohou dostat trest. iDNES.cz, 2016. URL https://fotbal.idnes.cz/fot_reprez.aspx?c=A161112_010559_fot_reprez_pes