Harvesting Speech Datasets for Linguistic Research on the Web
Awardees: Mats Rooth, Cornell University, NSF; Michael Wagner, McGill University, SSHRC.
Description: This project will harvest audio and transcribed data from podcasts, news broadcasts, public and educational lectures and other sources to create a massive corpus of speech. Tools will then be developed to analyze the different uses of prosody (rhythm, stress and intonation) within spoken communication.
Official Website Unavailable
Related Websites:
Article in the Cornell Chronicle
Prosody Datasets Website: Examples