    MASC: A Speech Corpus in Mandarin for Emotion Analysis and Affective Speaker Recognition
    Tian Wu, Yingchun Yang, Zhaohui Wu and Dongdong Li
    CCNT Lab, College of Computer Science and Technology, Zhejiang University, Hangzhou, P.R. China
    {wutian, yyc, wzh, lidd}@zju.edu.cn
    Abstract
    This paper introduces MASC (Mandarin Affective Speech Corpus), a large emotional speech database. The database contains recordings of 68 native speakers (23 female and 45 male) in five emotional states: neutral, anger, elation, panic and sadness. Each speaker pronounces 5 phrases and 10 sentences three times in each emotional state, and reads 2 paragraphs in the neutral state only. These materials cover all the phonemes in Chinese. The corpus is constructed for prosodic and linguistic investigation of emotion expression in Mandarin, and it can also be used for recognition of affectively stressed speakers. In addition, a prosodic feature analysis and a baseline speaker recognition experiment are performed on this database.
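The recording design stated in the abstract implies a rough corpus size, which the following minimal sketch works out. It assumes that both phrases and sentences are repeated three times in every emotional state and that every speaker completed every item; the function name and its parameters are hypothetical, and the resulting figures are derived from the abstract rather than reported by the authors.

```python
def masc_recording_counts(speakers=68, emotions=5, phrases=5, sentences=10,
                          repetitions=3, neutral_paragraphs=2):
    """Estimate recording counts from the design described in the abstract.

    Assumption: phrases and sentences are each repeated `repetitions` times
    in every emotional state, and paragraphs are read once, in neutral only.
    """
    # Short items: every speaker, every emotion, every phrase/sentence, each repetition.
    utterances = speakers * emotions * (phrases + sentences) * repetitions
    # Paragraphs: neutral state only, one reading per paragraph per speaker.
    paragraphs = speakers * neutral_paragraphs
    return {
        "utterances": utterances,
        "paragraphs": paragraphs,
        "total": utterances + paragraphs,
    }


counts = masc_recording_counts()
print(counts)
```

Under these assumptions the design yields 15,300 phrase and sentence recordings plus 136 neutral paragraph readings. Changing a parameter shows how the total would shift under a different reading of the abstract.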
    1. Introduction
    How humans express emotion, and how changes in a speaker's emotional state affect speech, have intrigued researchers for a long time. Psychologists have conducted many experiments and proposed a variety of theories [1]. However, collecting a large-scale affective speech corpus is a very difficult task, and little work has been done in this area. Emotional Prosody Speech and Transcripts (EPST) is an emotional speech database provided by the Linguistic Data Consortium (LDC) [2]. This corpus covers 14 emotional states based on Banse & Scherer's selection criteria [3] and is designed to support research in emotional prosody. For speaker-independent emotion recognition, emotional databases have been recorded with the Sony AIBO entertainment robot as the target scenario. These databases simulate different possible situations and comprise all the desired emotions [4]. RUSLANA is a database of emotional utterances recorded in Russian, intended for linguistic and speech processing research on communicative and emotive-attitudinal aspects of spoken language [5]. Sixty-one native speakers of standard Russian were recorded for this database.

    As the work above shows, emotion recognition and analysis is an active area of academic and applied research. So far, however, there is no large speech database for affective speaker recognition. Our motivation for creating an emotional speech corpus arises from the mismatch problem in automatic speaker recognition. Current speaker verification and identification systems are limited by the effect on speech of transient changes in the speaker's state. Intra-speaker variability can cause unacceptably high error rates [6]. Furthermore, emotional speech research has so far focused on a few major languages such as English, German, French and Russian. Very little is known about the vocal correlates of emotion in continuous spoken Mandarin. Our goal is to provide a large corpus in Chinese designed
