Name: Multi-speaker AIOI dataset
The AIOI dataset consists of 60 spoken sentences composed of 5 words, each built from the 5 Japanese vowels: {aioi, aue, ao, ie, uo}.
By connecting the words, 30 sentences were prepared: all 25 possible two-word sentences, e.g., “aioi ao,” “aue aue,” and “ie aioi,”
and 5 three-word sentences, namely “ie ie uo,” “uo aue ie,” “ao ie ao,” “aue ao ie,” and “aioi uo ie.”
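The sentence counts above can be checked with a short sketch. It assumes that "all possible two-word sentences" means every ordered pair of the 5 words with repetition allowed (consistent with the example “aue aue”); the variable names are illustrative only and are not part of the dataset.

```python
from itertools import product

# The 5 words of the dataset.
WORDS = ["aioi", "aue", "ao", "ie", "uo"]

# Assumption: two-word sentences are all ordered pairs, repetition allowed.
two_word = [" ".join(pair) for pair in product(WORDS, repeat=2)]

# The 5 three-word sentences listed in the description.
three_word = ["ie ie uo", "uo aue ie", "ao ie ao", "aue ao ie", "aioi uo ie"]

sentences = two_word + three_word
print(len(two_word), len(three_word), len(sentences))  # 25 5 30

# Each sentence is spoken twice.
print(len(sentences) * 2)  # 60
```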
Each sentence is spoken twice by a native Japanese speaker and recorded in the dataset.
There are 4 speakers: 2 male speakers (speaker_H and speaker_N) and 2 female speakers (speaker_K and speaker_M).
For each speaker, the following directories contain the raw speech data (.wav), phoneme labels (.lab), and word labels (.lab2):
- DATA/ (Raw speech data)
- PHONELABEL/ (Phoneme labels)
- WORDLABEL/ (Word labels)
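
A minimal sketch for indexing the files by speaker and type is given below. It assumes only that each file sits somewhere under a directory named after its speaker (speaker_H, speaker_N, speaker_K, or speaker_M) and that the extensions are as listed above; the exact nesting of DATA/, PHONELABEL/, and WORDLABEL/ is not assumed. The function name `index_files`, the kind names, and the root path `AIOI_dataset` are hypothetical.

```python
from pathlib import Path

SPEAKERS = ["speaker_H", "speaker_N", "speaker_K", "speaker_M"]

# File extension -> kind of content (kind names are illustrative).
EXT_TO_KIND = {".wav": "speech", ".lab": "phoneme_labels", ".lab2": "word_labels"}


def index_files(root):
    """Group dataset files by speaker and kind.

    Assumption: a directory named after the speaker appears in each file path.
    """
    index = {spk: {kind: [] for kind in EXT_TO_KIND.values()} for spk in SPEAKERS}
    for path in sorted(Path(root).rglob("*")):
        kind = EXT_TO_KIND.get(path.suffix)
        speaker = next((part for part in path.parts if part in SPEAKERS), None)
        if kind and speaker and path.is_file():
            index[speaker][kind].append(path)
    return index


if __name__ == "__main__":
    # Hypothetical root directory; replace with the actual dataset location.
    for speaker, kinds in index_files("AIOI_dataset").items():
        print(speaker, {kind: len(files) for kind, files in kinds.items()})
```

If the description holds, each speaker should end up with the same number of .wav, .lab, and .lab2 files, which is a quick sanity check after downloading.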