🎧 F5-TTS Emotional Speech Demo

Within each table, every row is the same phrase synthesized with a different target emotion.

All audio samples below were generated by running inference on texts and reference audio that were unseen during training (taken from the validation set).

Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral Her shoes were like fishes.
Happy
Sad
Angry
Surprise
Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral They went up to the dark mass Job had pointed out.
Happy
Sad
Angry
Surprise
Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral Clear then clear water.
Happy
Sad
Angry
Surprise
Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral Mister chairman, I move for a division.
Happy
Sad
Angry
Surprise
Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral She is now choosing skirt to wear.
Happy
Sad
Angry
Surprise
Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral All smiles were real and the happier the more sincere.
Happy
Sad
Angry
Surprise
Reference Audio Generated Audio (Intensity = 10) Emotion Text
Neutral Monster made a deep bau.
Happy
Sad
Angry
Surprise