1. MaryTTS and its support for languages

1.1. General

MaryTTS is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. (BTW: Does it have a service page?)

It supports many languages, each with its own set of voices.

Important: A language is NOT the same thing as a voice. To function properly, both the language files and the voice files are required! A language may have several corresponding voices.

A language file is just a .jar, e.g. "lib/marytts-lang-de-5.1.2.jar".

A voice pack consists of one .jar, e.g. "lib/voice-dfki-pavoque-neutral-5.1.jar", and sometimes some .mry files (not sure what they do yet). Not every voice has .mry files, but if a voice has them, they are required. They are stored at "lib/voices/dfki-pavoque-neutral/[...].mry".
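The file layout described above can be sketched in Java as follows. This is only an illustration: `VoicePack` and its methods are hypothetical names, not part of MaryTTS, and the naming pattern is assumed from the two example paths above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of MaryTTS): locates the files of one voice pack. */
final class VoicePack {
    private VoicePack() {}

    /** Path of the voice jar, e.g. lib/voice-dfki-pavoque-neutral-5.1.jar (assumed pattern). */
    static Path jarPath(Path libDir, String voiceName, String version) {
        return libDir.resolve("voice-" + voiceName + "-" + version + ".jar");
    }

    /** Directory that may hold the .mry files, e.g. lib/voices/dfki-pavoque-neutral. */
    static Path mryDir(Path libDir, String voiceName) {
        return libDir.resolve("voices").resolve(voiceName);
    }

    /** True if the voice jar exists; the .mry directory is optional, so it is not checked. */
    static boolean isInstalled(Path libDir, String voiceName, String version) {
        return Files.isRegularFile(jarPath(libDir, voiceName, version));
    }

    public static void main(String[] args) {
        Path lib = Paths.get("lib");
        String voice = "dfki-pavoque-neutral";
        System.out.println("jar:       " + jarPath(lib, voice, "5.1"));
        System.out.println("mry dir:   " + mryDir(lib, voice));
        System.out.println("installed: " + isInstalled(lib, voice, "5.1"));
    }
}
```

The check only looks at the jar because, as noted above, not every voice ships .mry files.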

 

1.2. Structure

My guess at what each directory is for:

bin: startup scripts for the MaryTTS server, the client, and a component installer
doc: examples
download: temporary directory for downloads
installed: config / info files for the languages and voices
lib: MaryTTS dependencies, all .jar files for languages and voices, and the .mry files for all voices
log: log files
user-dictionaries: dictionaries for some languages (not completely sure about them)

 

1.3. Included Voices (in the official MaryTTS distribution)

There may be more voices (and languages) available, but these are the voices officially registered in MaryTTS and therefore distributed with it (through an additional download, which MaryTTS handles itself using the component installer).

NOTE: I don't feel responsible for errors in this matrix.

DATE: 06.11.2015

| Language | Voice-name | gender | type | version | description | license | size | depends |
|---|---|---|---|---|---|---|---|---|
| DE | bits1-hsmm | female | hsmm | 5.1 | [TODO] | [TODO] | 1360761 | DE, 5.1 |
| DE | bits3 | male | unit selection | 5.1 | | | 278237075 | DE, 5.1 |
| DE | bits3-hsmm | male | hsmm | 5.1 | | | 1557124 | DE, 5.1 |
| DE | dfki-pavoque-neutral | male | unit selection | 5.1 | | | 446054145 | DE, 5.1 |
| DE | dfki-pavoque-neutral-hsmm | male | hsmm | 5.1 | | | 2835023 | DE, 5.1 |
| DE | dfki-pavoque-styles | male | unit selection | 5.1 | | | 692113207 | DE, 5.1 |
| EN_GB | dfki-poppy | female | unit selection | 5.1 | | | 99318417 | EN-GB, 5.1 |
| EN_GB | dfki-poppy-hsmm | female | hsmm | 5.1 | | | 1015901 | EN-GB, 5.1 |
| EN_GB | dfki-prudence | female | unit selection | 5.1 | | | 250841190 | EN-GB, 5.1 |
| EN_GB | dfki-prudence-hsmm | female | hsmm | 5.1 | | | 1560473 | EN-GB, 5.1 |
| EN_GB | dfki-obadiah | male | unit selection | 5.1 | | | 146431509 | EN-GB, 5.1 |
| EN_GB | dfki-obadiah-hsmm | male | hsmm | 5.1 | | | 1216409 | EN-GB, 5.1 |
| EN_GB | dfki-spike | male | unit selection | 5.1 | | | 136165028 | EN-GB, 5.1 |
| EN_GB | dfki-spike-hsmm | male | hsmm | 5.1 | | | 1083544 | EN-GB, 5.1 |
| EN_US | cmu-slt | female | unit selection | 5.1 | | | 105909149 | EN-US, 5.1 |
| EN_US | cmu-bd1-hsmm | male | hsmm | 5.1 | | | 1017477 | EN-US, 5.1 |
| EN_US | cmu-rms-hsmm | male | hsmm | 5.1 | | | 1028060 | EN-US, 5.1 |
| FR | enst-camille | female | unit selection | 5.1 | | | 183466604 | FR, 5.1 |
| FR | enst-camille-hsmm | female | hsmm | 5.1 | | | 1518635 | FR, 5.1 |
| FR | upmc-jessica | female | unit selection | 5.1 | | | 126834351 | FR, 5.1 |
| FR | upmc-jessica-hsmm | female | hsmm | 5.1 | | | 1118972 | FR, 5.1 |
| FR | enst-dennys-hsmm | male | hsmm | 5.1 | | | 1676376 | FR, 5.1 |
| FR | upmc-pierre | male | unit selection | 5.1 | | | 171764059 | FR, 5.1 |
| FR | upmc-pierre-hsmm | male | hsmm | 5.1 | | | 1557436 | FR, 5.1 |
| IT | istc-lucia-hsmm | female | hsmm | 5.1 | | | 1466943 | IT, 5.1 |
| RU | voxforge-ru-nsh | male | unit selection | 5.1 | | | 175120753 | RU, 5.1 |
| TE | cmu-nk | female | unit selection | 5.1 | | | 495885808 | TE, 5.1 |
| TE | cmu-nk-hsmm | female | hsmm | 5.1 | | | 3397557 | TE, 5.1 |
| TR | dfki-ot | male | unit selection | 5.1 | | | 157783972 | TR, 5.1 |
| TR | dfki-ot-hsmm | male | hsmm | 5.1 | | | 1366536 | TR, 5.1 |

 

1.4. Licenses

Many voices have their own license; this is something to keep an eye on!

 

2. MaryTTS in MyRobotLab

To use MaryTTS in MyRobotLab, its dependencies are required (not so surprising). From what I know, the server should be sufficient and the client could be skipped.

Furthermore, we need the language files (e.g. "lib/marytts-lang-de-5.1.2.jar") on the classpath.

Then we need the voice file on the classpath.

The lib and installed directories are also needed. I think user-dictionaries is needed as well, but I'm not sure at all about download.

The voice can be changed in several ways:

locale: marytts.setLocale(Locale.GERMAN);

voice: marytts.setVoice("dfki-pavoque-neutral");
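
To pick the locale from one of the language codes in the voice table (DE, EN_GB, ...), a small conversion to `java.util.Locale` could look like the sketch below. `VoiceLocales` and `toLocale` are hypothetical names for illustration; whether MaryTTS accepts every resulting locale is an assumption that depends on which language files are installed.

```java
import java.util.Locale;

/** Hypothetical helper: converts the language codes used in the voice table to java.util.Locale. */
final class VoiceLocales {
    private VoiceLocales() {}

    /** Accepts codes such as "DE", "EN_GB" or "EN-GB". */
    static Locale toLocale(String tableCode) {
        // Locale.forLanguageTag expects '-' as separator and normalizes the case itself.
        return Locale.forLanguageTag(tableCode.trim().replace('_', '-'));
    }

    public static void main(String[] args) {
        String[] codes = {"DE", "EN_GB", "EN_US", "FR", "IT", "RU", "TE", "TR"};
        for (String code : codes) {
            System.out.println(code + " -> " + toLocale(code).toLanguageTag());
        }
    }
}
```

The returned value could then be handed to the setLocale call shown above, e.g. `marytts.setLocale(VoiceLocales.toLocale("EN_GB"))`.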

 

The biggest problem I see is that downloading all voices would take around 5 GB.
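
To get a feeling for where the space goes, the sizes from the voice table can be summed per voice type. The sketch below does this for the six German voices only, assuming the size column is in bytes (the table does not state the unit). `VoiceSizes` and `Row` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: sums voice sizes per type for some rows of the voice table. */
final class VoiceSizes {
    private VoiceSizes() {}

    /** One table row, reduced to the voice type and the size (assumed to be bytes). */
    static final class Row {
        final String type;
        final long size;

        Row(String type, long size) {
            this.type = type;
            this.size = size;
        }
    }

    /** The six German (DE) rows, copied from the table. */
    static final Row[] GERMAN = {
        new Row("hsmm", 1360761L),             // bits1-hsmm
        new Row("unit selection", 278237075L), // bits3
        new Row("hsmm", 1557124L),             // bits3-hsmm
        new Row("unit selection", 446054145L), // dfki-pavoque-neutral
        new Row("hsmm", 2835023L),             // dfki-pavoque-neutral-hsmm
        new Row("unit selection", 692113207L), // dfki-pavoque-styles
    };

    /** Adds up the sizes of all rows, grouped by voice type. */
    static Map<String, Long> totalByType(Row[] rows) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Row row : rows) {
            totals.merge(row.type, row.size, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : totalByType(GERMAN).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

For the German rows, the three hsmm voices together stay below 6 MB, while the three unit selection voices add up to roughly 1.4 GB, so the unit selection voices are what makes a full download large.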

 

3. Sources

MaryTTS homepage -> http://mary.dfki.de/

MaryTTS GitHub main repository -> https://github.com/marytts/marytts

GroG

9 years 1 month ago

Excellent post, MaVo!

This explains a lot. I would recommend that we surface controls that allow the user to select the voices & languages they are interested in during MRL runtime.

As for the service page, yes, we'll create one. But a higher priority is cleaning up the "Speech" services.

We need to create interfaces to standardize the Big 3:

  • MarySpeech - MaryTTS (our new hero)
  • AcapelaSpeech - nice-sounding, Google-like service
  • GoogleSpeech - probably still borked with interface/key change

I've also added much more functionality to the AudioFile service: it can now play "Tracks" similar to a mixer. It will potentially need some work or refactoring to get the speech services working correctly with it (and caching in the case of Mary).

Great post of information MaVo - this will help us considerably going forward. :D

I think the user dictionaries are used to prompt MaryTTS how to pronounce words that it has problems with, or that differ due to regional variations.

I wonder how much work it would be to modify the voice installer to download the voices a user is interested in and copy them to the correct locations in the MRL distribution.

A function call to support the various effects would be nice to have as well:

Volume amount:2.0;
TractScaler amount:1.5;
F0Scale f0Scale:2.0;
F0Add f0Add:50.0;
Rate durScale:1.5;
Robot amount:100.0;
Whisper amount:100.0;
Stadium amount:100.0
Chorus delay1:466;amp1:0.54;delay2:600;amp2:-0.10;delay3:250;amp3:0.30
FIRFilter type:3;fc1:500.0;fc2:2000.0
JetPilot 
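
As a starting point for such a function call, the lines above can be parsed into an effect name plus a parameter map. This is only a sketch: `EffectSpec` is a hypothetical class, and how the parsed values would be passed on to MaryTTS is left open here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for one effect line such as "Volume amount:2.0;". */
final class EffectSpec {
    final String name;
    final Map<String, String> params;

    private EffectSpec(String name, Map<String, String> params) {
        this.name = name;
        this.params = params;
    }

    /** Splits a line into the effect name and its "key:value" pairs separated by ';'. */
    static EffectSpec parse(String line) {
        String trimmed = line.trim();
        int space = trimmed.indexOf(' ');
        String name = space < 0 ? trimmed : trimmed.substring(0, space);
        String rest = space < 0 ? "" : trimmed.substring(space + 1);

        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : rest.split(";")) {
            int colon = pair.indexOf(':');
            if (colon > 0) { // skips empty pieces, e.g. for effects without parameters
                params.put(pair.substring(0, colon).trim(), pair.substring(colon + 1).trim());
            }
        }
        return new EffectSpec(name, params);
    }

    public static void main(String[] args) {
        EffectSpec chorus = parse("Chorus delay1:466;amp1:0.54;delay2:600;amp2:-0.10;delay3:250;amp3:0.30");
        System.out.println(chorus.name + " " + chorus.params);
        EffectSpec jet = parse("JetPilot ");
        System.out.println(jet.name + " " + jet.params);
    }
}
```

The parameter values are kept as strings so that no assumption is made about which effects take integers and which take decimals.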

Hi,

It's great to have a robot that speaks a lot of languages, but if I'm right, Sphinx only recognizes English in MRL.

I know that only the model file needs to be changed, but as with MaryTTS, downloading all the models is huge.

So for both, the good way (I think) is to write a function that checks whether the model for the requested language has already been added to the MRL path. If not, download it.
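
A minimal sketch of that check-then-download idea, with the actual download left as a callback because the download location is not known here. `ModelCache`, `ensurePresent` and the example path are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

/** Hypothetical helper: triggers a download only if the requested file is not there yet. */
final class ModelCache {
    private ModelCache() {}

    /**
     * Calls the downloader if the target does not exist.
     * Returns true if the downloader was called, false if the target was already present.
     */
    static boolean ensurePresent(Path target, Consumer<Path> downloader) {
        if (Files.exists(target)) {
            return false;
        }
        downloader.accept(target);
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical location of a German model or voice file.
        Path model = Paths.get("models", "de.model");
        boolean triggered = ensurePresent(model, p -> System.out.println("would download to " + p));
        System.out.println("download triggered: " + triggered);
    }
}
```

The same function would work for MaryTTS voices and for Sphinx models, since it only needs the target path and something that knows how to fetch the file.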

 

 

The current plan for MaryTTS is to provide a component installer where you can select which voices you want to download, ...

At the moment Sphinx isn't used as much as before because it cannot support free-form recognition. The recognition in Chrome will probably become the new "default" speech recognition.

Greetings, MaVo