Suggestions for Text-To-Speech (TTS) from Org sources?

emacs-orgmode@gnu.org archives

* Suggestions for Text-To-Speech (TTS) from Org sources?
@ 2023-09-09 18:05 Jens Lechtenboerger
  2023-09-09 21:20 ` briangpowell
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-09 18:05 UTC (permalink / raw)
  To: emacs-orgmode

Dear all,

does someone here produce audio via Text-To-Speech (TTS) from Org
sources?  I plan to do that in the context of emacs-reveal to
generate voice-over for reveal.js presentations, with open questions
[1] concerning my initial, experimental approach.

Currently, I like the default model of Coqui-AI TTS [2] and
Microsoft SpeechT5 [3] best.  Any suggestions for free and open TTS
implementations that produce even better results?  Other models of
Coqui-AI?  The solution should work without GPU support, which seems
to rule out Suno Bark [4].
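
For reference, a CPU-only voice-over run with the Coqui command-line tool could be scripted roughly as follows.  This is only a sketch under assumptions: it presumes the `tts` command from [2] is installed and uses its `--text`, `--out_path`, and `--model_name` options; the helper names and the `slide-NN.wav` naming scheme are made up for illustration and are not part of emacs-reveal.

```python
import subprocess
from pathlib import Path


def tts_argv(text, out_path, model_name=None):
    """Build the command line for one invocation of Coqui's `tts` tool."""
    argv = ["tts", "--text", text, "--out_path", str(out_path)]
    if model_name:
        # Without --model_name the tool uses its default model.
        argv += ["--model_name", model_name]
    return argv


def plan_voice_over(notes, out_dir):
    """Return one command per slide; file names are a hypothetical scheme."""
    out_dir = Path(out_dir)
    return [
        tts_argv(text, out_dir / f"slide-{i:02d}.wav")
        for i, text in enumerate(notes, start=1)
    ]


def synthesize(notes, out_dir):
    """Run the planned commands in order (requires the `tts` tool)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for argv in plan_voice_over(notes, out_dir):
        subprocess.run(argv, check=True)
```

Calling `synthesize(["Welcome.", "Agenda."], "audio")` would then write one WAV file per slide; no GPU-related option is passed anywhere.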

The above models do not pronounce numbers/digits, and they fail to
pronounce most acronyms.  In a preprocessing step I could replace
those.  I use preprocessing anyway to get rid of Org markup that
might confuse the language models.  Anyone here who did that
already?  Maybe gruut [5] in conjunction with SSML [6] handling?
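
A preprocessing step along these lines could be sketched in Python as follows.  It is deliberately crude and rests on several assumptions: the acronym table and all function names are invented for illustration, only a small subset of Org syntax is handled with regular expressions (a real solution might rather work on Org's parse tree or a custom export backend), and integers are spelled out in English only.

```python
import re

# Hypothetical acronym table; extend it for your own material.
ACRONYMS = {"TTS": "T T S", "GPU": "G P U", "SSML": "S S M L"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out 0..999 in English; larger values are read digit by digit."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    return " ".join(ONES[int(d)] for d in str(n))


def strip_org_markup(text):
    """Remove a small subset of Org markup (links, keywords, emphasis)."""
    # [[target][description]] -> description, [[target]] -> target
    text = re.sub(r"\[\[[^\]]*\]\[([^\]]*)\]\]", r"\1", text)
    text = re.sub(r"\[\[([^\]]*)\]\]", r"\1", text)
    lines = []
    for line in text.splitlines():
        if line.lstrip().startswith("#+"):
            continue  # drop keyword lines such as "#+TITLE: ..."
        lines.append(re.sub(r"^\*+\s+", "", line))  # drop heading stars
    text = "\n".join(lines)
    # *bold*, /italic/, _underline_, =verbatim=, ~code~ -> inner text
    return re.sub(r"(?<!\w)([*/_=~])(\S(?:.*?\S)?)\1(?!\w)", r"\2", text)


def normalize_for_tts(text):
    """Strip markup, then expand known acronyms and integers."""
    text = strip_org_markup(text)
    text = re.sub(r"\b[A-Z]{2,}\b",
                  lambda m: ACRONYMS.get(m.group(0), m.group(0)), text)
    # Note: decimals, dates, and ordinals are not handled here.
    return re.sub(r"\d+", lambda m: number_to_words(int(m.group(0))), text)
```

For example, `normalize_for_tts("*TTS* needs 2 steps, not 42.")` yields "T T S needs two steps, not forty-two.".  Acronyms missing from the table are passed through unchanged.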

Any other suggestions?

Best wishes
Jens

[1] https://gitlab.com/oer/emacs-reveal/-/issues/20
[2] https://github.com/coqui-ai/TTS/
[3] https://huggingface.co/microsoft/speecht5_tts
[4] https://github.com/suno-ai/bark
[5] https://github.com/rhasspy/gruut
[6] https://www.w3.org/TR/speech-synthesis11/


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-09 18:05 Suggestions for Text-To-Speech (TTS) from Org sources? Jens Lechtenboerger
@ 2023-09-09 21:20 ` briangpowell
  2023-09-10 14:35   ` Jens Lechtenboerger
  2023-09-10 10:43 ` Ihor Radchenko
  2023-09-28 13:11 ` Jens Lechtenboerger
  2 siblings, 1 reply; 19+ messages in thread
From: briangpowell @ 2023-09-09 21:20 UTC (permalink / raw)
  To: emacs-orgmode

I've turned OrgMode files into audio desktops

It was pretty simple

Just find the code that reveals what an icon is when you hover over it &
pipe it to some text-to-speech engine & then on to usual routes
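
As a minimal sketch of the "pipe it to a TTS engine" step, the snippet below feeds text to an engine on standard input.  The choice of espeak-ng, with its `--stdin` and `-w` options, is an assumption (the message above does not name an engine), and the function names are hypothetical.

```python
import subprocess


def espeak_ng_argv(wav_path):
    """Command line asking espeak-ng to read stdin and write a WAV file."""
    return ["espeak-ng", "--stdin", "-w", wav_path]


def pipe_text(text, argv):
    """Send text to the engine's standard input; raise if the engine fails."""
    subprocess.run(argv, input=text, text=True, check=True)


# Usage (requires espeak-ng to be installed):
#   pipe_text("Hello from Org mode.", espeak_ng_argv("hello.wav"))
```

Any other engine that reads text from standard input could be substituted by passing a different argument list to `pipe_text`.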


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-09 18:05 Suggestions for Text-To-Speech (TTS) from Org sources? Jens Lechtenboerger
  2023-09-09 21:20 ` briangpowell
@ 2023-09-10 10:43 ` Ihor Radchenko
  2023-09-10 14:39   ` Jens Lechtenboerger
  2023-09-28 13:11 ` Jens Lechtenboerger
  2 siblings, 1 reply; 19+ messages in thread
From: Ihor Radchenko @ 2023-09-10 10:43 UTC (permalink / raw)
  To: Jens Lechtenboerger; +Cc: emacs-orgmode

Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:

> does someone here produce audio via Text-To-Speech (TTS) from Org
> sources?  I plan to do that in the context of emacs-reveal to
> generate voice-over for reveal.js presentations, with open questions
> [1] concerning my initial, experimental approach.

Emacspeak is a mature Emacs solution for TTS. However, it targets blind
users, not presentations. Still,
http://tvraman.github.io/emacspeak/manual/Quick-Installation.html might
be a good starting point for TTS options.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-09 21:20 ` briangpowell
@ 2023-09-10 14:35   ` Jens Lechtenboerger
  0 siblings, 0 replies; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-10 14:35 UTC (permalink / raw)
  To: briangpowell; +Cc: emacs-orgmode

On 2023-09-09, briangpowell wrote:

> I've turned OrgMode files into audio desktops
>
> It was pretty simple
>
> Just find the code that reveals what an icon is when you hover over it &
> pipe it to some text-to-speech engine & then on to usual routes

Thank you for the reply.  In my case (GNU/Linux), I guess that the
usual text-to-speech engine is espeak.  I find its speech quality to
be disappointing (much worse than the language models that I
mentioned earlier).  I am looking for (near-) human quality...

Best wishes
Jens


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-10 10:43 ` Ihor Radchenko
@ 2023-09-10 14:39   ` Jens Lechtenboerger
  2023-09-10 20:08     ` Christian Thäter
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-10 14:39 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

On 2023-09-10, Ihor Radchenko wrote:

> Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:
>
>> does someone here produce audio via Text-To-Speech (TTS) from Org
>> sources?  I plan to do that in the context of emacs-reveal to
>> generate voice-over for reveal.js presentations, with open questions
>> [1] concerning my initial, experimental approach.
>
> Emacspeak is a mature Emacs solution for TTS. However, it targets blind
> users, not presentations. Still,
> http://tvraman.github.io/emacspeak/manual/Quick-Installation.html might
> be a good starting point for TTS options.

Thank you for the suggestion.  With espeak this indeed pronounces
numbers and abbreviations but its audio quality is not good enough
for my purposes.  I am looking for (near-) human voices...

Best wishes
Jens



* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-10 14:39   ` Jens Lechtenboerger
@ 2023-09-10 20:08     ` Christian Thäter
  2023-09-11  8:33       ` Jens Lechtenboerger
  2023-09-11  9:14       ` briangpowell
  0 siblings, 2 replies; 19+ messages in thread
From: Christian Thäter @ 2023-09-10 20:08 UTC (permalink / raw)
  To: emacs-orgmode

On Sun, 10 Sep 2023 16:39:26 +0200
Jens Lechtenboerger <lechten@wi.uni-muenster.de> wrote:

> On 2023-09-10, Ihor Radchenko wrote:
> 
> > Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:
> >  
> >> does someone here produce audio via Text-To-Speech (TTS) from Org
> >> sources?  I plan to do that in the context of emacs-reveal to
> >> generate voice-over for reveal.js presentations, with open
> >> questions [1] concerning my initial, experimental approach.  
> >
> > Emacspeak is a mature Emacs solution for TTS. However, it targets blind
> > users, not presentations. Still,
> > http://tvraman.github.io/emacspeak/manual/Quick-Installation.html
> > might be a good starting point for TTS options.
>
> Thank you for the suggestion.  With espeak this indeed pronounces
> numbers and abbreviations but its audio quality is not good enough
> for my purposes.  I am looking for (near-) human voices...

using mbrola is probably as good as possible with free software:
https://en.wikipedia.org/wiki/MBROLA

still not perfect, but much better than the built-in voices of espeak or
festival (YMMV).


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-10 20:08     ` Christian Thäter
@ 2023-09-11  8:33       ` Jens Lechtenboerger
  2023-09-11  9:14       ` briangpowell
  1 sibling, 0 replies; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-11  8:33 UTC (permalink / raw)
  To: Christian Thäter; +Cc: emacs-orgmode

On 2023-09-10, Christian Thäter wrote:

> On Sun, 10 Sep 2023 16:39:26 +0200
> Jens Lechtenboerger <lechten@wi.uni-muenster.de> wrote:
>
>> On 2023-09-10, Ihor Radchenko wrote:
>> 
>> > Jens Lechtenboerger <lechten@wi.uni-muenster.de> writes:
>> >  
>> >> does someone here produce audio via Text-To-Speech (TTS) from Org
>> >> sources?  I plan to do that in the context of emacs-reveal to
>> >> generate voice-over for reveal.js presentations, with open
>> >> questions [1] concerning my initial, experimental approach.  
>> >
>> > Emacspeak is a mature Emacs solution for TTS. However, it targets blind
>> > users, not presentations. Still,
>> > http://tvraman.github.io/emacspeak/manual/Quick-Installation.html
>> > might be a good starting point for TTS options.
>>
>> Thank you for the suggestion.  With espeak this indeed pronounces
>> numbers and abbreviations but its audio quality is not good enough
>> for my purposes.  I am looking for (near-) human voices...
>
> using mbrola is probably as good as possible with free software:
> https://en.wikipedia.org/wiki/MBROLA
>
> still not perfect, but much better than the built-in voices of espeak or
> festival (YMMV).

This sounds promising.  I’ll check it out.

Many thanks
Jens


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-10 20:08     ` Christian Thäter
  2023-09-11  8:33       ` Jens Lechtenboerger
@ 2023-09-11  9:14       ` briangpowell
  2023-09-11 12:06         ` Jude DaShiell
                           ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: briangpowell @ 2023-09-11  9:14 UTC (permalink / raw)
  To: Christian Thäter; +Cc: emacs-orgmode

* eSpeak seems to focus on a small footprint & a "formant synthesis" method

* Suggest using Festival with MBrola:

https://www.cstr.ed.ac.uk/projects/festival/mbrola.html

https://www.cstr.ed.ac.uk/projects/festival/

and/or just install FestivalLite:

apt-get install -y flite

* Note that Emacspeak (mentioned in another email) is written by Org mode user &
programmer T. V. Raman. I'm not sure Emacspeak will help you at all, but it might
be interesting for you

** Klaus Knopper distributes some very interesting free software that
includes an audio-desktop called ADRIANE that maybe you can look at--I'd
love to hear what you find out if you do:

https://www.knopper.net/knoppix-adriane/index-en.html

** Knopper popularized the "run Linux entirely from a CD-ROM" approach, which
is still very useful in many ways. I suggest you give Knoppix & ADRIANE a look


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11  9:14       ` briangpowell
@ 2023-09-11 12:06         ` Jude DaShiell
  2023-09-11 12:27           ` tomas
  2023-09-11 12:07         ` Jude DaShiell
  2023-09-11 12:31         ` Jens Lechtenboerger
  2 siblings, 1 reply; 19+ messages in thread
From: Jude DaShiell @ 2023-09-11 12:06 UTC (permalink / raw)
  To: briangpowell, Christian Thäter; +Cc: emacs-orgmode

fenrir-screenreader is also available.
https://nashcentral.duckdns.org/projects/Jenux
uses fenrir by default.
I haven't been able to find Klaus Knopper's public key, and none of the
email addresses I found for him seem to work any longer.
You have a chance of getting a good copy of Knoppix if you download
it with a good BitTorrent client, require encryption, and run the
integrity checks.


-- 
Jude <jdashiel at panix dot com>
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo.
Please use in that order."
Ed Howdershelt 1940.


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11  9:14       ` briangpowell
  2023-09-11 12:06         ` Jude DaShiell
@ 2023-09-11 12:07         ` Jude DaShiell
  2023-09-11 12:31         ` Jens Lechtenboerger
  2 siblings, 0 replies; 19+ messages in thread
From: Jude DaShiell @ 2023-09-11 12:07 UTC (permalink / raw)
  To: briangpowell, Christian Thäter; +Cc: emacs-orgmode

espeak-ng is a fork of espeak and can use speechdispatcher.


-- 
Jude <jdashiel at panix dot com>
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo.
Please use in that order."
Ed Howdershelt 1940.


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11 12:06         ` Jude DaShiell
@ 2023-09-11 12:27           ` tomas
  2023-09-11 13:52             ` Jude DaShiell
  2023-09-11 14:48             ` Jude DaShiell
  0 siblings, 2 replies; 19+ messages in thread
From: tomas @ 2023-09-11 12:27 UTC (permalink / raw)
  To: Jude DaShiell; +Cc: briangpowell, Christian Thäter, emacs-orgmode

On Mon, Sep 11, 2023 at 08:06:34AM -0400, Jude DaShiell wrote:
> fenrir-screenreader is also available.
> https://nashcentral.duckdns.org/projects/Jenux
> uses fenrir by default.
> I haven't been able to find Klaus Knopper's public key, and none of the
> email addresses I found for him seem to work any longer.

The knoppix signing keys seem to be around here:

  http://ftp.knoppix.net/wiki/Downloading_FAQ

> You have a chance of getting a good copy of Knoppix if you download
> it with a good BitTorrent client, require encryption, and run the
> integrity checks.

Look here:

  http://knoppix.net/

...and consider buying a CD (yes, that's still a thing ;-) or a stick
to support development.

Cheers
-- 
t


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11  9:14       ` briangpowell
  2023-09-11 12:06         ` Jude DaShiell
  2023-09-11 12:07         ` Jude DaShiell
@ 2023-09-11 12:31         ` Jens Lechtenboerger
  2 siblings, 0 replies; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-11 12:31 UTC (permalink / raw)
  To: briangpowell; +Cc: Christian Thäter, emacs-orgmode

Thank you for the additional pointers!  I still need to check out
promising combinations of those approaches, also options for MBROLA
(which is not free, but applies a custom AGPL-3.0-but-not-be-sold
license to the voices).  ADRIANE is certainly fascinating.

Best wishes
Jens


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11 12:27           ` tomas
@ 2023-09-11 13:52             ` Jude DaShiell
  2023-09-11 16:30               ` tomas
  2023-09-11 14:48             ` Jude DaShiell
  1 sibling, 1 reply; 19+ messages in thread
From: Jude DaShiell @ 2023-09-11 13:52 UTC (permalink / raw)
  To: tomas; +Cc: briangpowell, Christian Thäter, emacs-orgmode

Why does this happen?

The gpg command given in the faq does not return what the faq claims will
be returned.
Look for the public key of Klaus Knopper:
bash: Look: command not found
bash-5.1$  gpg --keyserver pool.sks-keyservers.net --search-keys "Klaus Knopper"
gpg: error searching keyserver: Server indicated a failure
gpg: keyserver search failed: Server indicated a failure
bash-5.1$
bash-5.1$


-- 
Jude <jdashiel at panix dot com>
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo.
Please use in that order."
Ed Howdershelt 1940.


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11 12:27           ` tomas
  2023-09-11 13:52             ` Jude DaShiell
@ 2023-09-11 14:48             ` Jude DaShiell
  1 sibling, 0 replies; 19+ messages in thread
From: Jude DaShiell @ 2023-09-11 14:48 UTC (permalink / raw)
  To: tomas; +Cc: briangpowell, Christian Thäter, emacs-orgmode

fenrir-screenreader is installable with pip though you may need some
support getting it set up and configured.


-- Jude <jdashiel at panix dot com> "There are four boxes to be used in
defense of liberty: soap, ballot, jury, and ammo. Please use in that
order." Ed Howdershelt 1940.


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11 13:52             ` Jude DaShiell
@ 2023-09-11 16:30               ` tomas
  2023-09-11 17:21                 ` Jude DaShiell
  0 siblings, 1 reply; 19+ messages in thread
From: tomas @ 2023-09-11 16:30 UTC (permalink / raw)
  To: Jude DaShiell; +Cc: briangpowell, Christian Thäter, emacs-orgmode

On Mon, Sep 11, 2023 at 09:52:30AM -0400, Jude DaShiell wrote:
> Why does this happen?
> 
> The gpg command given in the faq does not return what the faq claims will
> be returned.
> Look for the public key of Klaus Knopper:
> bash: Look: command not found
> bash-5.1$  gpg --keyserver pool.sks-keyservers.net --search-keys "Klaus Knopper"

Hm. Keyservers seem to be a dying species these days. Try

  --keyserver hkp://keyserver.ubuntu.com/

Cheers
-- 
t


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-11 16:30               ` tomas
@ 2023-09-11 17:21                 ` Jude DaShiell
  0 siblings, 0 replies; 19+ messages in thread
From: Jude DaShiell @ 2023-09-11 17:21 UTC (permalink / raw)
  To: tomas; +Cc: briangpowell, Christian Thäter, emacs-orgmode

Thanks much, that one worked.


-- Jude <jdashiel at panix dot com> "There are four boxes to be used in
defense of liberty: soap, ballot, jury, and ammo. Please use in that
order." Ed Howdershelt 1940.


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-09 18:05 Suggestions for Text-To-Speech (TTS) from Org sources? Jens Lechtenboerger
  2023-09-09 21:20 ` briangpowell
  2023-09-10 10:43 ` Ihor Radchenko
@ 2023-09-28 13:11 ` Jens Lechtenboerger
  2023-09-28 14:16   ` Jude DaShiell
  2 siblings, 1 reply; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-28 13:11 UTC (permalink / raw)
  To: emacs-orgmode

Dear all,

some time ago I asked for suggestions concerning Text-To-Speech
(TTS) from Org sources.  Thank you to everyone who provided
suggestions!  In case you are interested, you can listen to sample
results at [1].

Briefly, Emacspeak, espeak-ng, and festival are not good enough for
my purposes.  Maybe I'm missing relevant backend options.  IMHO,
Coqui-AI TTS [2] and Microsoft SpeechT5 [3] are far superior.

Best wishes
Jens

[1] https://gitlab.com/oer/emacs-reveal/-/wikis/Sample-TTS-results
[2] https://github.com/coqui-ai/TTS/
[3] https://huggingface.co/microsoft/speecht5_tts


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-28 13:11 ` Jens Lechtenboerger
@ 2023-09-28 14:16   ` Jude DaShiell
  2023-09-29  7:56     ` Jens Lechtenboerger
  0 siblings, 1 reply; 19+ messages in thread
From: Jude DaShiell @ 2023-09-28 14:16 UTC (permalink / raw)
  To: Jens Lechtenboerger, emacs-orgmode

espeak-ng likes to have speechdispatcher on a system and festival likes to
have language-specific voices on it to use.
fenrir which you didn't mention runs in user land and has no kernel
dependencies.


-- Jude <jdashiel at panix dot com> "There are four boxes to be used in
defense of liberty: soap, ballot, jury, and ammo. Please use in that
order." Ed Howdershelt 1940.


* Re: Suggestions for Text-To-Speech (TTS) from Org sources?
  2023-09-28 14:16   ` Jude DaShiell
@ 2023-09-29  7:56     ` Jens Lechtenboerger
  0 siblings, 0 replies; 19+ messages in thread
From: Jens Lechtenboerger @ 2023-09-29  7:56 UTC (permalink / raw)
  To: Jude DaShiell; +Cc: emacs-orgmode

On 2023-09-28, Jude DaShiell wrote:

> espeak-ng likes to have speechdispatcher on a system

Does this improve the quality of the generated speech?

> and festival likes to have language-specific voices on it to use.

Indeed.  Which one(s) do you recommend?  I tried
voice_cmu_us_slt_arctic_hts and the mbrola us voices.

> fenrir which you didn't mention runs in user land and has no kernel
> dependencies.

I briefly tried that but failed.  Also, I was under the impression
that fenrir does not produce speech by itself but relies on some TTS
implementation like espeak.  Is that wrong?

Best wishes
Jens

