Hi. 😄
We can use this application to speak our text aloud.
To search for a particular speech synthesis voice, we can clear the input first and then type a keyword, for instance english, italiano, google, or microsoft (if we're using Windows). The input sits above the text "Type or click to search voice".
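The keyword search can be pictured as a plain filter over the list of voices. This is only a minimal sketch, not this application's actual code: `filterVoices` is a hypothetical helper, and it assumes each voice has the `name` and `lang` properties that `SpeechSynthesisVoice` objects from `speechSynthesis.getVoices()` carry.

```javascript
// Hypothetical helper: keep the voices whose name or language tag contains the query.
function filterVoices(voices, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [...voices]; // an empty input shows every voice
  return voices.filter((voice) =>
    `${voice.name} ${voice.lang}`.toLowerCase().includes(needle)
  );
}

// Browser-only usage; skipped where speechSynthesis does not exist (e.g. Node).
if (typeof speechSynthesis !== "undefined") {
  console.log(filterVoices(speechSynthesis.getVoices(), "english"));
}
```

In some browsers `getVoices()` returns an empty list until the `voiceschanged` event has fired, so a real page would likely run the filter again after that event.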
This application uses SpeechSynthesis (Web Speech API).
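A minimal sketch of how text can be handed to the Web Speech API follows. The helper name `speak` and its injectable `synth` and `Utterance` parameters are assumptions made for illustration (they make the function usable outside a browser); only `speechSynthesis.speak()` and `SpeechSynthesisUtterance` come from the API itself.

```javascript
// Hypothetical helper: speak `text`, optionally with a chosen voice.
// `synth` and `Utterance` default to the browser globals and can be swapped for stubs.
function speak(
  text,
  voice = null,
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  const utterance = new Utterance(text);
  if (voice) {
    utterance.voice = voice;     // which voice to use
    utterance.lang = voice.lang; // keep the language tag in line with the voice
  }
  synth.speak(utterance); // adds the utterance to the browser's speech queue
  return utterance;
}
```

In a browser, `speak("Hi.")` would use the default voice, and passing one of the entries from `speechSynthesis.getVoices()` as the second argument would select that voice instead.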
About
This demonstration uses the speech voices built into our operating system (Windows or macOS; Linux has no single built-in TTS system across all distros) as well as Google's.
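Each `SpeechSynthesisVoice` carries a `localService` flag, which gives a rough way to tell the voices that run on our own machine from the ones served remotely. The sketch below is an illustration only: `groupVoicesByOrigin` is a hypothetical helper, and which voices report `true` or `false` depends on the browser and platform.

```javascript
// Hypothetical helper: split voice names by the localService flag.
function groupVoicesByOrigin(voices) {
  const groups = { local: [], remote: [] };
  for (const voice of voices) {
    // localService is true when the voice is provided by a local synthesizer.
    (voice.localService ? groups.local : groups.remote).push(voice.name);
  }
  return groups;
}

// Browser-only usage; skipped where speechSynthesis does not exist (e.g. Node).
if (typeof speechSynthesis !== "undefined") {
  console.log(groupVoicesByOrigin(speechSynthesis.getVoices()));
}
```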
Another example is Doctor Ossbita.
We can't "record" the utterance audio output programmatically with JavaScript or a browser extension script because:
- It's not a media stream like the one getUserMedia returns.
- It does not emit audio that JavaScript can capture; it bypasses AudioContext, MediaStream, etc.
- It's a one-way street straight to the system voice.
But we can still capture it with OBS Studio or similar screen/desktop recording software (installed on the operating system, not running as a browser extension script).
A text-to-speech API in general is useful because it lets our application produce speech, which makes it more relatable to us, as is done in ChatGPT, Gemini, Siri, and other digital applications and services.
Text to Speech Service for Content Creation
For content creation, to narrate the text for a video, we can try:
- ElevenLabs: Most popular with creators for its hyper-realistic, expressive voices. Allows voice cloning and emotional intonation. A free tier is available, but a "watermark" (an additional audio overlay) appears unless we pay.
- Play.ht: Known for natural-sounding voices. Web-based, quick to generate, with a decent voice library.
- TikTok/Instagram Native TTS (powered by Amazon Polly or similar): Built-in and very commonly used for convenience.
Sci-Fi
Have you noticed that every sci-fi story has voice feedback from some digital contraption?
Take Star Trek: where did they put the microphones and the loudspeakers? 🤔 Does the entire "ship" consist of grids of alternating microphones and loudspeakers? Well, mayhap. How did the computer tell "computer" from "computer", as in when people are merely chatting about an instruction for the computer instead of actually giving one? There's no off switch for the microphones, I believe.