Text to speech
Learn how to generate speech from text
Alting offers a standardized API for interacting with text-to-speech models. This guide shows you how to generate speech from text using one of our supported models. Our text-to-speech models suit a range of use cases, such as:
- Narrating a written story
- Producing multilingual speech
- Generating real-time speech using streaming
Quickstart
To generate speech from text, send a request to the speech endpoint of the REST API. We recommend either calling the REST API directly with your HTTP client of choice, or using the OpenAI SDK for your language of choice.
Note: MP3 is currently the only supported output format.
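As a rough illustration of the request described above, here is a minimal Python sketch that uses only the standard library. Several details are assumptions rather than documented values: the base URL is a placeholder, the /audio/speech path and the model, input, and voice fields follow the OpenAI speech API convention, the bearer-token header is a common pattern, and "alloy" is an OpenAI voice name that may or may not be available here. Check the API reference for the actual values.

```python
import json
import urllib.request

# Assumptions: placeholder base URL and an OpenAI-style path.
# Replace both with the values from the API reference.
BASE_URL = "https://api.example.com/v1"
SPEECH_PATH = "/audio/speech"


def build_speech_request(api_key, text, model="tts-1", voice="alloy"):
    """Build (but do not send) a POST request for the speech endpoint."""
    # Field names follow the OpenAI speech API; treat them as assumptions.
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        BASE_URL + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(request, out_path="speech.mp3"):
    """Send the request and write the returned MP3 bytes to disk."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Splitting request construction from sending keeps the payload easy to inspect before any network call is made. A typical call would be save_speech(build_speech_request(api_key, "Hello, world")), which writes speech.mp3 to the current directory.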
Choosing a model
When making a text-to-speech request, the first thing you need to decide is which model to use. We currently support the OpenAI TTS-1 and TTS-1-HD models. ElevenLabs models will be available soon.
You can experiment with different models in our Text to Speech app.
Choosing a voice
When generating speech, you can choose from a variety of voices. Each voice is tied to a specific model, so a voice can only be used with the model it belongs to. To get the list of voices a model supports, use the voices endpoint.
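The voice lookup could be sketched along the following lines in Python. The path /audio/voices, the model query parameter, and the bearer-token header are all hypothetical, and the shape of the JSON response is not specified in this guide, so the sketch returns the decoded body as-is rather than assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: placeholder base URL and a hypothetical voices path.
BASE_URL = "https://api.example.com/v1"
VOICES_PATH = "/audio/voices"


def build_voices_request(api_key, model):
    """Build (but do not send) a GET request listing voices for a model."""
    # The "model" query parameter is an assumption for illustration.
    query = urllib.parse.urlencode({"model": model})
    return urllib.request.Request(
        f"{BASE_URL}{VOICES_PATH}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_voices(request):
    """Send the request and return the decoded JSON body unchanged."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because voices are tied to models, it makes sense to query the voices for the chosen model first and pass one of the returned voices in the speech request.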