Zero Signup Tools: Free browser tools

SSML Builder

Build valid SSML markup in your browser for Amazon Polly, Alexa, Google, and Azure. Add pauses, prosody, say-as, emphasis, and pronunciation. No signup.

Amazon Polly, Alexa, Google Cloud, and Azure require the <speak> root. Turn it off only when you paste the markup inside an existing <speak> element.

Builder

4 blocks

1. Plain text

2. Emphasis

3. Pause (break)

4. Plain text

<speak>
  Welcome to
  <emphasis level="strong">Acme Support</emphasis>
  <break time="400ms"/>
  How can I help you today?
</speak>
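
As a rough illustration of how the four blocks listed above could turn into this markup, here is a minimal Python sketch. The block dictionaries, the `render` function, and the two-space indent are assumptions made for illustration; they are not the tool's actual code.

```python
from xml.sax.saxutils import escape, quoteattr


def render(blocks, wrap_root=True):
    """Render a list of block dicts as SSML, one block per line.

    The block shape ({"type": ..., "text": ...}) is hypothetical.
    """
    indent = "  " if wrap_root else ""
    lines = []
    for block in blocks:
        kind = block["type"]
        if kind == "text":
            # Plain text is XML-escaped so characters like & stay valid.
            lines.append(indent + escape(block["text"]))
        elif kind == "emphasis":
            level = quoteattr(block["level"])
            text = escape(block["text"])
            lines.append(f"{indent}<emphasis level={level}>{text}</emphasis>")
        elif kind == "break":
            lines.append(f"{indent}<break time={quoteattr(block['time'])}/>")
        else:
            raise ValueError(f"unknown block type: {kind}")
    if wrap_root:
        # Wrap in the <speak> root unless a bare fragment is wanted.
        lines = ["<speak>", *lines, "</speak>"]
    return "\n".join(lines)


blocks = [
    {"type": "text", "text": "Welcome to"},
    {"type": "emphasis", "level": "strong", "text": "Acme Support"},
    {"type": "break", "time": "400ms"},
    {"type": "text", "text": "How can I help you today?"},
]
print(render(blocks))
```

Passing `wrap_root=False` emits the same lines without the `<speak>` wrapper, which matches the case where you paste the fragment inside an existing `<speak>` element.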

Plain text preview

Welcome to Acme Support, How can I help you today?
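
One way to derive a plain text preview like this from the markup is to parse the SSML and keep only the spoken words. The sketch below uses Python's standard `xml.etree.ElementTree`; the `plain_text` function and the choice to show each `<break>` as a comma are assumptions based on the sample above, not a description of how the tool works internally.

```python
import xml.etree.ElementTree as ET


def plain_text(ssml):
    """Flatten SSML to plain text; each <break> becomes a comma (assumed rule)."""
    root = ET.fromstring(ssml)
    parts = []

    def add(text):
        # Keep non-empty text and collapse the indentation whitespace.
        if text and text.strip():
            parts.append(" ".join(text.split()))

    def walk(element):
        if element.tag == "break":
            # Attach a comma to the previous piece of text, if there is one.
            if parts and not parts[-1].endswith(","):
                parts[-1] += ","
            return
        add(element.text)
        for child in element:
            walk(child)
            add(child.tail)  # text that follows the child element

    walk(root)
    return " ".join(parts)


sample = """<speak>
  Welcome to
  <emphasis level="strong">Acme Support</emphasis>
  <break time="400ms"/>
  How can I help you today?
</speak>"""
print(plain_text(sample))
```

This sketch reads the written text of every element, so a `<sub>` element would contribute its written form rather than its alias.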

Speak preview

Browsers do not speak raw SSML, so this reads the plain text above with your system voice. It checks the words and order, not the pauses or prosody. Use your engine console (Polly, Google, Azure) to hear the real SSML.

SSML tags in this builder

These tags belong to the W3C Speech Synthesis Markup Language and are broadly supported by the major engines, though support for individual tags and attribute values can vary by engine and voice. Some engines add their own tags (for example Amazon's effects or Google's voice selection), which you can paste in alongside this output.

Timing and structure

  • <break> inserts a pause by time or strength.
  • <p> and <s> mark paragraphs and sentences.
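
The sketch below shows both forms of `<break>` and a paragraph made of sentences. It is an illustration in Python, not the tool's code: the helper names `break_tag` and `paragraph` are hypothetical, and the time pattern is a simplified check that accepts values such as 500ms or 1s.

```python
import re
from xml.sax.saxutils import escape

# Named strengths from the SSML spec.
STRENGTHS = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}
# Simplified time check: digits, optional decimals, then ms or s.
TIME_RE = re.compile(r"^\d+(\.\d+)?(ms|s)$")


def break_tag(time=None, strength=None):
    """Build a <break/> from an exact time, a named strength, or both."""
    attrs = []
    if time is not None:
        if not TIME_RE.match(time):
            raise ValueError(f"invalid break time: {time!r}")
        attrs.append(f'time="{time}"')
    if strength is not None:
        if strength not in STRENGTHS:
            raise ValueError(f"invalid break strength: {strength!r}")
        attrs.append(f'strength="{strength}"')
    if not attrs:
        return "<break/>"
    return "<break " + " ".join(attrs) + "/>"


def paragraph(sentences):
    """Wrap each sentence in <s> and the whole group in <p>."""
    inner = "".join(f"<s>{escape(sentence)}</s>" for sentence in sentences)
    return f"<p>{inner}</p>"


print(break_tag(time="500ms"))
print(break_tag(strength="x-strong"))
print(paragraph(["Thanks for calling.", "Please hold."]))
```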

Sound and stress

  • <prosody> sets rate, pitch, and volume.
  • <emphasis> stresses a word or phrase.
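
To make the prosody values concrete, here is a small Python sketch that checks rate, pitch, and volume before building the tag. The `prosody` helper and its patterns are assumptions for illustration: they cover the spec keywords plus percentage, semitone, and decibel offsets, and leave out other forms the spec allows (such as pitch in Hz). Individual engines may accept a narrower or wider range.

```python
import re
from xml.sax.saxutils import escape

RATE_WORDS = {"x-slow", "slow", "medium", "fast", "x-fast"}
PITCH_WORDS = {"x-low", "low", "medium", "high", "x-high"}
VOLUME_WORDS = {"silent", "x-soft", "soft", "medium", "loud", "x-loud"}

PERCENT = re.compile(r"^[+-]?\d+(\.\d+)?%$")    # e.g. 80% or +10%
SEMITONE = re.compile(r"^[+-]\d+(\.\d+)?st$")   # e.g. +2st
DECIBEL = re.compile(r"^[+-]\d+(\.\d+)?dB$")    # e.g. +6dB


def prosody(text, rate=None, pitch=None, volume=None):
    """Build a <prosody> element, rejecting values outside this simplified subset."""
    checks = [
        ("rate", rate, RATE_WORDS, (PERCENT,)),
        ("pitch", pitch, PITCH_WORDS, (PERCENT, SEMITONE)),
        ("volume", volume, VOLUME_WORDS, (DECIBEL,)),
    ]
    attrs = []
    for name, value, words, patterns in checks:
        if value is None:
            continue
        if value not in words and not any(p.match(value) for p in patterns):
            raise ValueError(f"{name}={value!r} is not a recognised value")
        attrs.append(f'{name}="{value}"')
    if not attrs:
        raise ValueError("set at least one of rate, pitch, or volume")
    return f"<prosody {' '.join(attrs)}>{escape(text)}</prosody>"


print(prosody("Please listen carefully.", rate="slow", pitch="+2st", volume="+6dB"))
```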

How words are read

  • <say-as> reads numbers, dates, and digits correctly.
  • <sub> speaks an alias instead of the written text.
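
A short Python sketch of both tags follows. The helper names `say_as` and `sub` are hypothetical. The `interpret-as` and `format` values an engine accepts are defined by each engine rather than by the core spec, so treat the values here as examples and check your engine's documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def say_as(text, interpret_as, fmt=None):
    """Build a <say-as> element; fmt is only needed for types such as date."""
    attrs = f"interpret-as={quoteattr(interpret_as)}"
    if fmt is not None:
        attrs += f" format={quoteattr(fmt)}"
    return f"<say-as {attrs}>{escape(text)}</say-as>"


def sub(text, alias):
    """Build a <sub> element that speaks alias in place of the written text."""
    return f"<sub alias={quoteattr(alias)}>{escape(text)}</sub>"


print(say_as("12345", "digits"))
print(say_as("2024-03-05", "date", "ymd"))
print(sub("W3C", "World Wide Web Consortium"))
```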

Pronunciation

  • <phoneme> spells a pronunciation in IPA or X-SAMPA.
  • Text content is XML-escaped automatically so the markup stays valid.
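
The sketch below illustrates both points in Python using the standard `xml.sax.saxutils` module. The `phoneme` helper is hypothetical and the pronunciations are only examples. Escaping matters here in particular because X-SAMPA uses the double quote character as a stress mark, which would otherwise end the attribute early.

```python
from xml.sax.saxutils import escape, quoteattr

ALPHABETS = {"ipa", "x-sampa"}


def phoneme(text, ph, alphabet="ipa"):
    """Build a <phoneme> element with a safely quoted ph attribute."""
    if alphabet not in ALPHABETS:
        raise ValueError(f"unsupported alphabet: {alphabet!r}")
    # quoteattr picks a quote style that keeps the attribute well-formed.
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(text)}</phoneme>"
    )


print(phoneme("pecan", "pɪˈkɑːn"))
print(phoneme("pecan", 'pI"kA:n', alphabet="x-sampa"))
# escape() replaces &, <, and > in text content.
print(escape("Tom & Jerry <3"))
```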

Engine support

  • Amazon Polly and Alexa
  • Google Cloud Text-to-Speech
  • Microsoft Azure Speech
  • IBM Watson Text to Speech

Good to know

  • Keep the <speak> root unless you embed this fragment.
  • Percent, decibel, semitone, and time values follow the SSML spec.
  • Some engines ignore tags they do not support rather than failing.
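
Because an unsupported tag may be skipped silently, it can help to list any tags in your markup that fall outside the core set before you paste it into an engine. The Python sketch below is one way to do that; the `non_core_tags` function and the `CORE_TAGS` set (the tags covered by this builder) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Tags covered by this builder, plus the <speak> root.
CORE_TAGS = {
    "speak", "break", "p", "s", "prosody",
    "emphasis", "say-as", "sub", "phoneme",
}


def non_core_tags(ssml):
    """Return the sorted tag names in ssml that are not in CORE_TAGS.

    ET.fromstring raises ET.ParseError if the markup is not well-formed.
    Prefixed tags would need a namespace declaration to parse, so this
    sketch is only meant for unprefixed tags.
    """
    root = ET.fromstring(ssml)
    return sorted({element.tag for element in root.iter()} - CORE_TAGS)


print(non_core_tags('<speak>Hi<break time="1s"/></speak>'))
print(non_core_tags('<speak><voice name="example">Hi</voice></speak>'))
```

An empty list means every tag is in the core set. It does not guarantee that a given engine or voice supports each one.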

How to use

  1. Start from a sample, or use the Add block buttons to add Plain text, Pause, Prosody, Emphasis, Say-as, Substitution, Phoneme, Paragraph, or Sentence blocks.
  2. Fill in each block: type the spoken text, set a pause time or strength, choose rate, pitch, and volume for prosody, or pick how Say-as should read numbers and dates.
  3. Reorder blocks with Up and Down, remove any you do not need, and watch the SSML output update live. Notices flag empty text or values outside the SSML range.
  4. Keep the "Wrap output in speak root" toggle on for Polly, Alexa, Google, and Azure. Turn it off only when you paste the markup inside an existing <speak> element.
  5. Use Speak preview to hear the plain words with your system voice (it does not render SSML prosody), then click Copy SSML and paste the markup into your text-to-speech engine.
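
The reordering and notices in step 3 can be pictured with a small Python sketch. The block dictionaries and the `move` and `notices` functions are hypothetical stand-ins for the Up and Down buttons and the notices panel, not the tool's implementation.

```python
def move(blocks, index, direction):
    """Return a new list with blocks[index] moved Up (-1) or Down (+1).

    Moves past either end of the list leave the order unchanged.
    """
    if direction not in (-1, 1):
        raise ValueError("direction must be -1 (Up) or +1 (Down)")
    target = index + direction
    out = list(blocks)
    if 0 <= index < len(out) and 0 <= target < len(out):
        out[index], out[target] = out[target], out[index]
    return out


def notices(blocks):
    """List a warning for every block whose text field is empty."""
    found = []
    for number, block in enumerate(blocks, start=1):
        if "text" in block and not block["text"].strip():
            found.append(f"Block {number} ({block['type']}): text is empty")
    return found


blocks = [
    {"type": "text", "text": "Welcome to"},
    {"type": "emphasis", "level": "strong", "text": ""},
    {"type": "break", "time": "400ms"},
]
print(notices(blocks))
print([block["type"] for block in move(blocks, 2, -1)])
```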

About this tool

SSML Builder helps you write Speech Synthesis Markup Language without memorizing the tags or hand-balancing angle brackets. SSML is the W3C standard markup that controls how a text-to-speech engine speaks: where it pauses, how fast and high it talks, which words it stresses, and how it pronounces numbers, dates, abbreviations, and tricky names. Every major engine reads a compatible core of SSML, so the markup this tool produces works in Amazon Polly and Alexa, Google Cloud Text-to-Speech, Microsoft Azure Speech, and IBM Watson Text to Speech. You build a voice line as an ordered list of blocks, and the tool assembles valid, indented SSML live as you edit.

The block types cover the elements people actually reach for. A Plain text block is spoken as written. A Pause block inserts a break, either by exact time (for example 500ms or 1s) or by one of the named strengths from x-weak to x-strong. A Prosody block sets rate, pitch, and volume, accepting both the spec keywords (slow, medium, fast, low, high, soft, loud) and precise values such as a percentage, a decibel offset like +6dB, or a semitone offset like +2st, with inline validation that flags anything the engines would reject. An Emphasis block stresses a phrase at reduced, moderate, or strong level. A Say-as block is the one that fixes the most common complaints: it tells the engine to read a string as a cardinal number, an ordinal, individual digits, a date with a chosen format, a time, a telephone number, a unit, a fraction, or an address, or to spell it out character by character. A Substitution block speaks an alias in place of the written text, which is how you make an acronym or a brand name sound right. A Phoneme block spells out an exact pronunciation in IPA or X-SAMPA. Paragraph and Sentence blocks add structure that some engines use for natural phrasing.

Every block can be reordered or removed, all text content is XML-escaped automatically so the markup stays valid, and a notices panel surfaces empty fields and out-of-range values before you paste the result into your engine. The output can be wrapped in the required <speak> root or emitted as a bare fragment when you are dropping it inside an existing document. A speak preview reads the plain text of your line with the browser's Web Speech API so you can sanity-check the words and their order. Browsers do not render raw SSML, so the preview does not reproduce the pauses or prosody, and the tool says so plainly. Everything runs locally in your browser, so the text and markup you build are never uploaded.

Free to use. Works in your browser. No signup, no login.
