Dictate your documents in Word
Dictation lets you use speech-to-text to author content in Microsoft 365 with a microphone and reliable internet connection. It's a quick and easy way to get your thoughts out, create drafts or outlines, and capture notes.
Start speaking to see text appear on the screen.
The dictation feature is only available to Microsoft 365 subscribers.
How to use dictation
Tip: You can also start dictation with the keyboard shortcut: ⌥ (Option) + F1.
Learn more about using dictation in Word on the web and mobile
Dictate your documents in Word for the web
Dictate your documents in Word Mobile
What can I say?
In addition to dictating your content, you can speak commands to add punctuation, navigate around the page, and enter special characters.
You can see the commands in any supported language by going to Available languages. These are the commands for English.
Punctuation
.
,
?
!
new line
's
:
;
" "
-
...
' '
( )
[ ]
{ }
Navigation and Selection
Creating lists
Adding comments
Dictation commands
*
\
/
|
`
_
—
–
¶
§
&
@
©
®
°
^
Mathematics
%
#
+
-
x
±
÷
=
< >
Currency
$
£
€
¥
Emoji/faces
:)
:(
;)
<3
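As a rough illustration of how spoken commands can turn into symbols, the sketch below post-processes a transcript with a small lookup table. The spoken phrases (`period`, `comma`, `question mark`, and so on) and the `apply_commands` helper are assumptions made for this example; they are not taken from Word's implementation or from the command tables above.

```python
import re

# Hypothetical phrase-to-symbol table (assumed phrases, for illustration only).
COMMANDS = {
    "question mark": "?",
    "exclamation mark": "!",
    "new line": "\n",
    "period": ".",
    "comma": ",",
}


def apply_commands(transcript: str) -> str:
    """Replace spoken command phrases in a transcript with their symbols."""
    text = transcript
    # Handle longer phrases first so multi-word commands are matched whole.
    for phrase in sorted(COMMANDS, key=len, reverse=True):
        symbol = COMMANDS[phrase]
        # Swallow whitespace before the phrase so the symbol attaches
        # to the preceding word.
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, lambda m, s=symbol: s, text, flags=re.IGNORECASE)
    # Text that follows a line break should not start with a space.
    return re.sub(r"\n +", "\n", text)


print(apply_commands("hello comma how are you question mark"))
```

A plain lookup like this cannot tell a command from an ordinary use of the same word (for example "period" in "a long period"); a real dictation system would need context to make that decision.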
Available languages
Select from the list below to see commands available in each of the supported languages.
Arabic (Bahrain)
Arabic (Egypt)
Arabic (Saudi Arabia)
Croatian (Croatia)
Gujarati (India)
Hebrew (Israel)
Hungarian (Hungary)
Irish (Ireland)
Marathi (India)
Polish (Poland)
Romanian (Romania)
Russian (Russia)
Slovenian (Slovenia)
Tamil (India)
Telugu (India)
Thai (Thailand)
Vietnamese (Vietnam)
More Information
Spoken languages supported
By default, Dictation is set to your document language in Microsoft 365.
We are actively working to improve these languages and add more locales and languages.
Supported Languages
Chinese (China)
English (Australia)
English (Canada)
English (India)
English (United Kingdom)
English (United States)
French (Canada)
French (France)
German (Germany)
Italian (Italy)
Portuguese (Brazil)
Spanish (Spain)
Spanish (Mexico)
Preview languages *
Chinese (Traditional, Hong Kong)
Chinese (Taiwan)
Dutch (Netherlands)
English (New Zealand)
Norwegian (Bokmål)
Portuguese (Portugal)
Swedish (Sweden)
Turkish (Turkey)
* Preview Languages may have lower accuracy or limited punctuation support.
Dictation settings
Click on the gear icon to see the available settings.
Spoken Language: View and change languages in the drop-down
Microphone: View and change your microphone
Auto Punctuation: Toggle the checkmark on or off, if it's available for the language chosen
Profanity filter: Mask potentially sensitive phrases with ***
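To make the profanity filter setting concrete, here is a minimal sketch of masking flagged words with `***`. The `mask_phrases` helper and the caller-supplied word list are hypothetical; how Dictation actually decides which phrases are sensitive is not described here.

```python
import re


def mask_phrases(text: str, blocked: set[str], mask: str = "***") -> str:
    """Replace each blocked word (whole words, case-insensitive) with the mask."""
    if not blocked:
        return text
    # Longest entries first so overlapping entries resolve to the longer match.
    alternatives = "|".join(
        re.escape(word) for word in sorted(blocked, key=len, reverse=True)
    )
    pattern = r"\b(" + alternatives + r")\b"
    return re.sub(pattern, lambda m: mask, text, flags=re.IGNORECASE)


print(mask_phrases("That darn printer jammed again", {"darn"}))
```

The word-boundary anchors keep the filter from masking parts of longer, unrelated words.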
Tips for using Dictation
Saying “delete” by itself removes the last word or punctuation before the cursor.
Saying “delete that” removes the last spoken utterance.
You can bold, italicize, underline, or strike through a word or phrase. For example, dictate “review by tomorrow at 5PM”, then say “bold tomorrow”; the word “tomorrow” in that phrase is then formatted in bold.
Try phrases like “bold last word” or “underline last sentence.”
Saying “add comment look at this tomorrow” will insert a new comment with the text “Look at this tomorrow” inside it.
Saying “add comment” by itself will create a blank comment box where you can type a comment.
To resume dictation, please use the keyboard shortcut ALT + ` or press the Mic icon in the floating dictation menu.
Markings may appear under words with alternates we may have misheard.
If the marked word is already correct, you can select Ignore.
This service does not store your audio data or transcribed text.
Your speech utterances will be sent to Microsoft and used only to provide you with text results.
For more information about experiences that analyze your content, see Connected Experiences in Microsoft 365.
Troubleshooting
Can't find the dictate button
If you can't see the button to start dictation:
Make sure you're signed in with an active Microsoft 365 subscription
Dictate is not available in Office 2016 or 2019 for Windows without Microsoft 365
Make sure you have Windows 10 or above
Dictate button is grayed out
If you see the dictate button is grayed out:
Make sure the document is not in a Read-Only state.
Microphone doesn't have access
If you see "We don’t have access to your microphone":
Make sure no other application or web page is using the microphone and try again
Refresh, click on Dictate, and give permission for the browser to access the microphone
Microphone isn't working
If you see "There is a problem with your microphone" or "We can’t detect your microphone":
Make sure the microphone is plugged in
Test the microphone to make sure it's working
Check the microphone settings in Control Panel
Also see How to set up and test microphones in Windows
On a Surface running Windows 10: Adjust microphone settings
Dictation can't hear you
If you see "Dictation can't hear you" or if nothing appears on the screen as you dictate:
Make sure your microphone is not muted
Adjust the input level of your microphone
Move to a quieter location
If using a built-in mic, consider trying again with a headset or external mic
Accuracy issues or missed words
If you see a lot of incorrect words being output or missed words:
Make sure you're on a fast and reliable internet connection
Avoid or eliminate background noise that may interfere with your voice
Try speaking more deliberately
Check to see if the microphone you are using needs to be upgraded
What is the Speech service?
The Speech service provides speech to text and text to speech capabilities with a Speech resource. You can transcribe speech to text with high accuracy, produce natural-sounding text to speech voices, translate spoken audio, and use speaker recognition during conversations.
Create custom voices, add specific words to your base vocabulary, or build your own models. Run Speech anywhere, in the cloud or at the edge in containers. It's easy to speech-enable your applications, tools, and devices with the Speech CLI, Speech SDK, and REST APIs.
Speech is available for many languages, regions, and price points.
Speech scenarios
Common scenarios for speech include:
- Captioning: Learn how to synchronize captions with your input audio, apply profanity filters, get partial results, apply customizations, and identify spoken languages for multilingual scenarios.
- Audio Content Creation: You can use neural voices to make interactions with chatbots and voice assistants more natural and engaging, convert digital texts such as e-books into audiobooks, and enhance in-car navigation systems.
- Call Center: Transcribe calls in real time or process a batch of calls, redact personally identifying information, and extract insights such as sentiment to help with your call center use case.
- Language learning: Provide pronunciation assessment feedback to language learners, support real-time transcription for remote learning conversations, and read aloud teaching materials with neural voices.
- Voice assistants: Create natural, human-like conversational interfaces for your applications and experiences. The voice assistant feature provides fast, reliable interaction between a device and an assistant implementation.
Microsoft uses Speech for many scenarios, such as captioning in Teams, dictation in Office 365, and Read Aloud in the Microsoft Edge browser.
Speech capabilities
These sections summarize Speech features with links for more information.
Speech to text
Use speech to text to transcribe audio into text, either in real time or asynchronously with batch transcription.
You can try real-time speech to text in Speech Studio without signing up or writing any code.
Convert audio to text from a range of sources, including microphones, audio files, and blob storage. Use speaker diarization to determine who said what and when. Get readable transcripts with automatic formatting and punctuation.
The base model might not be sufficient if the audio contains ambient noise or includes a lot of industry- and domain-specific jargon. In these cases, you can create and train custom speech models with acoustic, language, and pronunciation data. Custom speech models are private and can offer a competitive advantage.
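Speaker diarization answers "who said what and when". The sketch below shows one way to turn diarized segments into a readable transcript by merging consecutive segments from the same speaker. The `Segment` shape, the `Guest-1` style speaker labels, and the `format_transcript` helper are assumptions for illustration; they do not mirror the result objects the Speech service returns.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # speaker label (hypothetical format, e.g. "Guest-1")
    start: float  # offset from the start of the audio, in seconds
    text: str     # recognized text for this segment


def format_transcript(segments: list[Segment]) -> str:
    """Merge consecutive segments by the same speaker; render one line per turn."""
    turns: list[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1].speaker == seg.speaker:
            # Same speaker keeps talking: extend the current turn.
            previous = turns[-1]
            turns[-1] = Segment(previous.speaker, previous.start,
                                previous.text + " " + seg.text)
        else:
            turns.append(seg)
    return "\n".join(f"[{t.start:.1f}s] {t.speaker}: {t.text}" for t in turns)


print(format_transcript([
    Segment("Guest-1", 0.0, "Hello."),
    Segment("Guest-1", 1.5, "Can you hear me?"),
    Segment("Guest-2", 3.2, "Yes."),
]))
```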
Real-time speech to text
With real-time speech to text, the audio is transcribed as speech is recognized from a microphone or file. Use real-time speech to text for applications that need to transcribe audio in real time, such as:
- Transcriptions, captions, or subtitles for live meetings
- Diarization
- Pronunciation assessment
- Contact center agent assist
- Voice agents
Fast transcription API (Preview)
Fast transcription API is used to transcribe audio files, returning results synchronously and much faster than real-time audio. Use fast transcription in scenarios where you need the transcript of an audio recording as quickly as possible with predictable latency, such as:
- Quick audio or video transcription, subtitles, and editing
- Video translation
Fast transcription API is only available via the speech to text REST API version 2024-05-15-preview.
To get started with fast transcription, see Use the fast transcription API (preview).
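As a hedged sketch of what a fast transcription call might look like, the helper below assembles the request pieces without sending anything. Only the API version comes from the text above. The URL path, the `Ocp-Apim-Subscription-Key` header, and the `locales` field reflect the preview REST documentation as best understood and should be checked against the current reference; `build_fast_transcription_request` is a hypothetical helper name.

```python
import json

API_VERSION = "2024-05-15-preview"  # version named in the text above


def build_fast_transcription_request(region: str, key: str,
                                     locales: list[str]) -> dict:
    """Return URL, headers, and JSON definition for a request; nothing is sent."""
    # Assumed endpoint shape for the preview API; verify before use.
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/speechtotext/transcriptions:transcribe?api-version={API_VERSION}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Accept": "application/json",
    }
    # The definition describes how the uploaded audio should be transcribed.
    definition = json.dumps({"locales": locales})
    return {"url": url, "headers": headers, "definition": definition}


request = build_fast_transcription_request("eastus", "YOUR_KEY", ["en-US"])
print(request["url"])
```

In an actual call, the audio file and the definition would be posted together as multipart form data, and the transcript would come back in the same response, which is what makes the API synchronous.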
Batch transcription
Batch transcription is used to transcribe a large amount of audio in storage. You can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcription results. Use batch transcription for applications that need to transcribe audio in bulk such as:
- Transcriptions, captions, or subtitles for prerecorded audio
- Contact center post-call analytics
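Batch transcription is asynchronous: you submit a job that points at stored audio, then collect results later. The sketch below builds the JSON body for such a job. The field names (`contentUrls`, `locale`, `displayName`) follow the batch transcription REST API as best understood; treat them, the placeholder SAS URI, and the `build_batch_transcription_body` helper as assumptions to verify against the API reference.

```python
import json


def build_batch_transcription_body(sas_uris: list[str], locale: str,
                                   display_name: str) -> str:
    """Serialize a transcription job that points at audio files by SAS URI."""
    if not sas_uris:
        raise ValueError("at least one audio URI is required")
    body = {
        "contentUrls": list(sas_uris),  # one SAS URI per audio file
        "locale": locale,               # language of the recordings
        "displayName": display_name,    # label used to find the job later
    }
    return json.dumps(body, indent=2)


print(build_batch_transcription_body(
    ["https://example.blob.core.windows.net/audio/call1.wav?sv=PLACEHOLDER"],
    "en-US",
    "Weekly support calls",
))
```

The body would be posted to the transcriptions endpoint; the service then processes the files in the background and the client polls the job until the results are available.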
Text to speech
With text to speech, you can convert input text into human-like synthesized speech. Use neural voices, which are human-like voices powered by deep neural networks. Use the Speech Synthesis Markup Language (SSML) to fine-tune the pitch, pronunciation, speaking rate, volume, and more.
- Prebuilt neural voice: Highly natural out-of-the-box voices. Check the prebuilt neural voice samples in the Voice Gallery and determine the right voice for your business needs.
- Custom neural voice: Besides the prebuilt neural voices that come out of the box, you can also create a custom neural voice that is recognizable and unique to your brand or product. Custom neural voices are private and can offer a competitive advantage. Check the custom neural voice samples.
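To show how SSML expresses pitch, speaking rate, and volume, here is a minimal sketch that builds an SSML document as a string. The `speak`, `voice`, and `prosody` elements are standard SSML; the `build_ssml` helper and the voice name `en-US-JennyNeural` are only examples, so substitute a voice available to your resource.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice: str, rate: str = "default",
               pitch: str = "default", volume: str = "default",
               lang: str = "en-US") -> str:
    """Wrap text in speak/voice/prosody elements, escaping XML special characters."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)} "
        f"volume={quoteattr(volume)}>"
        f"{escape(text)}"
        "</prosody></voice></speak>"
    )


# Speak slightly faster than the default rate.
print(build_ssml("Tom & Jerry are on at five.", "en-US-JennyNeural", rate="+10%"))
```

Escaping the text matters because characters such as `&` and `<` would otherwise produce invalid XML and the synthesis request would be rejected.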
Speech translation
Speech translation enables real-time, multilingual translation of speech to your applications, tools, and devices. Use this feature for speech to speech and speech to text translation.
Language identification
Language identification is used to identify languages spoken in audio when compared against a list of supported languages. Use language identification by itself, with speech to text recognition, or with speech translation.
Speaker recognition
Speaker recognition provides algorithms that verify and identify speakers by their unique voice characteristics. Speaker recognition is used to answer the question, "Who is speaking?".
Pronunciation assessment
Pronunciation assessment evaluates speech pronunciation and gives speakers feedback on the accuracy and fluency of spoken audio. With pronunciation assessment, language learners can practice, get instant feedback, and improve their pronunciation so that they can speak and present with confidence.
Intent recognition
Intent recognition: Use speech to text with conversational language understanding to derive user intents from transcribed speech and act on voice commands.
Delivery and presence
You can deploy Azure AI Speech features in the cloud or on-premises.
With containers, you can bring the service closer to your data for compliance, security, or other operational reasons.
Speech service deployment in sovereign clouds is available for some government entities and their partners. For example, the Azure Government cloud is available to US government entities and their partners. Microsoft Azure operated by 21Vianet cloud is available to organizations with a business presence in China. For more information, see sovereign clouds.
Use Speech in your application
The Speech Studio is a set of UI-based tools for building and integrating features from Azure AI Speech service in your applications. You create projects in Speech Studio by using a no-code approach, and then reference those assets in your applications by using the Speech SDK, the Speech CLI, or the REST APIs.
The Speech CLI is a command-line tool for using Speech service without having to write any code. Most features in the Speech SDK are available in the Speech CLI, and some advanced features and customizations are simplified in the Speech CLI.
The Speech SDK exposes many of the Speech service capabilities you can use to develop speech-enabled applications. The Speech SDK is available in many programming languages and across all platforms.
In some cases, you can't or shouldn't use the Speech SDK. In those cases, you can use REST APIs to access the Speech service. For example, use REST APIs for batch transcription and speaker recognition.
Get started
We offer quickstarts in many popular programming languages. Each quickstart is designed to teach you basic design patterns and have you running code in less than 10 minutes. See the following list for the quickstart for each feature:
- Speech to text quickstart
- Text to speech quickstart
- Speech translation quickstart
Code samples
Sample code for the Speech service is available on GitHub. These samples cover common scenarios like reading audio from a file or stream, continuous and single-shot recognition, and working with custom models. Use these links to view SDK and REST samples:
- Speech to text, text to speech, and speech translation samples (SDK)
- Batch transcription samples (REST)
- Text to speech samples (REST)
- Voice assistant samples (SDK)
Responsible AI
An AI system includes not only the technology, but also the people who use it, the people who are affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.
- Transparency note and use cases
- Characteristics and limitations
- Integration and responsible use
- Data, privacy, and security
Pronunciation Assessment
Custom neural voice
- Limited access
- Responsible deployment of synthetic speech
- Disclosure of voice talent
- Disclosure of design guidelines
- Disclosure of design patterns
- Code of conduct
Speaker Recognition
- General guidelines
- Get started with speech to text
- Get started with text to speech
Azure AI Speech
A managed service offering industry-leading speech capabilities such as speech-to-text, text-to-speech, speech translation, and speaker recognition.
Quickly develop high-quality voice-enabled apps
Build voice-enabled generative AI apps confidently and quickly with Azure AI Speech. Transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and use speaker recognition during conversations. Build faster with pre-built and customizable AI models in Azure AI Studio.
Industry-leading quality
Get state-of-the-art speech to text, lifelike text to speech, and award-winning speaker recognition.
Compliant and secure
Your data stays yours—your speech input is not logged during processing.
Customizable voices and models
Create custom voices, add specific words to your base vocabulary, or build your own models.
Flexible deployment
Run Speech anywhere, in the cloud or at the edge in containers.
Convert speech to text
Quickly and accurately transcribe audio in more than 100 languages and variants. Gain customer insights with call center transcription, improve experiences with voice-enabled assistants, capture key discussions in meetings and more.
Give your app a voice
Use text to speech to create apps and services that speak conversationally. Create natural-sounding audio content, improve accessibility with read-aloud functionality, and create custom voice assistants.
Translate speech in real time
Translate audio from more than 30 languages and customize translations for your organization's specific terms—all in your preferred programming language.
Verify and recognize speakers
Confirm a person's identity or recognize who's speaking in a meeting by adding speaker verification and identification to your app.
Activate your assistant or IoT device with a custom keyword
Create a custom keyword for IoT devices and voice-enabled assistants to set your brand apart—making it more personal, personable, and secure.
Add voice commands for hands-free scenarios
Build a touchless, voice-first experience to improve safety and support back-to-work scenarios.
Comprehensive security and compliance, built in
Microsoft invests more than USD1 billion annually on cybersecurity research and development.
We employ more than 3,500 security experts who are dedicated to data security and privacy.
Flexible pricing gives you the power and control you need
Pay for only what you use, with no upfront costs. With Speech, pay as you go based on:
- The number of hours of audio you transcribe or translate for speech to text and speech translation.
- The number of characters you convert to audio for text to speech
- The number of transactions for Speaker Recognition
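The three billing units above can be combined into a simple estimate. The sketch below is arithmetic only: the `Rates` values are made-up placeholders rather than Azure prices, and the per-million-character and per-thousand-transaction granularity is an assumption, so take real rates and units from the pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices; replace with values from the pricing page."""
    per_audio_hour: float             # speech to text and speech translation
    per_million_characters: float     # text to speech
    per_thousand_transactions: float  # speaker recognition


def estimate_monthly_cost(rates: Rates, audio_hours: float,
                          characters: int, transactions: int) -> float:
    """Sum the usage-based charges for one month, rounded to cents."""
    total = (
        audio_hours * rates.per_audio_hour
        + characters / 1_000_000 * rates.per_million_characters
        + transactions / 1_000 * rates.per_thousand_transactions
    )
    return round(total, 2)


example_rates = Rates(per_audio_hour=1.0, per_million_characters=16.0,
                      per_thousand_transactions=5.0)
print(estimate_monthly_cost(example_rates, audio_hours=10,
                            characters=500_000, transactions=2_000))
```

This ignores free monthly amounts and tiered or commitment pricing, which would lower the real figure.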
Get started with an Azure free account
Start free. Get USD200 credit to use within 30 days. While you have your credit, get free amounts of many of our most popular services, plus free amounts of 55+ other services that are always free.
After your credit, move to pay as you go to keep building with the same free services. Pay only if you use more than your free monthly amounts.
Trusted by companies of all sizes
AT&T delights customers with immersive experiences
AT&T is showcasing its 5G network with an immersive experience that allows customers to talk directly to Bugs Bunny.*
*LOONEY TUNES and all related characters and elements © & ™ Warner Bros. Entertainment Inc. (s21)
Progressive brings Flo directly to customers
Progressive used Custom Neural Voice to build a natural-sounding, virtual version of Flo to help customers with everything from getting a free car insurance quote to general insurance questions.
KPMG streamlines call transcription
KPMG uses Speech to Text to transcribe and catalog thousands of calls, reducing compliance costs for its clients by as much as 80 percent.
Motorola helps first responders access vital data
Motorola Solutions helps first responders in the field access vital information with a voice-first virtual assistant.
Speech documentation and resources
Get started with AI Speech
Browse the documentation
Take the Microsoft Learn Speech course
Explore popular developer resources
Check out our sample code and SDKs
Build speech models quickly with Speech Studio
Stack Overflow