NVIDIA
Explore
Models
Blueprints
GPUs
Docs
⌘KCtrl+K
DiscoverModelsBlueprintsGPUsDocsForums

workstations

  • Run on RTX
  • Run on Spark
  • Run on Station

models

  • Reasoning
  • Vision
  • Visual Design
  • Retrieval
  • Speech
  • Biology
  • Simulation
  • Climate & Weather
  • Safety & Moderation

industries

  • Automotive
  • Financial Services
  • Gaming
  • Healthcare
  • Industrial
  • Robotics

Speech

Terms of Use
Privacy Policy
Your Privacy Choices
Contact

Copyright © 2026 NVIDIA Corporation

Deploy Models Now with NVIDIA NIM

Optimized inference for the world’s leading models
Free serverless APIs for development
Accelerated by DGX Cloud
Self-Host on your GPU infrastructure
Continuous vulnerability fixes

Automatic Speech Recognition (ASR)

Low Latency NVIDIA Nemotron Speech transcription models for your agentic AI workflows.

NVIDIA
Downloadable

nemotron-asr-streaming

Real-time speech recognition for English
Automatic Speech Recognition
1mo
NVIDIA
Downloadable

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages
Automatic Speech Recognition
12mo
NVIDIA
Downloadable

parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Mandarin Taiwanese English transcriptions.
ASR
6mo
NVIDIA
Downloadable

parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps
ASR
9mo
NVIDIA
Downloadable

parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.
ASR
10mo
NVIDIA
Downloadable

parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.
ASR
10mo
NVIDIA
Downloadable

canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.
Automatic Speech Recognition
1y
OpenAI
Downloadable

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.
ASR
1y
NVIDIA
Downloadable

parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.
ASR
7mo
NVIDIA
Downloadable

parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin English transcriptions.
ASR
7mo
NVIDIA
Downloadable

parakeet-ctc-0.6b-es

Accurate and optimized Spanish English transcriptions with punctuation and word timestamps.
ASR
7mo

Speech to Speech (S2S)

Ultra-low latency, end-to-end, full duplex models for real-time voice-to-voice interactions.

NVIDIA
Free Endpoint

nemotron-voicechat

Nemotron 3 Voicechat
English
1mo

Convert Text to Speech (TTS)

Convert written text to spoken audio in multiple languages with NVIDIA Nemotron Speech models.

NVIDIA
Downloadable

magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.
NVIDIA NIM
10mo
NVIDIA
Free Endpoint

magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.
NVIDIA NIM
10mo
NVIDIA
DeprecatedFree Endpoint

magpie-tts-flow

Expressive and engaging text-to-speech, generated from a short audio sample.
NVIDIA NIM
9mo

Neural Machine Translation (NMT) & Audio Speech Translation (AST)

Enable seamless multilingual global communication across dozens of languages with NVIDIA Nemotron Speech models.

NVIDIA
Free Endpoint

riva-translate-4b-instruct-v1_1

Translation model in 12 languages with few-shots example prompts capability.
neural machine translation
4mo
NVIDIA
Downloadable

riva-translate-1.6b

Enable smooth global interactions in 36 languages.
NVIDIA NIM
10mo
NVIDIA
Downloadable

canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.
Automatic Speech Recognition
1y
OpenAI
Downloadable

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.
ASR
1y