Nvidia Unleashes Parakeet-TDT: Open-Source ASR Model Setting New Standards
Nvidia has released Parakeet-TDT-0.6B-v2, an open-source automatic speech recognition (ASR) model with a reported word error rate of 6.05%. It is available on Hugging Face under a Creative Commons license that permits commercial use, and it can reportedly transcribe an hour of audio in about one second on Nvidia GPUs. Trained on the 120,000-hour Granary dataset, Parakeet-TDT supports punctuation, capitalization, and word-level timestamps, which makes it suitable for a wide range of transcription applications.
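For readers unfamiliar with the two headline figures, the sketch below shows how they are conventionally computed. Word error rate is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript, and the speed claim reduces to audio duration divided by processing time. The function names and sample sentences are illustrative assumptions, not taken from Nvidia's evaluation code, and the only text normalization applied here is lowercasing, which is likely simpler than what a real benchmark uses.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase and split on whitespace; real benchmarks
    # typically apply heavier text normalization before scoring.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def speedup(audio_seconds: float, processing_seconds: float) -> float:
    """How many times faster than real time the transcription ran."""
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    # Hypothetical sentences, chosen only to illustrate the calculation.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.2%}")
    print(f"Speed-up: {speedup(3600, 1):.0f}x real time")
```

On this scale, a 6.05% word error rate corresponds to roughly six word errors per hundred reference words, and one hour of audio transcribed in one second is a speed-up of 3,600 times real time.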
May 6