If you need a speech-to-text service that can handle multiple languages, AssemblyAI is worth a look. It transcribes speech in more than 99 languages and was trained on 12.5 million hours of multilingual audio data. Its features include low-latency streaming speech-to-text, speaker diarization and a range of integration tools for developers. Pricing is tiered and includes a free tier for prototyping, so it's a good choice for AI product makers.
SpeechText is another good option, with support for more than 30 languages and a focus on high accuracy. It uses deep neural network models to recognize speech accurately, even from speakers with non-native accents. The service offers several pricing levels and an API for use in apps, so it can be adapted to journalism, medicine, business and other domains.
If you're looking for something a bit more economical, Vocaldo can transcribe speech quickly and accurately in more than 100 languages. It can also generate automatic summaries, translate text and export files in a variety of formats, which could be useful for content creators and businesses trying to reach a broader audience. Vocaldo also offers security and confidentiality options to protect user data.
Last, Gladia has a powerful transcription API that supports 99 languages, handles code-switching and returns word-level timestamps. Its end-to-end encryption protects data and helps companies comply with privacy regulations. The API is designed to work with a variety of tech stacks, so it's a good fit for content, media and workspace collaboration products.
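To give a feel for what word-level timestamps and code-switching look like in practice, here is a minimal sketch that merges consecutive words in the same language into timed segments. The `Word` record and its field names are hypothetical and chosen for illustration only; they are not the response schema of Gladia or any other service mentioned here, so check the provider's documentation for the real field names.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape of one transcribed word (not a real API schema).
    text: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    language: str  # language code detected for this word


def group_by_language(words):
    """Merge consecutive words that share a language into timed segments."""
    segments = []
    for word in words:
        if segments and segments[-1]["language"] == word.language:
            # Same language as the previous word: extend the current segment.
            segments[-1]["text"] += " " + word.text
            segments[-1]["end"] = word.end
        else:
            # Language changed (code-switch) or first word: start a new segment.
            segments.append({
                "language": word.language,
                "text": word.text,
                "start": word.start,
                "end": word.end,
            })
    return segments


# Made-up sample data: a speaker switching from English to French.
words = [
    Word("Hello", 0.0, 0.4, "en"),
    Word("everyone", 0.4, 0.9, "en"),
    Word("bonjour", 1.0, 1.5, "fr"),
    Word("à", 1.5, 1.6, "fr"),
    Word("tous", 1.6, 2.0, "fr"),
]

segments = group_by_language(words)
for seg in segments:
    print(seg["language"], seg["start"], seg["end"], seg["text"])
```

Segments like these are a convenient starting point for subtitles or per-language translation, since each one carries its own start and end time.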