Google is opening its speech recognition technology to third-party developers through its Cloud Speech API, which is now generally available.
Cloud Speech, which lets developers convert audio to text through a simple-to-use API, was introduced in open beta in summer 2016. It is built on the core technology that powers speech recognition in Google products such as Google Assistant, Google Search, and Google Now.
Thanks to feedback from customers and partners, Google has improved transcription accuracy for long-form audio, and the API now processes data 3x faster than the initial version. In addition, more audio file formats are supported, including WAV, OPUS, and Speex. The API offers context-aware recognition that orients listening according to the scenario, and Google notes that early adopters have primarily used it to control apps and devices with voice search, voice commands, and Interactive Voice Response (IVR). Cloud Speech can operate on a wide range of IoT devices, such as cars, TVs, speakers, phones and PCs.
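To make the context-aware recognition described above more concrete, here is a minimal Python sketch that builds the JSON body for a synchronous recognition request. The field names (`config`, `encoding`, `sampleRateHertz`, `languageCode`, `speechContexts`, `audio.content`) follow the public v1 REST reference as we understand it and should be checked against the current documentation. The helper name `build_recognize_request`, the phrase hints and the silent audio sample are hypothetical and purely illustrative. The sketch only assembles the payload; it does not send it or handle authentication.

```python
import base64
import json


def build_recognize_request(audio_bytes, phrases, sample_rate_hz=16000,
                            language_code="en-US"):
    """Assemble a request body for a speech:recognize call (hypothetical helper).

    `phrases` are hints for words the audio is likely to contain, which is
    how a caller would steer recognition toward a given scenario.
    """
    return {
        "config": {
            # Raw 16-bit PCM, as found in a typical WAV file (assumed here).
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
            "speechContexts": [{"phrases": list(phrases)}],
        },
        # Inline audio is carried as base64-encoded text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


# 160 samples of 16-bit silence stand in for real recorded audio.
sample_audio = b"\x00\x00" * 160
body = build_recognize_request(sample_audio, ["turn on the lights", "play music"])
print(json.dumps(body["config"], indent=2))
```

A voice-controlled device would pass the commands it understands as `phrases`, while a call-centre application might pass product names instead.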
The second most frequent use case is speech analytics, which allows for “real-time insights from call centres.” Some businesses have used this to monitor customer interactions and increase sales. For instance, Houston, Texas-based InteractiveTel is using the Cloud Speech API in solutions that track, monitor and report on dealer-customer interactions by telephone.
Pricing details for the Cloud Speech API are available on the Google Cloud Platform site.