Now it has happened – AI is better than humans at understanding spoken language and turning it into text.
Until now, humans have been half a percentage point better than AI at transcribing spoken language. Not anymore: an AI solution developed at the Karlsruhe Institute of Technology (KIT) has reached an accuracy of ninety-five percent.
A few years ago, AI had an error rate of fourteen percent, and we can probably expect further improvements.
The task is especially challenging because people pronounce words in different ways. We also fill our speech with sounds that are difficult to decipher: we hum and haw, stop and start over, perhaps from a different angle. Alex Waibel, professor of computer science and one of the people responsible for the study at KIT, points out that unclear speech makes it difficult even for people to understand what is being said. For AI it has been a great challenge, but practice makes perfect. (You can read more about speech recognition here.)
As early as 2012, KIT began offering live translation of the university’s lectures from German and English into the languages spoken by guest students. The system is called the Lecture Translator, and it lets students follow a lecture as it is given. The delay, or latency as KIT calls it, is now one second.
Error rate and latency are measured with an internationally recognized benchmark, the Switchboard benchmark as defined by the US NIST. KIT's solution ranks well on it. With further machine learning, the system is expected to become even more accurate and even faster, which opens up applications in new areas, including outside the university world.
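To make the error-rate figures concrete, here is a minimal sketch of how such a score can be computed. It assumes that "error rate" means word error rate (WER), the usual metric for speech recognition benchmarks: the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The function name and the sample sentences are made up for illustration; this is not KIT's or NIST's scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of four.
print(word_error_rate("can you wait please", "can you weight please"))
```

If accuracy is read as one minus the word error rate, an accuracy of ninety-five percent corresponds to about one wrong word in every twenty.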
We sometimes say “can you wait a second?” The term just took on a new meaning.
Make the future come sooner!