Anyone who's ever transcribed an audio interview into text knows what a painfully slow process that is. But with the new Otter app, created by a company called AISense, this could become a thing of the past, even when transcribing a complex conversation with several people speaking.
The app, which I tried out at Mobile World Congress in Barcelona, is simple to use: Start it up, and it'll begin turning the conversation around it into text. After a quick setup process, it knows when you are speaking, and it can distinguish between different voices in the conversation. You can also search the transcript for terms that were mentioned in the conversation.
Otter uses AI smarts for automatic speech recognition, speaker separation and identification as well as deep content search — features not commonly seen on similar software solutions. And it actually does its job quite well in a reasonably quiet room — errors are there, obviously, but having an automated transcript right after you're done with an interview is very useful for journalists (you also get the audio recording of the conversation, don't worry).
In the extremely busy environment of Mobile World Congress, however, Otter's transcript was nowhere near the quality level that would make it usable, especially when recording my voice (it did better with the voices it was trained for). This is unsurprising: With dozens of people talking around you and a voice coming out of the speakerphone drowning out your conversation, I don't think any transcription engine would do a very good job. In other words: Do not expect wonders in a noisy environment.
The app is free. A paid version is not currently a priority, though one is expected in the second half of this year. "We want to encourage usage. We want to get a lot of users (...) We want to grow a healthy user base. (They) give us a lot of feedback, give us a lot of good training data so that we can improve our product and the end user experience," Simon Lau, head of product at AISense, told me.
In case you were wondering, I didn't write those sentences down; they come directly from Otter, which was recording our conversation in the chaos of a hallway of Fira Gran Via, the Barcelona venue where Mobile World Congress is held. Check out that entire part of the conversation in the image below.
Otter isn't just for journalists, though. Sam Liang, the CEO and co-founder of AISense, sees numerous potential uses, both for businesses and individuals. These include recording meetings, teleconferencing sessions, sales calls and the like.
An obvious concern is privacy. Lau claims all the data is stored and moved around securely, with no one except the owner having access to it. As for potential abuse — such as eavesdropping on your employees, for example — Lau says the company will make efforts to educate its users on acceptable and unacceptable use.