Simultaneous Audio playback & Speech recognition

edited May 2016

I've started working on an application that will play podcasts. While listening I want to make notes and mark certain places for later use. I never listen to podcasts when I have free time, mostly in the car on the way to work or home. That's why I want to control podcast playback and notes by talking to the iPhone: PAUSE, PLAY, MARK PREVIOUS THOUGHT/SENTENCE, MARK CURRENT THOUGHT/SENTENCE, PLAY NEXT, SKIP THIS...
I have the podcasts' original text, and I tagged it with timestamps using pocketsphinx. Now I only need offline live speech recognition during playback.

I started with "OpenEars", which uses pocketsphinx and provides a simple iOS SDK, but their forum says: "It’s necessary to suspend recognition during audio playback. Unfortunately, speech recognition isn’t going to be able to take in two audio sources of information simultaneously and give good results."
Actually, simultaneous playback and speech recognition does work, but only when I use a headset. When the podcast plays through the iPhone speaker (iPhone mic as input) or through the car's audio jack (Bluetooth call as input), it doesn't work properly: the speech recognition engine starts recognizing my playback.
I play the podcast with AVAudioPlayer; the OpenEars sources are unavailable.
I also tried recording with AVAudioRecorder, without speech recognition, and the recording contains both my voice and the podcast.

I want to dig into pocketsphinx myself and create an iOS API for it, but first I need to solve a "simple" task: play audio through the iPhone speaker and record from the mic at the same time, capturing just my voice without the playback audio.
Could you please tell me whether this is possible at all and point me in a clear direction, ideally with concrete examples?

Thanks in advance.


  • Take a look at the VoiceProcessing IO audio unit (available in TAAE1 with the voiceProcessingEnabled flag). It's not perfect, but it can work okay in certain circumstances.
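
    If you'd rather not go through TAAE, here is a rough sketch of creating the Voice-Processing I/O unit directly with Core Audio. It only covers the setup: the function name makeVoiceProcessingUnit and the VoiceIOError type are made up for illustration, the session options are my assumption, and the setCategory(_:mode:options:) call needs a newer SDK than the one current when this thread was written. How well it suppresses AVAudioPlayer output on the speaker or over a car's Bluetooth route is something you would have to test on a device.

```swift
import AudioToolbox
import AVFoundation

// Hypothetical error type for this sketch.
enum VoiceIOError: Error {
    case componentNotFound
    case osStatus(OSStatus)
}

/// Creates a Voice-Processing I/O unit with microphone input enabled.
/// The unit is neither initialized nor started here.
func makeVoiceProcessingUnit() throws -> AudioUnit {
    // Playback and recording have to share one play-and-record session.
    // The .voiceChat mode and .defaultToSpeaker option are assumptions.
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker])
    try session.setActive(true)

    // Describe Apple's Voice-Processing I/O unit.
    var desc = AudioComponentDescription(
        componentType: kAudioUnitType_Output,
        componentSubType: kAudioUnitSubType_VoiceProcessingIO,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0,
        componentFlagsMask: 0)

    guard let component = AudioComponentFindNext(nil, &desc) else {
        throw VoiceIOError.componentNotFound
    }

    var unit: AudioUnit?
    var status = AudioComponentInstanceNew(component, &unit)
    guard status == noErr, let voiceUnit = unit else {
        throw VoiceIOError.osStatus(status)
    }

    // Explicitly enable input on bus 1, the microphone element.
    var enable: UInt32 = 1
    status = AudioUnitSetProperty(voiceUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input,
                                  1,
                                  &enable,
                                  UInt32(MemoryLayout<UInt32>.size))
    guard status == noErr else {
        throw VoiceIOError.osStatus(status)
    }

    return voiceUnit
}
```

    From there you would still set the stream formats, install an input callback that hands the processed mic samples to the recognizer, and then call AudioUnitInitialize and AudioOutputUnitStart.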

