Outliner Software Forum

voice memos speech recognition / transcription

Posted by jimspoon
Jan 6, 2015 at 01:29 PM

 

Thanks all for your input - I will investigate your suggestions.

What surprises me is that the Google and Nuance recognizers produce such good results when I enter text by voice in real time on my Android phone. So it seems like it should be possible to get equally good results by submitting previously recorded sound files to the same servers or software. I mean quick voice recordings made without even looking at the device, simply by holding down a button or something like that.

There ought to be a dedicated voice recorder device out there that works like that - grab it, hold down the button, wait for the beep, talk, release the button. The device automatically uploads the recorded audio to the servers via 3G/4G, or wifi if available. The text, together with the recorded sound, is available via a smartphone app or web app - instantly searchable. For correcting recognition errors, the interface would need something like the desktop Dragon’s “play this back” function - highlight text, click “Play this Back”, and hear the portion of the audio recording that the recognizer transcribed into the selected text. The data plan for such a device shouldn’t be that expensive, after all - it wouldn’t take that many megabytes to transmit voice recordings to the servers. I think such a device would be a killer product. I’ve looked at the websites of various voice recorder manufacturers (Olympus, Sony) but haven’t found anything like it. Some high-end recorders have some wifi capability.
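To make the “play this back” idea a bit more concrete, here is a rough sketch of how selected text could be mapped back to a slice of the recording. It assumes the recognizer returns a start and end time for every word, which is my assumption and not something I know about any particular service; the names `Word` and `audio_span` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording (assumed recognizer output)
    end: float


def audio_span(words, first, last):
    """Return (start, end) in seconds covering words[first] through words[last]."""
    if not words or first < 0 or last >= len(words) or first > last:
        raise ValueError("selection out of range")
    return words[first].start, words[last].end


transcript = [
    Word("buy", 0.0, 0.3),
    Word("milk", 0.4, 0.8),
    Word("tomorrow", 0.9, 1.5),
]
# Highlighting "milk tomorrow" selects words 1..2, so play back 0.4s to 1.5s.
start, end = audio_span(transcript, 1, 2)
```

A player would then seek to `start` and stop at `end`; how the audio itself gets played is left out here.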

I did find an interesting Android app called Speech to Text Notepad - for me the interesting feature was the ability to delete words by saying, for example, “delete 4” to delete the four words preceding the cursor. This makes it easy to delete mis-recognized words. Just tap at the appropriate location before issuing the command.

https://play.google.com/store/apps/details?id=com.heterioun.HandsFreeNotes
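
For anyone curious, the “delete 4” behaviour could be sketched roughly like this. This is only my guess at the logic, not how the app actually does it; the function name `delete_words_before` is hypothetical, and a word is assumed to be any run of non-whitespace characters.

```python
import re


def delete_words_before(text, cursor, count):
    """Delete `count` words immediately before `cursor`.

    Returns (new_text, new_cursor). A partial word to the left of the
    cursor counts as one word.
    """
    before, after = text[:cursor], text[cursor:]
    # Positions of every whitespace-delimited word left of the cursor.
    spans = [m.span() for m in re.finditer(r"\S+", before)]
    if count <= 0 or not spans:
        return text, cursor
    # Cut at the start of the count-th word back (or the first word, if fewer exist).
    cut = spans[max(0, len(spans) - count)][0]
    kept = before[:cut]
    # Avoid leaving a double space where the deleted words used to be.
    if after[:1].isspace():
        kept = kept.rstrip()
    return kept + after, len(kept)
```

For example, with the cursor right after the duplicated word in "buy milk and and eggs" (position 16), deleting one word gives "buy milk and eggs" with the cursor at position 12.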

The speech recognition built into the Google Keyboard and the Swype keyboard doesn’t recognize many commands. While they properly interpret “new line”, “comma”, “period”, “exclamation point”, and “question mark”, “backspace” and “delete” do not work as they should. They should also recognize and respond appropriately to “insert date” and “insert time” (in configurable formats).
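
Here is a small sketch of what I mean by command handling, assuming the recognizer hands over one utterance at a time as plain text. Everything here (the `apply_utterance` name, the default date and time formats, treating “backspace” as removing one character) is my own assumption, not the behaviour of any existing keyboard; “delete” is left out because what it should remove is a separate question.

```python
from datetime import datetime

# Spoken phrases that turn into a symbol attached to the preceding text.
PUNCTUATION = {
    "new line": "\n",
    "comma": ",",
    "period": ".",
    "exclamation point": "!",
    "question mark": "?",
}


def _append(text, piece):
    """Append `piece`, adding a space unless the text is empty or already ends in one."""
    sep = "" if not text or text.endswith((" ", "\n")) else " "
    return text + sep + piece


def apply_utterance(text, utterance, now=None, date_fmt="%Y-%m-%d", time_fmt="%H:%M"):
    """Return `text` after applying one spoken utterance (a command or dictated words)."""
    now = now or datetime.now()
    cmd = utterance.strip().lower()
    if cmd in PUNCTUATION:
        return text.rstrip(" ") + PUNCTUATION[cmd]
    if cmd == "backspace":
        return text[:-1]
    if cmd == "insert date":
        return _append(text, now.strftime(date_fmt))
    if cmd == "insert time":
        return _append(text, now.strftime(time_fmt))
    # Anything else is treated as ordinary dictated words.
    return _append(text, utterance.strip())
```

The `date_fmt` and `time_fmt` parameters stand in for the “configurable formats”; they use the standard `strftime` codes.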

Has anyone seen anything like this, for Android or iOS?