Old 07-20-2007, 11:59 AM   #4
kraft
A hardware solution is interesting indeed: it targets the task directly, but it loses scalability, whereas a software solution is more flexible.

Sphinx2 really can do voice recognition without any prior training; in other words, it works out of the box, at least for English and Spanish. This means it does not simply match the waveform of your voice, as many competitors do, but actually analyzes the content of the sentence.
You can find more information about the Sphinx project at http://cmusphinx.sourceforge.net/

There is another axis that can be combined with voice recognition: lip reading. Together this is known as multimodal recognition. I know there were some experiments with Sphinx, but I saw this one or two years ago; I don't know whether progress has been made since, though probably yes.
The advantage of this technique is that it allows far better recognition, because it compares what has been recognized from the sound with what has been recognized from lip reading.
It works with models of the mouth's shape, represented as vectors. A camera watches you and performs its own recognition.
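
To give a rough idea of how the two results can be compared, here is a minimal Python sketch that combines the word scores from an audio recognizer with those from a lip-reading recognizer. Everything in it is an assumption made for illustration: the function name `fuse_hypotheses`, the `audio_weight` parameter, and the probabilities are invented, and this is not how Sphinx or any particular multimodal system does it.

```python
import math

def fuse_hypotheses(audio_scores, visual_scores, audio_weight=0.6):
    """Combine per-word log-probabilities from two recognizers.

    Hypothetical helper: a weighted sum of log-probabilities,
    one simple way to merge two independent guesses.
    """
    # Penalty used when one recognizer did not propose a word at all.
    floor = math.log(1e-6)
    fused = {}
    for word in set(audio_scores) | set(visual_scores):
        a = audio_scores.get(word, floor)
        v = visual_scores.get(word, floor)
        fused[word] = audio_weight * a + (1.0 - audio_weight) * v
    # The word with the highest combined score wins.
    best = max(fused, key=fused.get)
    return best, fused

# Invented numbers: in noise, "night" and "might" sound alike,
# but the lips close for "m" and stay open for "n".
audio = {"night": math.log(0.50), "might": math.log(0.45), "light": math.log(0.05)}
visual = {"might": math.log(0.80), "night": math.log(0.10), "light": math.log(0.10)}

audio_only_best = max(audio, key=audio.get)
best, fused = fuse_hypotheses(audio, visual)
print("audio only:", audio_only_best)
print("audio + lips:", best)
```

With these made-up scores the sound alone slightly prefers "night", while the combined score picks "might", because the camera saw the lips close. The weight decides how much you trust each channel; in a noisy room you would probably lower `audio_weight`.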
One last thing: I would suggest that beginners try Sphinx2 and perlbox-voice; together they make a good starting point.