Voice recognition using SAPI


  • Voice recognition using SAPI

    I'm creating a new thread so I won't clutter 0l33l's "Voice activated navigation for routis!" thread.

    I wrote my own voice recognition app using SAPI. It's generic and can be used with any application that accepts keyboard input. The features and the grammar XML can be found on my website: http://galant.circledplus.com/carpc/carpc_apps_vr.html I'll post a download link later tonight in case anyone is interested in trying it out.

    I have a few questions for NTurkey...

    Is it possible to create ad hoc attributes, like <P myAttribute="fooBar" VALSTR="taskmgr.exe">task manager</P>? I get an error if I add an attribute the compiler doesn't recognize, which is why I just used the inner PROPNAME in the example below.
    Code:
      <RULE NAME="runApp" ID="RID_RunApp" TOPLEVEL="ACTIVE">
        <P>open</P>
        <L PROPNAME="runAppvalue">
          <P PROPNAME="task manager" VALSTR="taskmgr.exe">task manager</P>
          <P PROPNAME="i Guidance" VALSTR="iGuidance">navigation</P>
          <P PROPNAME="Media Car" VALSTR="C:\Program Files\MediaCar\MEDIACAR.exe">media car</P>
        </L>
        <O>...</O>
      </RULE>
    Also, is there a way to get the text at a particular <P> node? Like I want to get "task manager" (between the P tags, not the PROPNAME) from the above code.

    TIA

    00 Galant
    armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
    sony rm-x2s, bu303, xmDirect
    xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
    custom shutdownController

  • #2
    Want to try
    What language?
    :edit: What programming language?

    Installation: 90% complete - fiberglassing
    EPIA M10000 - 512Mb - 20GB
    Lilliput 7" TS - Opus 150W PCB - DLink USB Radio - slim CD-ROM - SoundBlaster MP3+ - not so crappy 40x4 Amp - BU303 GPS (waiting for) - BT support



    • #3
      Originally posted by BeamRider
      Want to try
      What language?
      :edit: What programming language?
      VB.Net
      Co-Developer of A.I.M.E.E Automotive Intelligent Multimedia Entertainment Engine
      www.aimee.cc



      • #4
        Originally posted by djScript
        I have a few questions for NTurkey...

        Is it possible to create adhoc attributes? Like <P myAttribute="fooBar" VALSTR="taskmgr.exe">task manager</P> I'm getting an error if I create an attribute it doesn't recognize.
        As far as an arbitrary attribute for the P element in SAPI's XML, no, you can't create arbitrary XML attributes. The SAPI compiler will complain.

        In your example above, are you just trying to keep track of the application name, as well? If so, you could use something like this (you'll find two properties in the result, instead of just one ... one called AppToLaunch and one called AppName):

        Code:
           <RULE NAME="GlobalApps" TOPLEVEL="ACTIVE">
              <L>
                 <P PROPNAME="AppToLaunch" VALSTR="taskmgr.exe">
                    <P PROPNAME="AppName" VALSTR="Task Manager">
                       Task Manager
                    </P>
                 </P>
                 <P PROPNAME="AppToLaunch" VALSTR="C:\Program Files\MediaCar\MEDIACAR.exe">
                    <P PROPNAME="AppName" VALSTR="Media Car">
                       Media Car
                    </P>
                 </P>
              </L>            
           </RULE>
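
To see which two properties each phrase would carry with this nesting, here is a rough Python sketch. It is not SAPI and does not compile the grammar; it only walks the XML with the standard library. The function name `properties_per_phrase` is made up for the illustration, and the grammar text is the fragment above with whitespace tidied.

```python
import xml.etree.ElementTree as ET

# Grammar fragment adapted from the post above (same nesting, same attributes).
GRAMMAR = r"""
<RULE NAME="GlobalApps" TOPLEVEL="ACTIVE">
  <L>
    <P PROPNAME="AppToLaunch" VALSTR="taskmgr.exe">
      <P PROPNAME="AppName" VALSTR="Task Manager">Task Manager</P>
    </P>
    <P PROPNAME="AppToLaunch" VALSTR="C:\Program Files\MediaCar\MEDIACAR.exe">
      <P PROPNAME="AppName" VALSTR="Media Car">Media Car</P>
    </P>
  </L>
</RULE>
"""


def properties_per_phrase(grammar_xml):
    """Return one {PROPNAME: VALSTR} dict per alternative in the rule's <L> list."""
    rule = ET.fromstring(grammar_xml.strip())
    result = []
    for outer in rule.findall("./L/P"):
        props = {}
        # iter() visits the outer <P> itself and every <P> nested inside it.
        for p in outer.iter("P"):
            if "PROPNAME" in p.attrib:
                props[p.attrib["PROPNAME"]] = p.attrib.get("VALSTR")
        result.append(props)
    return result


phrases = properties_per_phrase(GRAMMAR)
for props in phrases:
    print(props)
```

Each alternative ends up with both an AppToLaunch and an AppName entry, which is the pair of properties described above.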
        Originally posted by djScript
        Also, is there a way to get the text at a particular <P> node? Like I want to get "task manager" (between the P tags, not the PROPNAME) from the above code.
        What language are you using? If you're using C++, you can get the ISpPhrase interface from the result, then you can call GetText.

        To determine what the appropriate start and count would be, once you have the SPPHRASEPROPERTY, you can look at ulFirstElement and ulCountOfElements. You might have to change your grammar around a little bit, but not much.

        Does that make any sense? I bet it doesn't if you're not using C++.
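
If it helps, the start/count idea can be modeled outside SAPI. This is only a Python sketch: `PhraseProperty` and `get_text` are hypothetical stand-ins, not the real SPPHRASEPROPERTY structure or the real GetText call, and the word list is an assumed recognition of "open task manager".

```python
from dataclasses import dataclass


@dataclass
class PhraseProperty:
    # Hypothetical stand-in for the fields named in the post.
    name: str
    value: str
    first_element: int      # plays the role of ulFirstElement
    count_of_elements: int  # plays the role of ulCountOfElements


def get_text(elements, start, count):
    """Join `count` recognized words starting at index `start`."""
    return " ".join(elements[start:start + count])


# Words assumed to be recognized for the utterance "open task manager".
elements = ["open", "task", "manager"]

# The property attached to the <P> that matched "task manager".
prop = PhraseProperty("task manager", "taskmgr.exe",
                      first_element=1, count_of_elements=2)

spoken = get_text(elements, prop.first_element, prop.count_of_elements)
print(spoken)
```

The point is just that the property tells you which slice of the recognized words belongs to it, so the spoken text can be recovered without storing it a second time in the grammar.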
        2004 BMW 330Ci

        Audio: Alpine PXA-H701, XTANT 1.1i, PPI 4800, MB Quart QSD 216, JL W6v2
        Computer: Shuttle XPC P4 3GHz HT, 1G, 160GB HD, WinTV
        Software: StreetDeck ... soon with wicked cool speech integration ...

        Install (A/C/S): 100/100/90 %



        • #5
          Originally posted by NTurkey
          In your example above, are you just trying to keep track of the application name, as well?
          Yes. The name is basically the same as the command, which is why I wanted to get the text between the <P> tags. GetText(FirstElement, NumberOfElements) seems to have done the trick, and I didn't have to change the grammar.

          Originally posted by NTurkey
          What language are you using?
          I'm using VB.net

          Thanks.

          00 Galant
          armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
          sony rm-x2s, bu303, xmDirect
          xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
          custom shutdownController



          • #6
            The app can be downloaded from my website:
            http://galant.circledplus.com/carpc/carpc_apps_vr.html

            Just post here if you need help in grammar configuration.

            00 Galant
            armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
            sony rm-x2s, bu303, xmDirect
            xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
            custom shutdownController



            • #7
              Download link on website doesn't work for me.

              Also, I'm moving this reply over to here from other thread:

              Originally posted by djScript
              I wrote a voice recognition app that can be fully customized through the xml file. I'm using this app to voicecommand mediacar by enabling winamp's global hotkeys. Even if mediacar is not the active window, I could still control the audio section.

              Check it out on my website --> http://galant.circledplus.com/carpc/carpc_apps.html

              Let me know if you're interested and I'll create a new thread. I don't want to hijack 0l33l's thread.
              Looks good. I saw that you're using MediaCar and thought I'd mention an idea I had for navigating between the audio, navigation, DVD, etc. portions, since it didn't look to me like keystrokes were supported. I just wrote tiny programs that each send a single mouse click to the button of the area I want to navigate to. I'm sure somebody could write a program that would take a skin file and parse the information into lines ready to be coded into a program. Once again, nice job!
              '03 Sierra Denali



              • #8
                Originally posted by rgardjr
                Download link on website doesn't work for me.
                Does it give you any error? 404? I tested it and it worked for me.

                Try this:
                VoiceRecognition

                Let me know if that still won't work and I'll just attach the file here.

                Originally posted by rgardjr
                Looks good. I saw that you're using MediaCar and thought I'd mention an idea I had for navigating between the audio, navigation, DVD, etc. portions, since it didn't look to me like keystrokes were supported. I just wrote tiny programs that each send a single mouse click to the button of the area I want to navigate to. I'm sure somebody could write a program that would take a skin file and parse the information into lines ready to be coded into a program. Once again, nice job!
                I updated the grammar and created a new rule to support sending mouse clicks. It first switches to the application window you want to control, then moves the mouse and sends a click. I have a few commands for MediaCar's main menu buttons.

                00 Galant
                armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
                sony rm-x2s, bu303, xmDirect
                xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
                custom shutdownController



                • #9
                  Originally posted by djScript
                  Does it give you any error? 404? I tested it and it worked for me.

                  Try this:
                  VoiceRecognition

                  Let me know if that still won't work and I'll just attach the file here.
                  The above link worked great.

                  Originally posted by djScript
                  I updated the grammar and created a new rule to support sending mouse clicks. It first switches to the application window you want to control, then moves the mouse and sends a click. I have a few commands for MediaCar's main menu buttons.
                  Wow! This is great news. Can't wait to get this in my car. What microphone are you using in your car, and how good a job does it do?
                  '03 Sierra Denali



                  • #10
                    Originally posted by rgardjr
                    Wow! This is great news. Can't wait to get this in my car. What microphone are you using in your car, and how good a job does it do?
                    I'm using a regular cheap-o mic.

                    I have to speak a bit louder if my windows are down or music is playing, but I can deal with that. I have a Sony remote on the steering wheel column, so I hit the pause button first before I activate voiceRecognition.

                    00 Galant
                    armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
                    sony rm-x2s, bu303, xmDirect
                    xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
                    custom shutdownController



                    • #11
                      suggestion:

                      DJscript, I read your article and it's a great thing you've done with SAPI. I'm curious whether you're using custom-designed software (made by you) to play the music. If so, make it turn the volume down when you're giving the system a command.
                      Progress [I will seriously never be done!]
                      Via EPIA MII
                      512MB RAM
                      OEM GPS (embedded)
                      nLite WinXP pro on
                      1GB Extreme III CF card
                      Carnetix 1260 startup/ DC-DC regulator
                      Software: Still, re-Writing my existing front end in .Net



                      • #12
                        Another suggestion: I find that getting the alternates gives better results. Instead of calling ISpRecoResult::GetText() directly, call ISpRecoResult::GetAlternates() and then GetText() on those. The first alternate is the best match, and it does extra work like converting numbers to digits.



                        • #13
                          Originally posted by IntellaWorks
                          DJscript, I read your article and it's a great thing you've done with SAPI. I'm curious whether you're using custom-designed software (made by you) to play the music. If so, make it turn the volume down when you're giving the system a command.
                          I'm using MediaCar. I've thought about turning the volume down, but I kinda lean towards the push-to-talk type setup. On my friend's Accord, you have to push a button on the steering wheel to activate voice commands.

                          00 Galant
                          armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
                          sony rm-x2s, bu303, xmDirect
                          xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
                          custom shutdownController



                          • #14
                            Originally posted by Curiosity
                            Another suggestion: I find that getting the alternates gives better results. Instead of calling ISpRecoResult::GetText() directly, call ISpRecoResult::GetAlternates() and then GetText() on those. The first alternate is the best match, and it does extra work like converting numbers to digits.
                            I'm not really using GetText() to process the commands. In the Recognition event, I simply act on the VALSTR depending on the rule name. Something like:
                            Code:
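                            ' RuleName and ValStr are taken from the recognition result.
                            Select Case RuleName
                               Case "talk"
                                  ' Speak the VALSTR text back to the user.
                                  RC.Voice.Speak(ValStr)
                               Case "sendKeys"
                                  ' Send the VALSTR as keystrokes to the active application.
                                  SendKeys.Send(ValStr)
                            End Select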
                            I wish ISpRecoResult returned something that indicates the accuracy of the recognized command, so I could process a command only if it's at least, say, 80% accurate. I tried using CONFIDENCE, but it's not very granular, and in my testing most incorrectly recognized commands came back with Normal confidence, so it's useless for this.

                            NTurkey, is there any way of doing this?
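
To make the idea concrete, here is roughly the behavior being asked for, written as a Python sketch rather than VB.NET. Everything in it is an assumption for illustration: the numeric `confidence` argument is hypothetical (this thread has not established that SAPI hands back such a score), and `make_dispatcher` and the two handlers are made-up names that just record what they would do.

```python
def make_dispatcher(handlers, min_confidence=0.8):
    """Build a dispatch function that routes by rule name, gated on a score."""

    def dispatch(rule_name, val_str, confidence):
        # Hypothetical gate: drop anything recognized below the threshold.
        if confidence < min_confidence:
            return False
        handler = handlers.get(rule_name)
        if handler is None:
            # Unknown rule name: nothing to do.
            return False
        handler(val_str)
        return True

    return dispatch


# Stand-in handlers that only log the action instead of speaking or typing.
log = []
dispatch = make_dispatcher({
    "talk": lambda text: log.append(("speak", text)),
    "sendKeys": lambda keys: log.append(("keys", keys)),
})

dispatch("talk", "hello", 0.93)      # processed
dispatch("sendKeys", "{F5}", 0.41)   # ignored: below the 0.8 threshold
print(log)
```

The table of handlers plays the same role as the Select Case above; the only addition is the threshold check in front of it.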

                            00 Galant
                            armadaE500 P3-660 320M 20G, lilliput, audigy2NX, slim/slotLoad dvd/cdrw, cardReader
                            sony rm-x2s, bu303, xmDirect
                            xpPro sp2, frodoPlayer 1.09, iGuidance 2.0, custom voiceRecognition
                            custom shutdownController



                            • #15
                              Voicerecognition.exe - Application Error

                              The app failed to init properly (0xc0000135)

                              any ideas?
