    Release of Speak Easy Voice Recognition version 0.9.1

    I am releasing a new version of my voice recognition software. Thanks to the brave souls who tested the alpha release a few weeks ago; it really made me fix all those bugs! It now supports RR, FP, MM, FD and any app you wish!

    This is a new release which I have tested as much as I can, but it's still a preview release, so don't expect it to be 100% perfect just yet.

    This release has a lot of changes:
    1 Comprehensive error handling and error logging.
    2 New Vocabulary window to make new phrase commands and change existing ones.
    3 A settings window to customise how SpeakEasy works.
    4 A plugin section in the settings window to allow you to add any application as a plugin so you can control lots of apps.
    5 Complete support for MM, RR, FP, FD with a new listing of possible commands for each app.

    Here's the download link. It's a zip, with no installer just yet, and I have not optimised the compiler yet.

    The next post will give you the readme/help to get started.

  • #2
    ReadMe/Help (sorry, for some reason the images are appearing as links; something is up with the board editor?)

    This is the start screen: the small white box at the top left-hand side. It has 3 buttons. Press the SpeakEasy text and the main window opens and voice recognition starts. Press the info logo and it opens the settings screen. Press the X and it closes the application.

    The main SpeakEasy screen.

    The Buttons on the top:
    Phrases - Opens the vocabulary editor window, see below.
    Settings - Opens the configuration/settings screen, see below.
    Reset - Resets the recogniser engine and clears all group locks or disabled groups.
    Pause - Stops the recogniser engine so that you can keep the window open without executing commands.
    Mute - Mutes the audio volume on the PC.
    Close - Closes the current SpeakEasy window.

    The panels in the screen help you when you talk to the PC. The text in red on the top left is what it has heard so far; as you talk it will add more to this phrase until a command is recognised or it times out.

    The panel on the left in the middle is the command that it recognises. Each time SpeakEasy recognises a phrase it gives it an accuracy score, something like 75%. You can use the sliding bar below this text to set your default acceptance threshold. Phrases that score above this threshold are executed and you'll see them in green text; scores below this level are shown in red text and are not executed. If you play with the threshold settings you'll find what works best when driving. You can change it for words that are harder to recognise and then set it back. Also, if a phrase falls below the threshold and is not executed (in red text), you can just click the text in red and the command will be executed anyway. The idea of the threshold is to stop false commands being executed because of background noise like engine revs or MP3 playback.
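
    The threshold rule can be sketched in a few lines of Python. This is only an illustration, not SpeakEasy's actual code: the names Recognition, decide and display_colour are made up, and treating a score exactly equal to the threshold as accepted is an assumption (the text above only covers scores above or below it).

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    phrase: str        # what the recogniser thinks it heard
    confidence: float  # accuracy score as a percentage, 0-100


def decide(result: Recognition, threshold: float) -> bool:
    """Return True if the phrase should be executed.

    Assumption: a score equal to the threshold is accepted, so that a
    threshold of 0% lets every heard command through.
    """
    return result.confidence >= threshold


def display_colour(result: Recognition, threshold: float) -> str:
    """Green for executed phrases, red for rejected ones."""
    return "green" if decide(result, threshold) else "red"


heard = Recognition("playlist folder down", 75.0)
print(display_colour(heard, 10.0))  # accepted at a 10% threshold
print(display_colour(Recognition("juke box", 4.0), 10.0))  # rejected
```

    Clicking a red phrase would simply bypass decide() and run the command anyway.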

    The panel on the bottom left is a help panel telling you the current status of commands, or whether you are talking too loudly or too slowly.

    The panel on the right is a list of commands that can be said for the phrase you have spoken so far. It is kind of like an auto-complete to tell you what's available in the phrase vocabulary. It is also a quick way to learn/remember the phrases.

    Finally, if you set the Activation Timeout to anything above 0, a countdown clock is shown at the bottom of the screen, timing down to zero; once it hits zero the window closes automatically. This is useful if you are driving and want to speak a few commands and then deactivate the SpeakEasy recogniser. If there is a timer on the main window, you can click the counter at any time to pause it so that the timed automatic close is halted; click it again to resume the countdown.
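
    As a rough model of the countdown behaviour, here is a small Python sketch. It is an assumption about how such a timer could work, not the real implementation; the class name ActivationTimer and its methods are hypothetical.

```python
class ActivationTimer:
    """Countdown that closes the main window when it reaches zero."""

    def __init__(self, timeout_seconds: int):
        self.remaining = timeout_seconds
        # A timeout of 0 means "never close automatically".
        self.enabled = timeout_seconds > 0
        self.paused = False

    def toggle_pause(self) -> None:
        """Called when the user clicks the counter."""
        self.paused = not self.paused

    def tick(self) -> bool:
        """Advance one second; return True when the window should close."""
        if not self.enabled or self.paused:
            return False
        self.remaining = max(0, self.remaining - 1)
        return self.remaining == 0


timer = ActivationTimer(3)
timer.tick()          # 2 seconds left
timer.toggle_pause()  # user clicks the counter: countdown halted
timer.tick()          # still 2 seconds left
timer.toggle_pause()  # clicked again: countdown resumes
```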

    The phrase editor allows you to simply add/update/delete phrases from your *.ini vocab files. You can have as many *.ini files as you like as long as they are in one folder. Where that folder is can be set in the Settings screen. When you open the settings screen the default.ini file is shown. I've created a few commands in it to illustrate what is possible. A voice command is made up of a group and a phrase. You can create a new group by typing text into the drop-down list in the group box on the right-hand panel, completing a new phrase and pressing the Add Phrase button (or you can select an existing group from the list).

    The list on the top left lists all the groups/phrases in the current file. The "global" group is different to all the others: when you create commands in the global group you only speak the phrase, not the group word. For example, a "juke box" phrase in the global group is executed when the user says just "juke box". If the command is something like "folder down" in the playlist group, then the user has to say the whole command "playlist folder down" for it to be executed.
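
    The difference between the global group and every other group can be shown with a tiny Python sketch. The function name spoken_form is made up for illustration and this is not how SpeakEasy itself is written.

```python
GLOBAL_GROUP = "global"


def spoken_form(group: str, phrase: str) -> str:
    """Return the words the user has to say for a group/phrase pair."""
    if group.strip().lower() == GLOBAL_GROUP:
        # Global commands are spoken without the group word.
        return phrase.strip()
    # Every other group is spoken as "<group> <phrase>".
    return f"{group.strip()} {phrase.strip()}"


print(spoken_form("global", "juke box"))       # juke box
print(spoken_form("playlist", "folder down"))  # playlist folder down
```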

    A Voice command is made up of 4 things:
    Group - the first word to say (excludes global grouping)
    Phrase - the remainder of the words to say
    Help Text - text that will appear in the status panel in the main window if you click on the Help list panel when speaking words
    Macro - A list of commands to be executed by SpeakEasy

    A Macro is one or more commands that look like RR:STOP or MM:ADDRESS. You add a command to a macro by selecting the application you want to talk to (default apps include RR, MM, FP, FD and SE; you can add more apps by adding their details to the settings window, more on that later). After selecting something like RR from the list, a help list appears on the bottom left side of the window with a list of possible commands (and a helpful description in square brackets). When you select a command it is added to the COMMAND text box; you then need to press the "Add Command" button to add it to the current Macro. You can use the "Remove Command" button to delete any command from the Macro list.
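
    To illustrate the PREFIX:COMMAND shape of a macro entry, here is a short Python sketch that splits entries into an application prefix and a command. It is only an illustration under stated assumptions: parse_command, parse_macro and KNOWN_APPS are made-up names, and rejecting unknown prefixes is my assumption rather than documented behaviour.

```python
from typing import List, Tuple

# Default application prefixes mentioned above.
KNOWN_APPS = {"RR", "MM", "FP", "FD", "SE"}


def parse_command(command: str) -> Tuple[str, str]:
    """Split 'RR:STOP' into ('RR', 'STOP')."""
    prefix, sep, action = command.partition(":")
    prefix = prefix.strip().upper()
    action = action.strip()
    if not sep or not prefix or not action:
        raise ValueError(f"expected PREFIX:COMMAND, got {command!r}")
    if prefix not in KNOWN_APPS:
        raise ValueError(f"unknown application prefix {prefix!r}")
    return prefix, action


def parse_macro(commands: List[str]) -> List[Tuple[str, str]]:
    """Parse every command in a macro, keeping their order."""
    return [parse_command(c) for c in commands]


print(parse_macro(["RR:STOP", "MM:ADDRESS"]))
```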

    Once you have entered the items below then press the "Add Phrase" button to add it to the current vocabulary file.
    1. Group
    2. Phrase
    3. Help text (optional)
    4. Macro list

    You can remove a single phrase from the vocab file by pressing the "Delete Phrase" button. You can also delete a whole grouping of commands by pressing the "Delete Group" button.

    There are also buttons to open other vocab files or create completely new ones.

    The settings window allows you to customise SpeakEasy the way you want it.

    Activation Phrase - When SpeakEasy starts (with the 3 buttons at the top left of the monitor), one option is to activate the main window by saying a few words, kind of like the "BORIS" phrase in NaviVoice. If you leave this text blank then there is no voice activation and you'll need to press the SpeakEasy text button.

    Vocabulary Folder - Where the *.ini vocab files are located. By default this is the vocab folder under the install folder.

    There are 3 sliders for setting defaults for:
    Activation Timeout - The number of seconds the main SpeakEasy window stays open listening for commands. If you set this to zero then there is no timeout and you'll need to press Close to close the main window.

    Accuracy Threshold - The default threshold % for the main window. I normally set this to about 5 or 10%, which stops false commands caused by background noise. You can set this threshold from 0% to 50%. 0% means that any command that is heard will be executed. The higher the threshold value, the more precisely you need to speak.

    Transparency - This allows you to make the main SpeakEasy window as transparent as you like. While testing I'd keep this near 100% (fully visible). Once you know your vocabulary and don't need the main window buttons so much, lower the setting to fade the window into the colors behind.

    Finally, this screen has a listing of the applications it can talk to. Each application has 5 items:

    1. name - A prefix like RR or MM; you can decide the prefixes you want. These are then used in each command in the Macros, like MM:ADDRESS

    2. method - The way to talk to the plugin application, settings include
    COM - sendMessage
    TCP - Internet packet requests
    OPEN - To start an application or open a file.
    SELF - Is just used by SpeakEasy for internal commands, like reset or disable groupings.
    KEY - To simulate keyboard keystrokes like CTRL-ALT-DEL or typing OSK letters
    MM - To use MapMonkey specific interface

    3. windowName - The name of the application you want to talk to, like "Map Monkey [GPS]"

    4. className - The underlying code name. Use a spying program to get the className; it is not 100% necessary but it is good to have. PM me if you want to know the windowName and className of any applications you want to use.

    5. port - Only used with the TCP method, where you need to specify the port number of the application, like 4413 for the MapMonkey TCP interface if you use MM that way.
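
    The five items of an application entry could be modelled roughly like this in Python. This is a sketch for illustration only: AppPlugin and its problems() check are hypothetical, and requiring a port only for the TCP method is my reading of item 5, not confirmed behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional

# The methods listed under item 2.
METHODS = {"COM", "TCP", "OPEN", "SELF", "KEY", "MM"}


@dataclass
class AppPlugin:
    name: str                   # prefix used in macros, e.g. "MM"
    method: str                 # one of METHODS
    window_name: str = ""       # e.g. "Map Monkey [GPS]"
    class_name: str = ""        # optional, found with a window spy tool
    port: Optional[int] = None  # only needed for the TCP method

    def problems(self) -> List[str]:
        """Return a list of things wrong with this entry (empty if fine)."""
        issues = []
        if not self.name:
            issues.append("name is required")
        if self.method not in METHODS:
            issues.append(f"unknown method: {self.method}")
        if self.method == "TCP" and self.port is None:
            issues.append("TCP method needs a port")
        return issues


mm = AppPlugin(name="MM", method="TCP",
               window_name="Map Monkey [GPS]", port=4413)
print(mm.problems())  # no problems for a complete TCP entry
```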


    • #3

      Microsoft .Net runtime 1.1 or higher
      Microsoft SAPI 5.1

      You'll need to train the SAPI engine with your voice/accent; the training is in the control panel under the Speech icon and takes about 5 minutes.


      • #4
        Link doesn't work for me, it just hangs.
        • #5
          Downloaded the new version. Tonight I will be adding the rest of the FD commands to it, and also changing it and removing commands I don't need. I've just spent the last 5 mins reading bollocks to train the speech thingy from the control panel. Wife walked in and started laughing. Anyway, when I start mine up I get the form background as black, whereas in your screenshots it's white. Just wondered, that's all :-) Will have a good play and let you know what I think. As an idea, would it be possible to set the place where the white box appears? Rather than have it top right, could it be possible to have it anywhere on the screen? Could be handy to make it look like part of the skin underneath. No worries if it's too much. Keep it up, and I will have a full report once I have had a good shout at it :-)

          • #6
            Good idea on the location stuff. I'll add a location setting for the initial top left window and the main window, and I'll try to add it over the weekend. For the moment you can blend the top windows into the background colour by setting the transparency lower. I set it to 50% in my version; click the settings button and move the 3rd slider.

            The next minor release will have mouse clicking support for plugin applications that do not have a standard COM/TCP/Key API.

            I will investigate the black background; I don't know what's happening there. Maybe it is because the ZIP file is a DEBUG build rather than a RELEASE build in VS.Net.


            • #7
              Download link does not work for me either.
              • #8
                My Server is down! Looks like a major problem with the HD, can't get anything. I'll post a message when it is live again.


                • #9
                  Hope you don't mind... I have uploaded the version I downloaded to my server. Let me know when you're back up or if you want me to remove it.



                  • #10
                    Been playing with this for the last few hours. Got it so it now inputs an address for me and deletes history etc. Looking good. Is it possible to request a few things?

                    Unsure if this is already done, but could you add a single command that enables it to listen, and then after 30 secs or so without any commands it would go deaf again, just listening out for this one command? Something like "computer start listening". This way it would save having to tap the screen. Also, would it be possible for it not to show the settings box at all? Maybe it could repeat what you said to confirm it heard, or even just bleep: high for understood, low for not understood. That's about it for now. Once I have tested the settings file I've done, I will post it for all to use.

                    great work m8 :-)

                    Oh, while I'm posting: does anyone have a direct link to SAPI 5.1, or is it the SDK that I need? Installed on here no probs, just need to stick it on my carpc yet.

                    • #11
                      good ideas and thanks for the link! i will install now.

                      Here is a sapi link

                      • #12
                        Cheers CDR for hosting it. My server is back up again.

                        The good news is that what you want is possible by using the settings screen.

                        There is a command to activate without needing to press a button: go to the settings screen and put the words you want in the "Activation Phrase" box, so put your phrase "computer start listening" there (without the quotes).

                        The timeout of 30 seconds can be done by setting the "Activation Timeout" slider in the settings screen to 30.

                        Hiding the main screen can be done by setting the transparency to 0%, this will not hide the top left window because you may need to access the settings screen to unhide it later. The one thing that is not perfect is that if the main window is hidden there is no status in the top left window to say it is listening or disabled or whatever. I like the idea of different beeps, I'll add that to the settings. Perhaps I'll also make the SpeakEasy text a different color as well:

                        Green if it is listening and recognising phrases
                        Yellow if it is listening but getting partial phrases
                        Red if it is not listening.

                        What do you think of this?

                        Do you want to hide the top left window as well? If so, I'll code this as well. My only question is how to unhide everything if all of the windows are hidden; perhaps voice commands like "SpeakEasy Show", "SpeakEasy Hide", "SpeakEasy Settings" and "SpeakEasy Exit".