Cheap voice recognition integration

Posted: Thu Mar 07, 2013 8:57 pm
by airox
Hi guys,

I have done a lot of work on my own software in the past few months, but unfortunately had no time to talk about it on this forum. One thing I do want to bring to your attention is the new Web Speech API, which is available in Google Chrome version 25.

See this for the example:
http://updates.html5rocks.com/2013/01/V ... Speech-API

What did I do with it? The touchscreen on my wall runs Google Chrome in fullscreen mode, showing the home and all its controls: the central panel to control audio, video, lights, etc. It has two-way communication with my home automation service over websockets. I integrated voice recognition a year ago, but that was built so that it listened for voice commands all of the time. The touchscreen, however, is in the living room, where the television caused many false positives. So I decided voice recognition was not the way to go. Well... not in that manner. Here is one example scenario which I've built and am currently "user-testing", and which could be nice for people over here:

- Five minutes after I turn off the alarm when I get home, the system asks through a text-to-speech service whether I want to a) turn on the television, b) turn on music, or c) hear the news about what happened in our town while I was away.
- The system sends a message over a websocket to the touchscreen (which is switched on via wake-on-LAN the moment I turn off the alarm) to request speech input.
- The system checks whether any media devices are on and, if so, mutes them for five seconds.
- The touchscreen uses JavaScript to turn on Web Speech API input for five seconds.
- I can then say "television", "music" or "news".
- The touchscreen sends the commands it received to the server.
- The system checks which command was sent and acts accordingly.
- I feel happy ;-)
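
The touchscreen side of the steps above could look roughly like the sketch below. It is a minimal illustration under assumptions, not the code running in the system described here: `CHOICES`, `pickCommand`, `listenForCommand` and the JSON message format are made-up names, and `socket` is assumed to be an already open WebSocket to the home automation service. Only `webkitSpeechRecognition` and its `continuous`, `onresult`, `start()` and `stop()` members come from the Web Speech API in Chrome.

```javascript
// Hypothetical list of accepted commands (the three from the scenario above).
var CHOICES = ["television", "music", "news"];

// Return the first known choice that occurs in the transcript, or null.
function pickCommand(transcript, choices) {
  var heard = transcript.toLowerCase();
  for (var i = 0; i < choices.length; i++) {
    if (heard.indexOf(choices[i].toLowerCase()) !== -1) {
      return choices[i];
    }
  }
  return null;
}

// Browser-only: listen for at most `windowMs` milliseconds and send the first
// recognized command to the server. The message shape is an assumption.
function listenForCommand(socket, windowMs) {
  var recognition = new webkitSpeechRecognition();
  recognition.continuous = true;
  recognition.onresult = function (event) {
    for (var i = event.resultIndex; i < event.results.length; i++) {
      var command = pickCommand(event.results[i][0].transcript, CHOICES);
      if (command !== null) {
        socket.send(JSON.stringify({ type: "command", value: command }));
        recognition.stop();
        return;
      }
    }
  };
  recognition.start();
  // Close the listening window after the given time, e.g. 5000 ms.
  setTimeout(function () { recognition.stop(); }, windowMs);
}
```

In the browser this would be called as `listenForCommand(socket, 5000)` when the server's request for speech input arrives. Matching on a substring rather than the exact transcript is a design choice that tolerates extra words around the command.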

I also created questions for the early morning on working days, asking whether the system should turn on the RTL4 news. Also, when a movie download finishes, it asks whether the room should be switched to movie mode and the movie started. When the movie is finished, it asks whether the room should be turned back to normal.

I hope this inspires you to use this API in your systems :-)

Greetings!

Gijs

Re: Cheap voice recognition integration

Posted: Thu Mar 07, 2013 11:54 pm
by Herbus
Nice! Would be great to see it work in a little video :-)

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 12:43 am
by Digit
Thanks for the info, this sounds like something I'll have a look at in the very near future! 8)

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 1:57 am
by airox
I will try to create a video tomorrow!

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 9:45 am
by Alexander
Very nice, but I'm curious whether, with the first bullet, it won't get frustrating to wait for a fixed period of time.

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 12:14 pm
by AshaiRey
Thanks for sharing. I have been trying/thinking/tinkering with speech recognition for a few years already, with varying results. Until now none of them was approved by my wife. So my hopes are on this one, to see how it holds up.

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 12:24 pm
by Alexander
Difficult wife then, order a replacement? ;-)

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 1:18 pm
by rbmace2403
Great to see you already implemented it. I had this on my list as well, for voice control with HTML5 in Chrome. Would love to see the video :-)

Re: Cheap voice recognition integration

Posted: Fri Mar 08, 2013 1:24 pm
by airox


Here is the video! Some of it is out of focus; I couldn't get it right with the camera. But it shows the concept.
Something I defined internally is the concept of conversations, which assist me over here. A conversation has a trigger on which it starts. Then there are certain choices you can make, which the system presents to you.

I have the following conversation "responders", i.e. ways of communicating with the user:
- Speech input and speaker output (asking by text-to-speech)
- UI input (buttons) and UI output (did you see the list of choices on the screen?)
- Chat
- Lists (Facebook stream, Twitter stream, IRC, etc.)

The code for the event I showed you was the following (to give programmers an idea of what was done).
Based on this definition the system knows how to interact with the user and all devices in the living room.

Code:

// Rule name (Dutch): "Turn things on after the morning, on weekdays"
$r[] = Rule2_Conversation::create('Zaken aanzetten na de ochtend in de week')
	->on(new Rule2_Comparator_ValueBased('variable.cominghome', ReadingModel_Boolean::NO)) // getting home!
	->when(new Rule2_Comparator_EqualTo(new Rule2_Operand_TimePart(new Rule2_Operand_Time(), Rule2_Operand_TimePart::DAYOFWEEK), array(1,2,3,4,5 /* monday - friday */)))
	->when(new Rule2_Comparator_TimeBetween(901, 2359)) /* after the morning (six to nine), i.e. between 09:01 and 23:59 */
	->addConversationResponder(new Rule2_ConversationResponder_Speech())
	->addConversationResponder(new Rule2_ConversationResponder_UiReply())
	// Choice "television"; spoken reply: "I'll turn on the television for you."
	->addChoice(Rule2_Conversation::create('televisie')
			->then(new Rule2_Action_SayOnSpeaker("alsa.behindtvspeaker", "Ik zet de televisie aan voor je."))
			->then(new Rule2_Action_InfraredSequence("turn.on"))
	)
	// Choice "music"; spoken reply: "I'll turn on music for you."
	->addChoice(Rule2_Conversation::create('muziek')
			->then(new Rule2_Action_SayOnSpeaker("alsa.behindtvspeaker", "Ik zet muziek aan voor je."))
			->then(new Rule2_Action_PlayContentOnSpeaker("xbmcspeaker.livingroom", "shoutcast.sctrance", "Trance/351954", SpeakerProvider::CAP_PLAY_NOW))
	)
	// Choice "news"
	->addChoice(Rule2_Conversation::create('nieuws')
			->then(new Rule2_Action_RunRule(GET_LATEST_News))
	)
	// Opening question: "Would you like news, television or music?"
	->then(new Rule2_Action_SayOnSpeaker("alsa.behindtvspeaker", "Wilt u nieuws, televisie of muziek?"))
	->then(new Rule2_Action_Log());
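
For readers who don't use PHP, the core idea of a conversation (a question plus a set of choices, each with its own actions) can be sketched in a few lines of JavaScript. This is only an illustration of the concept under assumptions: `createConversation`, `addChoice` and `reply` are hypothetical names, and triggers, conditions and responders are left out entirely.

```javascript
// A conversation: a question to ask plus the choices the user may answer with.
// All names here are illustrative; this is not the Rule2 API shown above.
function createConversation(question) {
  var choices = {};
  return {
    question: question,
    // Register a keyword and the list of actions (functions) it should run.
    addChoice: function (keyword, actions) {
      choices[keyword] = actions;
      return this; // allow chaining, similar to the rule definition above
    },
    // Run the actions for the given reply; returns false if nothing matched.
    reply: function (keyword) {
      if (!Object.prototype.hasOwnProperty.call(choices, keyword)) {
        return false;
      }
      choices[keyword].forEach(function (action) { action(); });
      return true;
    }
  };
}

// Usage: the actions only record what they would do.
var log = [];
var conversation = createConversation("News, television or music?")
  .addChoice("television", [function () { log.push("tv on"); }])
  .addChoice("music", [function () { log.push("music on"); }]);
conversation.reply("music");
```

A reply that matches no choice returns `false`, so a caller could ask the question again or fall back to the buttons on the screen.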

Re: Cheap voice recognition integration

Posted: Sun Mar 10, 2013 11:18 am
by rbmace2403
Looks very cool, the options are endless with this system. But how did you get around the restriction that Google won't allow scripts to use the translate system? And the maximum of 100 characters for text-to-speech when she is reading the news for you?

Have you used jQuery Mobile as a framework for your touchpad?

Re: Cheap voice recognition integration

Posted: Sun Mar 10, 2013 11:46 am
by AshaiRey
I also had a look into this, and one very big plus is that it will (I presume) also accept commands in Dutch.
It needs Chrome v25 or higher to work. I tried this on an Android tablet but haven't got it to work yet. Has anyone tried this already and succeeded?

Re: Cheap voice recognition integration

Posted: Sun Mar 10, 2013 1:12 pm
by airox
@rbmace2403:
I cut the text into multiple parts and send them separately to the text-to-speech service. I send along a different User-Agent header when fetching the speech MP3. I also cache the MP3s on my local disk. A lot of my text is fixed (security is turned off, on, putting the music on, have set the room in movie mode, etc.), so once cached it doesn't have to go through the speech service again. I haven't run into any limits so far...

@AshaiRey:
Yeah, the main benefit is that you can set both the text-to-speech and the speech recognition to Dutch. I have noticed that both are of very good quality.

Code:

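// Chrome 25 exposes the Web Speech API under the webkit prefix.
var recognition = new webkitSpeechRecognition();
recognition.continuous = true; // keep listening after the first result
recognition.lang = "nl-NL";    // Dutch; language tags use a hyphen (BCP 47)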

Code:

wget -U "Lynx 1.2.3.4" "http://translate.google.com/translate_tts?ie=UTF-8&q=testme&tl=nl" -q -O- | mpg123 -q -
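
Cutting long text into multiple parts, as mentioned above, could be done along these lines. This is a minimal sketch, not the code used in the system: `splitForTts` is a made-up name, and the 100-character limit is the figure quoted earlier in this thread, passed in as a parameter rather than assumed to be exact.

```javascript
// Split text into parts of at most maxLen characters, breaking on whitespace.
// A single word longer than maxLen is hard-cut so that no part exceeds the limit.
function splitForTts(text, maxLen) {
  var words = text.trim().split(/\s+/);
  var parts = [];
  var current = "";
  words.forEach(function (word) {
    if (word === "") {
      return; // empty input produces one empty "word"; skip it
    }
    while (word.length > maxLen) {
      if (current !== "") {
        parts.push(current);
        current = "";
      }
      parts.push(word.slice(0, maxLen));
      word = word.slice(maxLen);
    }
    if (current === "") {
      current = word;
    } else if (current.length + 1 + word.length <= maxLen) {
      current += " " + word;
    } else {
      parts.push(current);
      current = word;
    }
  });
  if (current !== "") {
    parts.push(current);
  }
  return parts;
}
```

Each returned part could then be fetched separately (for example with the wget call above) and the resulting MP3s played in order. Breaking on whitespace keeps words intact, which should sound more natural than cutting at a fixed offset.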

Re: Cheap voice recognition integration

Posted: Sun Mar 10, 2013 1:15 pm
by airox
Oh, and I haven't used jQuery Mobile for the interface. Because all the "widgets" on the screen are custom made, I decided to create my own.
The main web interface is built using Bootstrap from Twitter. This could also be nice for a touchpad interface.

Re: Cheap voice recognition integration

Posted: Tue Mar 12, 2013 12:15 am
by rbmace2403
OK, thanks for the info. I will try to make it work in the coming months; I'm currently building my new frontend in jQuery Mobile. I looked at Bootstrap too, but found jQuery Mobile easier for mobile devices because of the extra modules built in.

Rb