Virtual Gardener

The ‘Virtual Gardener’ (depicted below) is a very simple web-based 2D augmented reality game engine.  Using color tracking in Flash, players can tend a virtual garden and grow a small patch of flowers by waving around a physical object in front of their webcam.

waterin' the virtual plants

The game is currently a very rough prototype and is missing all sorts of niceties (e.g. sound effects and scoring mechanisms).  Hopefully I can refine this framework so that it can serve as the basis for a host of color-driven 2D AR games.

If you’d like to try it yourself, feel free to click on the ‘Start’ button below.  You will need a webcam and a solid-colored object about 2″ by 2″ in order to play.  When prompted, simply hold up this object so that it fits within the small rectangle in the middle of the screen.  Once the object appears within this region, click on the button that appears to cause the game to “memorize” the color value of your object.  From here you will be able to wave your object around in front of your webcam as though it were a mouse cursor.
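
For readers curious about how this kind of color tracking can work, here is a minimal sketch in Python rather than Flash.  It is not the game's actual code.  It assumes a webcam frame is available as rows of (red, green, blue) pixels, that “memorizing” means averaging the pixels inside the calibration rectangle, and that the cursor position is the centroid of every pixel close to that color.  The names `average_color`, `track_color` and the `tolerance` value are all hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Frame = List[List[Pixel]]      # rows of pixels, indexed as frame[y][x]


def average_color(frame: Frame, left: int, top: int, width: int, height: int) -> Pixel:
    """Average the pixels inside the calibration rectangle (the "memorize" step)."""
    totals = [0, 0, 0]
    for y in range(top, top + height):
        for x in range(left, left + width):
            for channel in range(3):
                totals[channel] += frame[y][x][channel]
    count = width * height
    return (totals[0] // count, totals[1] // count, totals[2] // count)


def track_color(frame: Frame, target: Pixel, tolerance: int = 40) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of pixels close to the target color, or None."""
    sum_x = sum_y = matches = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            # Assumed matching rule: every channel must be within the tolerance.
            if all(abs(pixel[c] - target[c]) <= tolerance for c in range(3)):
                sum_x += x
                sum_y += y
                matches += 1
    if matches == 0:
        return None  # the object is not visible in this frame
    return (sum_x / matches, sum_y / matches)
```

Calling `track_color` once per frame gives a position that can stand in for the mouse cursor.  A larger `tolerance` copes better with changing light but is more easily fooled by similar colors in the background.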

Teaching in Cairo

Wow, it seems like a while since I’ve posted here – over the last few weeks I’ve been busy teaching a course on educational technology here in Cairo, Egypt for The College of New Jersey’s overseas program.  I’m working with a great group of K-12 school teachers and administrators – I wish I could stay longer!

Hanging out by the pyramids of Giza on my day off!

Whisper Deck Voice Control

I’ve gotten a few e-mails over the past few days regarding the voice control aspect of the Whisper Deck and how it works.  Here’s a brief overview of how I was able to incorporate speech as an input mechanism for augmented reality models in Flash.

The voice control system that I created is based on a client-side software package called “MacSpeech Dictate.” Voice recognition works as follows:

  1. Launch the MacSpeech Dictate recognition engine
  2. Place cursor focus in a text box at the bottom of the Whisper Deck interface.  This text box is not visible in the YouTube demo of the project.
  3. Speak into the microphone.  Recognized words are transcribed by MacSpeech Dictate and placed into the text box in the Flash movie.
  4. Flash listens for an Event.CHANGE event to fire on the text box.  When it does, it starts parsing the text that was transcribed by MacSpeech Dictate.  Here are the general steps my parsing routine goes through:
    1. Convert the entire spoken string to lowercase (“this.mytextfield.text = this.mytextfield.text.toLowerCase();”)
    2. Parse out any leading spaces
    3. Split the transcribed sentence into individual array elements based on the placement of spaces (e.g. “hello world how are you” would parse out to a new array with the following elements):
      1. myarray[0] = “hello”;
      2. myarray[1] = “world”;
      3. myarray[2] = “how”;
      4. myarray[3] = “are”;
      5. myarray[4] = “you”;
    4. Check whether the last element of the array is the word “over”, which acts as a trigger telling Flash to process the command in its entirety.
    5. If the “over” trigger is present, look at the first element of the array.  Currently the Whisper Deck can recognize two commands (“search” and “compare”); if either of these commands is present, pass the command to the appropriate AR rendering class.
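
The parsing steps above can be sketched in a few lines.  This is an illustrative Python version, not the ActionScript used in the Whisper Deck.  The function name `parse_command` and the decision to return the command together with its remaining words are assumptions made for the example.

```python
from typing import List, Optional, Tuple

KNOWN_COMMANDS = ("search", "compare")  # the two commands named in the post
TRIGGER_WORD = "over"                   # spoken at the end of a finished command


def parse_command(transcript: str) -> Optional[Tuple[str, List[str]]]:
    """Return (command, argument words) for a complete command, else None."""
    # Steps 1 and 2: lowercase the text and strip surrounding spaces.
    cleaned = transcript.lower().strip()
    # Step 3: split the sentence into words on whitespace.
    words = cleaned.split()
    # Step 4: do nothing until the speaker has finished with the trigger word.
    if not words or words[-1] != TRIGGER_WORD:
        return None
    # Step 5: the first word must be a recognized command.
    command = words[0]
    if command not in KNOWN_COMMANDS:
        return None
    # Everything between the command and the trigger is the argument.
    return command, words[1:-1]
```

For example, `parse_command("Search Boston terrier puppies over")` yields the command `search` with the words `boston`, `terrier` and `puppies`, while the same sentence without “over” returns `None` and is left alone until the speaker finishes.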

Originally I did not include the “over” keyword as part of the system; instead I used a period of microphone inactivity as a cue that I was done speaking.  Unfortunately this did not provide very stable results: machine transcription, even under quiet conditions, is fraught with errors, which led to a lot of inaccurate searches.  “Over” was included as a safety buffer to let me “proofread” my voice command before I asked the program to process it.  It works well for demo purposes, but it is a limitation of the system that I would need to work out if the project were to move forward.

At some point I would love to play around with a web-accessible machine transcription routine, similar to what Didier Brun has accomplished in the video below.  Unfortunately I was pressed for time on this application, and MacSpeech Dictate worked very well given the design requirements for this project.

Voice Gesture from didier.brun on Vimeo.

The Whisper Deck

Overview

The Whisper Deck is a voice-controlled augmented reality data visualization tool that immerses users within a fluid information ecosystem of their own design.  The project is an experimental interface that explores new ways in which we can examine the vast amount of data being generated by the world on a daily basis.

Visualizing google trends via the Whisper Deck

Video

Description

Using an off-the-shelf Vuzix Cam-AR head-mounted display, users can look around their local environment and examine the world through the integrated webcam unit on the front of the display.  When the system notices a pre-defined symbol, a 3D world instantly appears.  As long as this symbol remains in view, this newly created augmented space will persist and the user can examine it from any direction by simply moving around it in real space.

Users can issue requests to the Whisper Deck using a series of voice commands.  These commands cause the world to reconfigure itself based upon the user’s preferences.  For example, if you would like to have the world gather information about a topic you are interested in (say, Boston Terrier puppies), simply say the command “search Boston terrier puppies” with the keyword “over” at the end of your sentence.  The system will go out to the Internet and retrieve information relating to your request, including a spoken definition from Wikipedia as well as a set of images from various publicly accessible image search engines.

Image Search using the Whisper Deck

The Whisper Deck also allows visitors to compare the relative popularity of search terms by interfacing with Google Trends.  Speaking the command “compare” will allow you to name any number of terms, which will be visualized as a 3D bar chart that can be further inspected.

Technology

The Whisper Deck uses a number of different tools.  While most of the technologies described below are web-friendly, the voice-controlled aspect of the system is handled via a desktop speech-to-text package.

  • Flash ActionScript 3
    • FLARToolkit (marker detection)
    • Papervision3D (3D rendering)
  • Web Services
    • Yahoo! Pipes (Flickr, Picasa & Google Images feed aggregation)
    • Perl + Python (Google Trends integration)
  • Voice Recognition / Playback
    • MacSpeech Dictate
    • Perl + integrated Apple OS X text to speech engine

Augmented reality for digital storytelling

Over the past few months I’ve been working to create an authoring environment that helps kids use augmented reality to construct rich 3D spaces in which they can tell stories in a fun, playful way.

Augmented Gnomes!

Augmented Gnomes popping out of a sheet of paper!

While it’s not fully completed, I do have a working version that lets you construct simple scenes.  The current version can do the following:

  • Add 2D planes to a 3D space
  • Texture these planes using transparent PNG files
  • Orient objects and adjust rotation and scale
  • Create dynamic “cut-out” shadows based on the original material
  • Handle timing to allow items to pop out in a specific sequence
  • Save the file to an external server for permanent storage
  • Reconstruct the scene in augmented reality
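
To give a feel for what a saved scene might contain, here is a rough Python sketch of a data structure covering the features listed above.  The real file format is not described in this post, so every class name, field name and the choice of JSON are assumptions made purely for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PopUpPlane:
    """One textured 2D plane placed in the 3D scene (hypothetical layout)."""
    texture: str                                   # path or URL of a transparent PNG
    position: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    rotation: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    scale: float = 1.0
    cast_shadow: bool = True                       # draw a "cut-out" shadow
    delay_seconds: float = 0.0                     # when the plane pops out


@dataclass
class Scene:
    title: str
    planes: List[PopUpPlane] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the scene so it could be sent to a server for storage."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "Scene":
        """Rebuild a scene from text produced by to_json."""
        data = json.loads(text)
        planes = [PopUpPlane(**plane) for plane in data["planes"]]
        return cls(title=data["title"], planes=planes)

    def pop_up_order(self) -> List[PopUpPlane]:
        """Planes sorted by the time at which they should appear."""
        return sorted(self.planes, key=lambda plane: plane.delay_seconds)
```

Keeping the timing as a plain `delay_seconds` value on each plane means the playback side only needs to sort the planes and reveal them one by one.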

Here is a brief video that showcases the pop-up authoring environment as well as a finished product in action.

If you’re interested in seeing portions of this project in action, feel free to stop by the ITP Winter Show this weekend.  I will be showing off a voice-controlled augmented reality project called the “Whisper Deck”, which incorporates my AR pop-up books as one of its many interactive features.