DESCRIBING objects is so much easier when you use your hands, the classic being "the fish was this big".
For humans, it's easy to understand what is meant, but computers struggle, and existing gesture-based interfaces only use set movements
that translate into particular instructions. Now a system called Data
Miming can recognise objects from gestures without the user having to
memorise a "vocabulary" of specific movements.
"Starting from the observation that
humans can effortlessly understand which objects are being described
when hand motions are used, we asked why computers can't do the same
thing," says Christian Holz
of the Hasso Plattner Institute in Potsdam, Germany, who developed the
system with Andy Wilson at Microsoft Research in Redmond, Washington.
Holz observed how volunteers described
objects such as tables or chairs using gestures: they traced important
components repeatedly with their hands and maintained relative
proportions throughout their mime.
Data Miming uses a Microsoft Kinect motion-capture camera to create a 3D representation of a user's hand movements. Voxels,
or pixels in three dimensions, are activated when users pass their
hands through the space represented by each voxel. When a user
curls their fingers into a circle to indicate a table leg, say, the system
also recognises that the enclosed space should be included in the
representation. It then compares user-generated representations with a
database of objects in voxel form and selects the closest match.
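The matching step can be pictured as building an occupancy grid from tracked hand positions and scoring it against stored object grids. The sketch below is a minimal illustration of that idea in Python; the grid resolution, the Jaccard-style similarity score and the helper names (voxelise, best_matches) are assumptions made for illustration, not details of Holz and Wilson's actual implementation, which also fills in enclosed regions such as the inside of a circled table leg.

```python
import numpy as np

GRID = 32  # hypothetical resolution: a 32x32x32 voxel grid over the capture volume


def voxelise(hand_points, bounds=(-1.0, 1.0)):
    """Mark every voxel that a tracked hand point passes through.

    hand_points: iterable of (x, y, z) positions, e.g. from a depth-camera
    hand tracker. Returns a boolean occupancy grid.
    """
    grid = np.zeros((GRID, GRID, GRID), dtype=bool)
    lo, hi = bounds
    for x, y, z in hand_points:
        # Map the world coordinate to a voxel index and clamp it to the grid.
        idx = ((np.array([x, y, z]) - lo) / (hi - lo) * GRID).astype(int)
        idx = np.clip(idx, 0, GRID - 1)
        grid[tuple(idx)] = True
    return grid


def similarity(query, candidate):
    """Jaccard overlap between two occupancy grids (one plausible metric)."""
    inter = np.logical_and(query, candidate).sum()
    union = np.logical_or(query, candidate).sum()
    return inter / union if union else 0.0


def best_matches(query, database, k=3):
    """Rank a database of {name: voxel grid} entries against the mimed query."""
    ranked = sorted(database.items(),
                    key=lambda item: similarity(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

In this toy version, the top-ranked names returned by best_matches would play the role of the system's "closest match" and the top-three list mentioned in the test results below.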
In tests the system correctly
recognised three-quarters of descriptions, and the intended item was in
the top three matches from its database 98 per cent of the time. Holz
presented his findings at the CHI 2011 meeting in Vancouver, Canada, in May.
The system could be incorporated into
online shopping so users could gesture to describe the type of product
they want and have the system make a suggestion. Or, says Holz: "Imagine
you want a funky breakfast-bar stool. Instead of wandering around and
searching Ikea for half an hour, you walk up to an in-store kiosk and
describe the stool using gestures, which takes seconds. The computer
responds immediately, saying you probably want the Funkomatic Breakfast
Stool-o-rama, and it lives in row 7a."