
ACME - Augmented Virtuality or Virtual Augmented Reality

The ACME project, which perhaps unfortunately shares its name with the cartoon company that produces zany inventions, is actually a collaboration between the VTT Technical Research Centre of Finland, IBM Research and Nokia Research. It is a serious attempt at bringing disparate technologies together to form something we have all desired for a long while - an interoperable environment in which physical and virtual mix (almost) seamlessly.

Standing for Augmented Collaboration in Mixed Environments, ACME is precisely that. It allows users in a virtual world to interact with users in multiple physical locations, with all users seeing and interacting with one another. You can even drag a bot, with no physical body, into the group and it's just another person.

The technology is simple and straightforward when you think about it. It is a combination of the separate AR and VR technologies that have been maturing towards this point. In each physical meeting room, the location of everything relevant is scanned: specifically, the location of the user and the location of the meeting-space. The meeting-space is a flat, clear area large enough to use as a shared workspace. It is identified via magic symbol technology, which is standard in AR.

The user's position is identified either via facial recognition and tracking software used in conjunction with a web cam, or via the position and orientation sensors built into a head-mounted display (HMD) unit. The sensor-based alternative is necessary with an HMD because, of course, the headset covers the eyes, which defeats facial recognition.

In this initial set-up situation, you can clearly see the two magic symbols that mark the opposite ends of the meeting-space, and that both participants of this meeting are wearing HMDs each controlled by the laptop close to them. It is an odd set-up for a meeting of two people.
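To make the role of the two magic symbols concrete, here is a minimal sketch of how a pair of detected marker positions could define a local coordinate frame for the meeting-space. It assumes the AR library has already reported each marker's position on the table plane in room coordinates; the names MeetingSpace, meeting_space_from_markers and to_local are hypothetical and are not taken from the ACME code.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class MeetingSpace:
    """Local frame for a meeting-space bounded by two markers (hypothetical model)."""
    origin: Point   # midpoint between the two markers, in room coordinates
    angle: float    # direction of the marker-to-marker line, in radians
    length: float   # distance between the two markers


def meeting_space_from_markers(a: Point, b: Point) -> MeetingSpace:
    """Build the local frame from two marker positions on the table plane."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return MeetingSpace(
        origin=((a[0] + b[0]) / 2, (a[1] + b[1]) / 2),
        angle=math.atan2(dy, dx),
        length=math.hypot(dx, dy),
    )


def to_local(space: MeetingSpace, point: Point) -> Point:
    """Express a room-coordinate point relative to the meeting-space centre."""
    px, py = point[0] - space.origin[0], point[1] - space.origin[1]
    cos_a, sin_a = math.cos(space.angle), math.sin(space.angle)
    # Rotate by -angle so the marker-to-marker line becomes the local x-axis.
    return (px * cos_a + py * sin_a, -px * sin_a + py * cos_a)


space = meeting_space_from_markers((1.0, 1.0), (1.0, 3.0))
print(to_local(space, (1.0, 3.0)))  # the second marker, seen in the local frame
```

In this frame each marker sits half the marker-to-marker distance from the centre along the local x-axis, so positions measured in differently oriented rooms become directly comparable.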

All physical locations are, in addition, fitted out with a motion-capture (mocap) camera system. Which system is used does not actually matter, so long as its data is transmitted. It is used to accurately convey each user's gestures.
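Because the choice of mocap system does not matter so long as the data gets through, one plausible approach is a vendor-neutral message that any capture system can be converted into. The sketch below is an assumption for illustration only: the PoseFrame schema, its field names and the JSON encoding are hypothetical, not the format ACME actually uses.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class PoseFrame:
    """One mocap sample in a vendor-neutral form (hypothetical schema)."""
    user_id: str
    timestamp_ms: int
    joints: Dict[str, List[float]]  # joint name -> [x, y, z] in metres


def encode_frame(frame: PoseFrame) -> str:
    """Serialise a frame to JSON for transmission to the other locations."""
    return json.dumps(asdict(frame), sort_keys=True)


def decode_frame(payload: str) -> PoseFrame:
    """Rebuild a frame from its JSON form on the receiving side."""
    return PoseFrame(**json.loads(payload))


frame = PoseFrame("alice", 1000, {"right_hand": [0.25, 1.0, 0.5]})
received = decode_frame(encode_frame(frame))
print(received.joints["right_hand"])
```

Each capture system would then need only a small adapter that fills in a PoseFrame, leaving the rest of the pipeline untouched.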

With all of this in place, and since all the data is transmitted virtually anyway, it makes sense to feed it into a 3D virtual environment that stores and represents the relative locations. In fact, this is exactly what is done.
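As a rough illustration of what storing relative locations might involve, the sketch below keeps every tracked person and object in one shared coordinate frame, centred on the meeting-space, and answers where one entity is relative to another. The VirtualScene class and its methods are hypothetical stand-ins; in the prototype this role is played by the virtual world itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class VirtualScene:
    """Keeps every tracked entity in one shared coordinate frame (hypothetical)."""
    positions: Dict[str, Vec3] = field(default_factory=dict)
    home: Dict[str, str] = field(default_factory=dict)  # entity -> location name

    def update(self, entity_id: str, location: str, position: Vec3) -> None:
        """Record a position already expressed relative to the meeting-space centre."""
        self.positions[entity_id] = position
        self.home[entity_id] = location

    def offset(self, from_id: str, to_id: str) -> Vec3:
        """Relative location of one entity as seen from another."""
        a, b = self.positions[from_id], self.positions[to_id]
        return (b[0] - a[0], b[1] - a[1], b[2] - a[2])


scene = VirtualScene()
scene.update("alice", "room_a", (0.5, 0.0, 1.0))
scene.update("bob", "room_b", (-0.5, 0.0, 1.0))
print(scene.offset("alice", "bob"))  # (-1.0, 0.0, 0.0)
```

Because both rooms report positions relative to their own meeting-space, two people in different buildings end up seated across the same virtual table.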

The prototype is built on an open source viewer for Linden Lab's Second Life virtual world, together with the open source ARToolKit and OpenCV libraries. The use of open source software was of course critical to lowering the development costs, and it also has the pleasant side-effect of keeping the code open to others.

It is worth mentioning that although Second Life software was used, the company's own viewer, with all its bugs and glitches, was not. In reality, any sufficiently dynamic virtual environment could be used in this manner.

The virtual environment has an avatar for each person present. It also represents the objects (each either physically scanned or a wholly virtual construct) that are to be used during the meeting. The system is set up so that anyone not physically present is depicted by a Second Life-style avatar of their choosing. The avatar appears at the table in every physical location where the person is absent and, thanks to the mocap, mirrors that person's physical movements.

A third person in a remote location has just joined the meeting. By logging in to the virtual environment, they are now present in AR at the physical meeting, and can interact through their avatar with natural body movement.

This of course means that if the other party gets up to leave, the avatar does as well. The system is not perfect - it is twitchy, and has a lag of around 200 ms. Still, it is bearable. Also present are any items that were tagged and placed at any one of the physical locations - or even just placed in the virtual environment. The system replicates them into the virtual environment, and that virtual environment is overlaid on all participating physical environments. This is why it cannot be overemphasised that the designated meeting-space at every physical location must be clean and clear of clutter.
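The overlay rule described here boils down to a simple filter: at any given location, draw everything that is not physically there. A minimal sketch, assuming each person or item is tagged with the name of its home location (the overlays_for function and the example names are hypothetical):

```python
from typing import Dict, List


def overlays_for(location: str, homes: Dict[str, str]) -> List[str]:
    """Entities to draw in AR at `location`: everything not physically there."""
    return sorted(name for name, home in homes.items() if home != location)


# Hypothetical meeting: two rooms, one remote participant, one tagged book.
homes = {
    "alice": "room_a",
    "book": "room_a",
    "bob": "room_b",
    "carol": "virtual",
}
print(overlays_for("room_a", homes))  # ['bob', 'carol']
print(overlays_for("room_b", homes))  # ['alice', 'book', 'carol']
```

It also shows why clutter matters: anything real that sits in the meeting-space without being tracked competes for the same patch of table as the overlaid items.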

Microphones at each physical location pick up and relay sounds, again to all other locations, physical and virtual. The HMDs or web cams track the head movements of the participants, and the mocap cameras track all other movements. Items placed in the meeting-space can be interacted with from all locations. So a person in meeting room A can put down a book, a person in meeting room B can pick it up and move it, and people in meeting rooms A, B and C all see the second person's arm move, pick up the book, move it, and set it back down.
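One way the book example could be kept consistent across rooms A, B and C is for every location to apply the same ordered stream of interaction events to its own copy of the item. The article does not say how ACME actually synchronises objects, so the sketch below is purely an assumption, with a hypothetical SharedItem type and event format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SharedItem:
    """State of one item in the shared meeting-space (hypothetical model)."""
    position: Tuple[float, float]
    held_by: Optional[str] = None


def apply_event(item: SharedItem, event: Dict) -> None:
    """Apply one interaction; events from anyone but the holder are ignored."""
    kind, actor = event["kind"], event["actor"]
    if kind == "pick_up" and item.held_by is None:
        item.held_by = actor
    elif kind == "move" and item.held_by == actor:
        item.position = event["to"]
    elif kind == "put_down" and item.held_by == actor:
        item.held_by = None


def replay(start: Tuple[float, float], events: List[Dict]) -> SharedItem:
    """Rebuild the item state a location would show after seeing all events."""
    item = SharedItem(position=start)
    for event in events:
        apply_event(item, event)
    return item


# The person in room A has put the book down at the starting position.
events = [
    {"kind": "pick_up", "actor": "person_b"},
    {"kind": "move", "actor": "person_b", "to": (0.5, 0.25)},
    {"kind": "put_down", "actor": "person_b"},
]
book = replay((0.0, 0.0), events)
print(book.position, book.held_by)
```

Since every room replays the same events in the same order, all three arrive at the same final position for the book, whichever room the movement originated in.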

Currently, the system is at proof of concept stage. Further development is proceeding, but funding is not yet secured.




Augmented Reality Basics: Magic Symbol


Mixed Reality Video Conference (ACME Homepage) (PDF)

Researchers From IBM, Nokia and VTT Bring Avatars and People Together for Virtual Meetings in Physical Spaces

-VTT Augmented Reality Team
