
Interaction between Real and Virtual Humans in Augmented Reality
Selim Balcisoy and Daniel Thalmann
Computer Graphics Laboratory, Swiss Federal Institute of Technology
EPFL, DI-LIG, CH-1015 Lausanne, Switzerland
{sbalcis, thalmann}@lig.di.epfl.ch
we propose a system to overcome these limitations by using the augmented reality (AR) technology.
We can summarize AR as a combination of distinct technologies spanning from virtual reality to computer vision. By definition, AR enhances the user's view of the real world with visual information from a computer. In our case we also need to enhance the virtual humans' synthetic vision or other sensors with information obtained from the real world.
possible solution is to acquire data from the real world
through a high level interface from a computer vision
system with one or more cameras. This vision system
should obtain essential data from the real world, and
transform it into a machine understandable form. Simply
put, the vision system should sample the real world.
There are several examples of such interfaces, mainly 2D
or 2 1/2D vision systems to track human body or head
motions and gestures [5][11]. Our intention is to
implement an open software system where the high level
interfaces can be used by different input sources such as a
human operator or a fully integrated 3D vision system.
As implementing a fully operating 3D vision system is
an advanced research topic in itself, and as we are not
interested in implementing a 2D vision system with
major limitations, we decided to use a human operator
to obtain information from the real world.
We use a single static camera to obtain a color
image of the real scene with a human actor. A human
operator feeds the interaction system with necessary
input data by using 3D input devices, and a high-end
graphics workstation renders the synthetic world with
virtual actors in correct correspondence to the real world.
This synthetic image is merged with the image from the
real scene. The resulting output images are displayed on
a large video monitor facing the human actor in the real
scene for visual feedback. We developed two separate software tools: one to design the virtual scene and one to calibrate a single video camera.
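To make the overall flow concrete, the following minimal C++ sketch outlines one iteration of this loop: capture a camera frame, apply the operator's input, render the synthetic scene and composite both images for the stage monitor. All functions are hypothetical placeholders; in the real system the compositing is performed by dedicated video hardware rather than in software.

```cpp
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t r = 0, g = 0, b = 0; };
using Frame = std::vector<Pixel>;

// Hypothetical stand-ins for the camera grab, device input, renderer,
// hardware chroma-keyer and stage monitor of the system described above.
Frame grabCameraFrame()                { return Frame(720 * 576); }   // PAL-sized real image
void  applyOperatorInput()             { /* read 3D devices, update virtual actors/objects */ }
Frame renderVirtualScene()             { return Frame(720 * 576); }   // synthetic image
Frame compositeFrames(const Frame& synthetic, const Frame& /*camera*/) {
    return synthetic;                  // placeholder for the hardware chroma-keyer
}
void  showOnStageMonitor(const Frame&) { /* visual feedback for the real actor */ }

int main() {
    for (int i = 0; i < 3; ++i) {      // a few iterations instead of an endless loop
        Frame camera    = grabCameraFrame();
        applyOperatorInput();
        Frame synthetic = renderVirtualScene();
        showOnStageMonitor(compositeFrames(synthetic, camera));
    }
}
```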
Concerning interaction, our central issue is to make virtual humans capable of acting just like a human actor on a theater stage. Human actors follow a text-based script and portray a character on the stage. They have the ability to understand this script and perform the acting with some improvisation. Their virtual colleagues have no real cognition and are therefore unable to read and understand a written text. In our case we let our virtual humans follow a strict
Abstract
Interaction between real and virtual humans covers
a wide range of topics from creation and animation of
virtual actors to computer vision techniques for data
acquisition from the real world. In this paper we discuss the design and implementation of an augmented reality system which allows investigation of different real-virtual interaction aspects. As an example we present
an application to create real-time interactive drama
with real and virtual actors.
1 Introduction
Though virtual human models have been in
existence for the last few years mainly for research
purposes to simulate human movements and behaviors,
only recently, there has been very encouraging interest
from outside the academic world. Virtual humans have
potential applications in entertainment and business
products such as films, computer games, and distributed
virtual worlds; in populating empty 3D worlds, or
representing a remote participant in a distributed virtual
world, or as a TV talk-show host. New applications are
demanding new ways of interaction between real and
virtual humans as we will investigate in this paper.
Until now virtual humans have been ‘living’ in
homogeneous virtual environments. State of the art virtual environments are human-designed worlds with a low level of detail compared to our real world. To achieve higher immersive experiences we need to reduce some basic limitations of the current virtual reality
some basic limitations of the current virtual reality
technology:
Rendering of photo-realistic, detailed and interactive
environments in real-time.
Although the current computer graphics technology
can model, animate and render the human figures with
near photo-realism in real-time [6], we cannot say the
same about rendering of a complex virtual world.
Usage of restrictive human-machine interfaces like gloves, magnetic trackers, and head-mounted displays.
Current human-machine interfaces with their excessive connections hamper interactivity. Another effect is that they limit the usage of virtual reality technology for a wide range of applications. In this paper
organic body to be modeled with SkeletonEditor.
Metaballs are used to approximate the shape of internal structures which have observable effects on the surface
shape. Each metaball is attached to its proximal joint,
defined in the joint local coordinate system of the
underlying skeleton which offers a convenient reference
frame for positioning and editing metaball primitives,
since the relative proportion, orientation, and size of the different body parts are already well-defined.
The designed body shape and face models are
integrated using a real-time body deformations library,
DODYLIB [12], and the IRIS Performer graphics toolkit [8]. DODYLIB is built on the HUMANOID software for
animating and rendering virtual humans in real-time on
SGI workstations. IRIS Performer allows textures on 3D surfaces. By using our model configuration, we can apply a different texture to each body part to simulate simple clothing. With hardware texture mapping, performance is the same as without textures, and applying textures is a good way of enhancing surface
realism.
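The following sketch illustrates, under our assumptions, how a metaball primitive can be stored in the local frame of its proximal joint and evaluated in world space; the field function and all names are illustrative and are not the actual BodyBuilder or DODYLIB interfaces.

```cpp
#include <array>
#include <cstdio>
#include <vector>

using Mat4 = std::array<std::array<float, 4>, 4>;   // joint world transform (row-major)
struct Vec3 { float x, y, z; };

Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return { m[0][0]*p.x + m[0][1]*p.y + m[0][2]*p.z + m[0][3],
             m[1][0]*p.x + m[1][1]*p.y + m[1][2]*p.z + m[1][3],
             m[2][0]*p.x + m[2][1]*p.y + m[2][2]*p.z + m[2][3] };
}

struct Metaball {
    int   joint;        // index of the proximal joint the primitive is attached to
    Vec3  localCenter;  // center expressed in that joint's local coordinate system
    float radius;       // influence radius
};

// Summed field of simple inverse-square contributions; the body surface is
// extracted where this field crosses a chosen iso-value.
float fieldAt(const Vec3& p, const std::vector<Metaball>& balls,
              const std::vector<Mat4>& jointWorld) {
    float f = 0.0f;
    for (const Metaball& b : balls) {
        Vec3 c = transformPoint(jointWorld[b.joint], b.localCenter);
        float dx = p.x - c.x, dy = p.y - c.y, dz = p.z - c.z;
        f += (b.radius * b.radius) / (dx*dx + dy*dy + dz*dz + 1e-6f);
    }
    return f;
}

int main() {
    Mat4 identity{};                                    // zero matrix ...
    for (int i = 0; i < 4; ++i) identity[i][i] = 1.0f;  // ... made into the identity
    std::vector<Mat4>     jointWorld{identity};         // one joint at the origin
    std::vector<Metaball> balls{{0, {0.0f, 0.1f, 0.0f}, 0.05f}};
    std::printf("field at origin: %.3f\n", fieldAt({0.0f, 0.0f, 0.0f}, balls, jointWorld));
}
```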
machine-understandable script in order to behave like
human actors.
We designed and implemented a novel software
architecture to verify all our concepts. This architecture
integrates several existing input devices, such as the Spaceball, and has interfaces for possible future extensions like 3D vision systems. The virtual human creation and animation is based on the existing HUMANOID 2 ESPRIT European project software [2]. A software layer to create task-oriented scripts has been
developed over the HUMANOID Agent structure [3] to
produce virtual actors. Finally, in an example sequence
virtual humans are integrated into a real theater stage as
virtual actors using the augmented reality technology.
2 Creating and animating the virtual
humans
Currently there are several virtual human creation
and animation software sets like: Marilyn from Swiss
Federal Institute of Technology in Lausanne (EPFL) &
University of Geneva (UG), or Jack from University of
Pennsylvania (UPENN). Our virtual human creation and
animation software, Marilyn, was partly developed in
the framework of the HUMANOID Esprit project. One of
the objectives of this European project is to create
virtual humans with deformable body, face and hand.
Another objective is to achieve agent controlled human
figure animation. In this section we briefly present the basic procedure for creating a virtual human using our software tools, and then present some human figure
animation modules which are currently used in our
system.
2.2 Body animation for the virtual humans
The HUMANOID environment supports many facilities
for body animation. In this paper we used a subset of
these, which were adequate for performing basic actions.
We can analyze virtual human motions under two
distinct groups: robotic movements to perform low-level tasks, like locomotion of the body from one point to another, and gestures to express the current state of mind of an actor. We used motion motors to perform low-level tasks. For gestures we used a set of keyframed animation sequences.
Motion Capturing and Predefined postures
A traditional way of animating virtual humans is
playing keyframe sequences. We can record specific
human body postures or gestures with a magnetic
motion capturing system and an anatomical
converter [7], or we can design human postures or gestures using the TRACK system [1].
Motion capturing can be best achieved by using a
large number of sensors to register every degree of
freedom in the real body. Molet et al. [7] discuss that a minimum of 14 sensors is required to obtain a biomechanically correct posture. The raw data coming from the trackers has to be filtered and processed to obtain a usable structure. The software developed at the
Swiss Federal Institute of Technology permits
converting the raw tracker data into joint angle data for
all the 75 joints in the standard HUMANOID skeleton.
TRACK is an interactive tool for the visualization,
editing and manipulation of multiple track sequences.
To create an animation sequence we create key positions of the scene and store the 3D parameters as 2D tracks of the skeleton joints. The stored keyframes, from the TRACK system or the magnetic tracker, can be used to animate the virtual human in real-time. We used predefined postures
2.1 Design and rendering of virtual humans
The creation of a virtual human is performed in two
separate parts: the design of the face and of the body
shape. Later the face and the body shape are integrated
using a real-time deformation software library.
For the face, the operations conducted in traditional sculpting can be performed by the computer on computer-generated objects. Our sculpting software is based on the
Spaceball, a 6D interactive input device. This allows
the user to create a polygon mesh surface. Local
deformations based on an extension of FFD [9] are
applied while the Spaceball device is used to move the
object and examine the progression of the deformation
from different angles. Mouse movements on the screen
are used to produce vertex movements in 3D space from
the current viewpoint. Local deformations make it
possible to produce local elevations or depressions on
the surface and to even out unwanted bumps once the
work is near completion.
For the body shape, we use an interactive metaball editor, BodyBuilder [10], for shape designers. We start the shape design by first creating a skeleton for the
and gestures to perform realistic hand and upper body
gestures for interpersonal communications.
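As an illustration of keyframe playback, the sketch below interpolates stored joint-angle keyframes over time for a 75-joint figure; linear interpolation and all names are simplifying assumptions rather than the actual TRACK implementation.

```cpp
#include <cstdio>
#include <vector>

constexpr int kJoints = 75;                       // standard HUMANOID skeleton size

struct Keyframe {
    float time;                                   // seconds from sequence start
    std::vector<float> angles;                    // one angle per joint (radians)
};

// Linearly interpolate the posture at time t between surrounding keyframes.
std::vector<float> samplePosture(const std::vector<Keyframe>& keys, float t) {
    if (t <= keys.front().time) return keys.front().angles;
    if (t >= keys.back().time)  return keys.back().angles;
    for (std::size_t k = 1; k < keys.size(); ++k) {
        if (t <= keys[k].time) {
            float u = (t - keys[k-1].time) / (keys[k].time - keys[k-1].time);
            std::vector<float> out(kJoints);
            for (int j = 0; j < kJoints; ++j)
                out[j] = (1.0f - u) * keys[k-1].angles[j] + u * keys[k].angles[j];
            return out;
        }
    }
    return keys.back().angles;                    // not reached
}

int main() {
    std::vector<Keyframe> nod = {
        {0.0f, std::vector<float>(kJoints, 0.0f)},
        {0.5f, std::vector<float>(kJoints, 0.2f)},   // placeholder "head down" posture
        {1.0f, std::vector<float>(kJoints, 0.0f)},
    };
    std::printf("joint 0 at t=0.25s: %.3f rad\n", samplePosture(nod, 0.25f)[0]);
}
```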
Motion Motors
We used two motion motors, one for the body
locomotion and the other for the placement of the end effectors, which are the hands and feet in our case. The first one is a walking motor developed by Boulic [2], and the second one is an inverse kinematics motor developed by Emering [3].
The current walking motor enables virtual humans to
travel in the environment using instantaneous velocity
of motion. One can compute walking cycle length and
time from which necessary skeleton joint angles for
animation can be calculated.
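The sketch below illustrates this idea: the instantaneous speed is turned into a cycle length and duration, and the resulting phase is what the motor maps to joint angles. The square-root relation and its constant are assumptions for illustration, not the exact formulation of the walking model in [2].

```cpp
#include <cmath>
#include <cstdio>

struct WalkCycle {
    double length;   // meters covered by one full cycle (two steps)
    double duration; // seconds for one full cycle
};

WalkCycle cycleFromVelocity(double speed /* m/s */, double legLength /* m */) {
    double normalized = speed / legLength;                 // dimensionless speed
    double relLength  = 1.35 * std::sqrt(normalized);      // assumed empirical relation
    WalkCycle c;
    c.length   = relLength * legLength;
    c.duration = c.length / speed;
    return c;
}

int main() {
    double speed = 1.2, leg = 0.9;                          // typical adult values
    WalkCycle c = cycleFromVelocity(speed, leg);
    // The phase (0..1) inside the cycle is what the motor maps to joint angles.
    double t = 0.4;                                         // seconds since cycle start
    double phase = std::fmod(t / c.duration, 1.0);
    std::printf("cycle length %.2f m, duration %.2f s, phase %.2f\n",
                c.length, c.duration, phase);
}
```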
The inverse kinematics motor defines several chains in the body skeleton (left arm, left leg, etc.) for the end effectors, and performs inverse kinematics on these chains. This motion motor also outputs skeleton joint angles.
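As a generic illustration of inverse kinematics on such a chain, the sketch below solves a planar three-link arm with cyclic coordinate descent; the actual motor of [3] operates on the full 3D skeleton, so this is only meant to show how joint angles are produced from an end-effector goal.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec2 { double x, y; };

// Forward kinematics: accumulate angles along the chain.
Vec2 endEffector(const std::vector<double>& angles, const std::vector<double>& lengths) {
    Vec2 p{0.0, 0.0};
    double a = 0.0;
    for (std::size_t i = 0; i < angles.size(); ++i) {
        a += angles[i];
        p.x += lengths[i] * std::cos(a);
        p.y += lengths[i] * std::sin(a);
    }
    return p;
}

void solveCCD(std::vector<double>& angles, const std::vector<double>& lengths,
              Vec2 target, int iterations = 50) {
    for (int it = 0; it < iterations; ++it) {
        for (int j = static_cast<int>(angles.size()) - 1; j >= 0; --j) {
            // Position of joint j (base of the link being rotated).
            Vec2 pj{0.0, 0.0};
            double a = 0.0;
            for (int k = 0; k < j; ++k) {
                a += angles[k];
                pj.x += lengths[k] * std::cos(a);
                pj.y += lengths[k] * std::sin(a);
            }
            Vec2 e = endEffector(angles, lengths);
            double toEnd    = std::atan2(e.y - pj.y, e.x - pj.x);
            double toTarget = std::atan2(target.y - pj.y, target.x - pj.x);
            angles[j] += toTarget - toEnd;   // rotate joint j toward the target
        }
    }
}

int main() {
    std::vector<double> angles{0.3, 0.3, 0.3};     // shoulder, elbow, wrist
    std::vector<double> lengths{0.30, 0.28, 0.10}; // upper arm, forearm, hand (m)
    solveCCD(angles, lengths, {0.4, 0.3});
    Vec2 e = endEffector(angles, lengths);
    std::printf("end effector reaches (%.2f, %.2f)\n", e.x, e.y);
}
```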
We can investigate our augmented reality system under three subtopics. Figure 2 presents the data flow diagram with these three subtopics: data acquisition
from the real world, real-time processing and rendering,
and compositing.
3 Augmented reality system overview
3.1 System configuration
We designed our system using off-the-shelf commercial equipment. The principal hardware platform is a Silicon Graphics Onyx™ Reality Engine2™ (Onyx) graphics workstation with four MIPS R4400 processors, a single graphics pipeline, and a Sirius Video™ (Sirius) real-time video capture device. The Sirius acquires image sequences from a Sony HyperHAD camera on its analog input. The graphics engine of the Onyx generates the virtual scene, and both image sequences are chroma-keyed by the Sirius. The output of the Sirius is distributed to a Digital Betacam, a stage monitor for visual feedback for the real actor, and a user monitor close to the Onyx for the operator. The motion capturing for keyframed sequences is performed with an Ascension Technology Flock of Birds™ magnetic tracker with ten sensors and an extended emitter. Figure 1
shows our hardware configuration.
Figure 2. Data flow diagram
Data acquisition from the real world
A flawless combination of real and virtual worlds is
only possible if the virtual world obtains information
about the current state of the real world in terms of
positions of the real objects and humans, and lighting.
This information needs to be acquired as and when it
changes in time. For static objects and most of the time
for lighting, we just need to obtain the necessary data
only once. For moving objects and for human actors we
need real-time data acquisition from the real world to
update the virtual world.
In our system, information about real objects and lighting is measured on the real scene and entered into the application in the setup phase. Our AR system currently has one real-time input source from the real scene, namely a video camera to sample the real world.
Camera Calibration
A correct correspondence between the real and virtual
worlds can be achieved if the virtual camera
characteristics exactly match their real camera counterparts. We choose a pinhole camera model with 6 extrinsic and 9 intrinsic parameters. The extrinsic
Figure 1. Hardware configuration
3.2 System operation
input from the operator and other users is processed simultaneously to determine the current state of the virtual actors and objects. The operator and other users have access to several different input devices. The data from such input devices is transformed into joint angles to animate the virtual actors or into 3D coordinates to transform virtual objects.
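A minimal sketch of this dispatching step is given below: device events are routed either to the virtual human guidance (which ultimately drives the motion motors) or to object transformation. The event structure and routing are illustrative assumptions.

```cpp
#include <cstdio>

struct DeviceEvent {
    enum Kind { GuideActor, MoveObject } kind;
    double dx, dy, dz;        // translation increment from a Spaceball or keyboard
};

struct VirtualObject { double x = 0, y = 0, z = 0; };

// For the actor we only forward a walking velocity here; the walking motor
// then converts it into skeleton joint angles (see the walking-cycle sketch).
void guideActor(double vx, double vz) { std::printf("walk (%.2f, %.2f)\n", vx, vz); }
void moveObject(VirtualObject& o, const DeviceEvent& e) { o.x += e.dx; o.y += e.dy; o.z += e.dz; }

void dispatch(const DeviceEvent& e, VirtualObject& selected) {
    if (e.kind == DeviceEvent::GuideActor) guideActor(e.dx, e.dz);
    else                                   moveObject(selected, e);
}

int main() {
    VirtualObject cup;
    dispatch({DeviceEvent::GuideActor, 0.1, 0.0, 0.3}, cup);
    dispatch({DeviceEvent::MoveObject, 0.0, 0.05, 0.0}, cup);
    std::printf("cup at (%.2f, %.2f, %.2f)\n", cup.x, cup.y, cup.z);
}
```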
The rendering is implemented on the IRIS Performer
toolkit to achieve the highest possible framerates on
Silicon Graphics workstations. In our application
rendering accounts for 70% of the delay between two
frames. To achieve higher framerates and a higher degree of realism we applied several well-known techniques:
Low polygon count for background objects, and mapping rich textures on them.
Adjustable level of detail for different kinds of applications. In DODYLIB we can select three different levels of detail for virtual humans. We choose a low level of detail for computationally demanding applications to keep the framerate high enough.
Implementing a synthetic “fake” shadow for
the virtual actor by projecting the shape of the body on
the ground.
Usually a single virtual actor with the highest level of detail and a synthetic shadow can be rendered at more than 10 frames per second.
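The fake shadow can be obtained by projecting every vertex of the body along the light direction onto the ground plane and drawing the flattened copy in a dark color. The sketch below shows the projection for one vertex, assuming a directional light and the ground plane y = 0; this is an illustration of the general trick, not our exact implementation.

```cpp
#include <cstdio>

struct Vec3 { double x, y, z; };

// Project p onto the plane y = 0 along the (non-horizontal) light direction.
Vec3 projectToGround(const Vec3& p, const Vec3& lightDir) {
    double t = p.y / lightDir.y;       // how far to travel along the light ray
    return { p.x - t * lightDir.x, 0.0, p.z - t * lightDir.z };
}

int main() {
    Vec3 light{0.3, -1.0, 0.2};        // roughly overhead directional light
    Vec3 shoulder{0.1, 1.5, 0.0};      // a body vertex in meters
    Vec3 s = projectToGround(shoulder, light);
    std::printf("shadow vertex at (%.2f, %.2f, %.2f)\n", s.x, s.y, s.z);
}
```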
Compositing
To mix the video sequences from our graphics engine and the camera, we use a conventional chroma-keying technique with some modifications. In the chroma-keying technique an intermediate color acts as a "keyhole" and determines where one video layer shows through
another. Compositing is performed by the Sirius, which
has a blender and a chroma-key generator. Currently, commercial video production studios use the blue room technique for chroma-keying, which works well for placing human actors in virtual sets. This approach has
some major limitations like:
Setting up a dedicated studio.
Usage of a real background or real objects is limited by the keying color.
We decided to design the virtual scene itself as the blue room. In our case black is the keying color, which allows us to use any room as a studio. On the other hand, the black color is rarely used to render 3D objects and actors, which frees us from any design limitations. Figure 3 shows the virtual scene with a virtual human. Real objects, a table and a chair, are represented in the virtual scene as 3D models in black. Virtual objects which augment the real world are rendered in full color. Figure 4 presents the result of merging the real and virtual scenes. The real objects correctly occlude the virtual objects and the virtual human.
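The per-pixel logic of this modified keying can be sketched as follows: wherever the synthetic frame is (near) black, including the black proxies of the real furniture, the camera image shows through, which also yields the occlusion effect described above. The threshold and frame layout are assumptions; in our system the keying is done by the Sirius hardware.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct Pixel { std::uint8_t r, g, b; };

std::vector<Pixel> keyBlack(const std::vector<Pixel>& synthetic,
                            const std::vector<Pixel>& camera,
                            std::uint8_t threshold = 8) {
    std::vector<Pixel> out(synthetic.size());
    for (std::size_t i = 0; i < synthetic.size(); ++i) {
        const Pixel& s = synthetic[i];
        bool keyed = s.r < threshold && s.g < threshold && s.b < threshold;
        out[i] = keyed ? camera[i] : s;   // real scene shows through the "keyhole"
    }
    return out;
}

int main() {
    std::vector<Pixel> synthetic{{0, 0, 0}, {200, 150, 90}};   // black proxy pixel, actor pixel
    std::vector<Pixel> camera{{90, 80, 70}, {90, 80, 70}};
    std::vector<Pixel> merged = keyBlack(synthetic, camera);
    std::printf("first pixel comes from the %s image\n",
                merged[0].r == 90 ? "camera" : "synthetic");
}
```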
parameters are three for the position and three for the
orientation of the camera. The intrinsic parameters are:
Lens distortion: three for radial and two for tangential distortion.
Camera interior orientation parameters: two principal point coordinates and one principal distance (focal length).
Electronic influence: one for the x and one for the y distortion of the CCD array.
To determine these parameters we used a standard
resection algorithm [4]. The calibration process demands well-known points in the image. A common approach is to place landmarks or a grid on the scene. Instead of landmarks or grids we use furniture, a chair and a table, to obtain well-known points in the real scene. We can summarize the reasons for such an approach in two points:
As our operating area is large, we cannot use the small grids which are commonly used in several applications [11].
Landmarks are inflexible and would reduce the realism on the stage.
As we have a static camera position, we do not need any complex camera tracking algorithms. The system passes the camera parameters to the rendering engine as virtual camera parameters to perform the perspective transformation of the computer generated images.
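To make the calibration step more concrete, the sketch below writes down the quantities involved: a pinhole projection with the six extrinsic parameters and a reduced set of intrinsics, and the reprojection error over the well-known furniture points that the resection algorithm of [4] minimizes. Lens distortion and the CCD terms are omitted, and the solver itself is not shown.

```cpp
#include <array>
#include <cstdio>
#include <vector>

struct Vec3 { double x, y, z; };
struct Vec2 { double u, v; };

struct Camera {
    std::array<std::array<double, 3>, 3> R;  // rotation (3 extrinsic parameters)
    Vec3   t;                                // translation (3 extrinsic parameters)
    double f;                                // principal distance
    Vec2   principal;                        // principal point
};

Vec2 project(const Camera& c, const Vec3& p) {
    // Transform into the camera frame, then apply the pinhole projection.
    double xc = c.R[0][0]*p.x + c.R[0][1]*p.y + c.R[0][2]*p.z + c.t.x;
    double yc = c.R[1][0]*p.x + c.R[1][1]*p.y + c.R[1][2]*p.z + c.t.y;
    double zc = c.R[2][0]*p.x + c.R[2][1]*p.y + c.R[2][2]*p.z + c.t.z;
    return { c.f * xc / zc + c.principal.u, c.f * yc / zc + c.principal.v };
}

// Sum of squared distances between measured image points (corners of the
// table and chair) and the projections of their known 3D positions.
double reprojectionError(const Camera& c, const std::vector<Vec3>& world,
                         const std::vector<Vec2>& image) {
    double e = 0.0;
    for (std::size_t i = 0; i < world.size(); ++i) {
        Vec2 q = project(c, world[i]);
        e += (q.u - image[i].u) * (q.u - image[i].u) +
             (q.v - image[i].v) * (q.v - image[i].v);
    }
    return e;
}

int main() {
    Camera cam;
    cam.R = {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};    // identity rotation
    cam.t = {0.0, 0.0, 3.0};
    cam.f = 800.0;
    cam.principal = {320.0, 240.0};
    std::vector<Vec3> world{{0.5, 0.0, 0.0}};       // a table corner in the real scene
    std::vector<Vec2> image{{453.3, 240.0}};        // its measured pixel position
    std::printf("reprojection error: %.2f\n", reprojectionError(cam, world, image));
}
```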
Until now we discussed off-line operations like
camera calibration. During an interactive session we have human actors and some real objects in motion. The changes of position and orientation should also be registered by the system in real-time. As mentioned before, there are some systems with limited performance using computer vision. We decided to carry on with one or more human operators/users, as in a virtual reality application.
User input
The input from a user is restricted to guiding a
virtual human. He can interact with real or virtual objects through his representation in the virtual world, namely an avatar. In our current system we implemented software interfaces for a remote user to guide a virtual human with the Spaceball and the keyboard. In the near future, with proper hardware and software, a remote user will be able to participate through the Internet with a PC or through POTS (Plain Old Telephone Service) with a touch-tone telephone.
Operator input
An operator uses several 3D input devices to modify the virtual world according to the changes in the real world. The operator can move virtual objects, guide virtual humans, or enter high-level commands for managing a script. The operator uses the same virtual human
guidance tools as a remote user.
Real-time processing and rendering
Real-time processing of the data and rendering of the
virtual scene is implemented on an Onyx. Real-time
animation modules, such as motion guidance, object transformation, and activation of pre-recorded animation sequences. Our application is connected to a file system
which contains scene data about 3D models and virtual
humans. This data is read during the initialization of the
system.
The virtual humans are animated with the AGENTlib library [3], dedicated to the coordination of perception and action and the combination of motion
generators. The AGENTlib defines agent entities
responsible for the integration of a fixed set of perception
senses and a dynamic set of actions. The Core integrates
the DODYLIB with the AGENTlib. The DODYLIB is
responsible for real-time body deformations and
integration of the textures for rendering. Finally the
virtual scene is rendered using the IRIS Performer
toolkit.
Figure 5. Software architecture block diagram
Figure 4. Merged image
4 Software architecture
We defined three different data structures in the Core
of our software application. Figure 6 presents the Core data structures and their internal connections.
In modeling interactions with virtual humans, the key challenge lies in combining several different topics such as artificial intelligence, computer vision and virtual reality. On the other hand, human-machine interfaces should be designed to let human users perform similar tasks with different input devices or media. A possible scripting ability should be considered too, to
create long thematic animations. To integrate all these
requirements and to provide an open system we propose
the following software architecture with several layers,
where the whole animation sequence can either be
scripted or interactively driven.
Figure 5 presents a block diagram of our software architecture. We have two interface modules for two distinct input sources: one interface module for virtual human guidance and another one for acquiring visual data from the real scene. The input from the real world in the form of 3D device signals is processed by dedicated high-level device interfaces. The visual input can be obtained by a computer vision system or by a human operator. This module should update the Core continuously about the current state of the real world. The Core directs the output of the interfaces to specific
Figure 6. Core data structures
The ACTOR sets up the connection to DODYLIB
for body representation and to the AGENTlib for motion
control. The ACTOR data structure contains essential
information about the virtual actor concerning rendering,
interaction capabilities, current position and state. The primary state is IDLE, where the ACTOR is not
managed by any TASK. Depending on the TASK, an ACTOR can have several distinct states. For example, if a user is guiding an ACTOR with a Spaceball, the ACTOR cannot be manipulated by another TASK for body locomotion, but it can perform inverse kinematics for the left hand, which should be handled by some other TASK.
The second one is OBJECT, which manages the
rendering information, current position and state of a
virtual object. Again OBJECTs have similar but more
simplified states. A TASK sets the state of a virtual
object to OCCUPIED to perform picking and to FREE
after performing the task. Some OBJECTs have a permanent STATIC state if they represent a real object.
The last data structure is the TASK, which is more
complicated than the former ones. The TASK data structure contains rules for specific interactions and provides connections between ACTORs, motion generators and user interfaces. A TASK may control several ACTORs and OBJECTs. It manages the interactions between ACTORs and OBJECTs, and allows device drivers to control virtual humans. TASKs may call other TASKs sequentially or concurrently. We use the ACTION structure from the AGENTlib to manage concurrent and sequential TASKs [3]. We defined several low-level tasks for user interfaces and motion generators.
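A condensed sketch of these three data structures and their states is given below; the field names and any states beyond those mentioned in the text are illustrative assumptions.

```cpp
#include <cstdio>
#include <string>
#include <vector>

struct Actor {
    enum class State { Idle, Managed };     // IDLE until a TASK takes control
    std::string name;
    State state = State::Idle;
    // plus rendering data, interaction capabilities, current position ...
};

struct Object {
    enum class State { Free, Occupied, Static };  // STATIC marks real-object proxies
    std::string name;
    State state = State::Free;
};

struct Task {
    std::string name;                       // e.g. "REACH", "SIT", "DRINK"
    std::vector<Actor*>  actors;            // ACTORs this TASK controls
    std::vector<Object*> objects;           // OBJECTs it may occupy
    std::vector<Task*>   subtasks;          // sequential or concurrent children

    bool acquire(Object& o) {               // claim an object before picking it up
        if (o.state != Object::State::Free) return false;
        o.state = Object::State::Occupied;
        objects.push_back(&o);
        return true;
    }
    void release(Object& o) { o.state = Object::State::Free; }
};

int main() {
    Actor  actor{"virtual actor"};
    Object cup{"cup"};
    Object table{"table", Object::State::Static};   // proxy of a real object
    Task   drink{"DRINK"};
    drink.actors.push_back(&actor);
    std::printf("cup acquired: %d, real table acquired: %d\n",
                drink.acquire(cup), drink.acquire(table));
}
```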
As an example we implemented a high-level task for drinking a cup of tea while sitting on a chair, which we
term DRINK. This task requires connections to an
ACTOR and to several OBJECTs: chair, table and cup.
If the DRINK task is activated, it checks whether the virtual human is sitting on the chair, and whether this chair is close to the cup. For simplicity, we assume that the cup is on the table close to the chair. The actor activates a low-level task, REACH, to reach this chair. This low-level task controls the walking motor to reach a given point with a given orientation. The REACH task checks continuously whether the actor has reached the chair. After reaching the chair successfully,
another low-level task, SIT, is activated to let the virtual human sit on this chair. This is a keyframe player to animate the ACTOR from a standing posture to a sitting posture realistically. After finishing the sitting, the DRINK task checks the current position of the cup, and one of the hands of the virtual human may move according to the position of the chair. The DRINK task activates the inverse kinematics motion generator to reach the cup with a certain orientation. Afterwards the
Figure 7. A high level TASK example: DRINK
Figure 8. Virtual actor performing DRINK
At the highest level we have the animation loop,
where we implement a script-like animation flow. The scripting is implemented in two ways: state-driven and time-driven. We can consider the animation loop as a
metatask, where the goal is to perform a script. Creation
of several detailed tasks enables us to define complex
behaviors with multiple parameters. This complexity is
important in performing realistic behaviors.
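The sequential management of such a high level TASK can be sketched as a small state machine that activates one low level task at a time and advances when its completion condition holds, as illustrated below with placeholder conditions.

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

struct Step {
    std::string lowLevelTask;            // e.g. "REACH", "SIT", "IK to cup"
    std::function<bool()> completed;     // checked every animation frame
};

class SequentialTask {
public:
    explicit SequentialTask(std::vector<Step> steps) : steps_(std::move(steps)) {}
    bool finished() const { return current_ >= steps_.size(); }
    void update() {                      // called once per animation frame
        if (finished()) return;
        if (steps_[current_].completed()) {
            std::printf("%s done\n", steps_[current_].lowLevelTask.c_str());
            ++current_;
        }
    }
private:
    std::vector<Step> steps_;
    std::size_t current_ = 0;
};

int main() {
    int frame = 0;
    SequentialTask drink({
        {"REACH chair", [&] { return frame > 2; }},   // walking motor reached the chair
        {"SIT",         [&] { return frame > 4; }},   // keyframe player finished
        {"IK to cup",   [&] { return frame > 6; }},   // hand reached the cup
        {"drink and put the cup back", [&] { return frame > 8; }},
    });
    while (!drink.finished()) { drink.update(); ++frame; }
}
```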
5 Interaction
virtual human holds the cup, the head is set as the new
By interaction, we mean triggering some meaningful actions in a virtual human in response to body gestures, social gestures or verbal output from a real human. As our AR system does not acquire audio sequences from the real world, verbal communication remains out of the scope
of this paper.
The interaction between real and virtual actors has
several problems to be solved. One key issue is the human-machine interface. In our AR application we let
target for the inverse kinematics module, and the cup is put back in its former position. Figure 7 shows the sequential task management of the DRINK task. Figure 8 is a snapshot of the performance of the DRINK task.
the virtual humans share the same stage with the real
humans. In this case the human actors cannot use any 3D input devices or magnetic sensors, which would reduce the realism of an application. One possible solution is using computer vision [5]. We chose to let a human operator manage the interactions between the real and virtual actors according to a scenario. We can modify our system and let participants use input devices for different kinds of applications.
In this section we will investigate interactions between real humans and virtual objects or humans in general terms, with reference to the current state and possible future improvements of our system.
postures for virtual humans. To ease the transition
between several postures we propose to create whole
gestures. We define a gesture as a combination of
postures to express a specific state of mind, like
disagreement. Gestures can be created for each virtual
human separately to give them a unique character. On
the other hand we can create a repertoire of gestures for
long interactions like a discussion, and choose any
gesture randomly to animate the virtual human. The
management of gestures can be done by autonomous
agents [3] or a human operator. To test the acceptance of
such an interaction technique with a virtual human we
prepared a simple application based on our augmented
reality system. We created a set of gestures using the
TRACK system. The gestures contained several basic
arm and head movements which we can observe during a normal conversation, such as shaking the head. We entered this set into our software environment as keyframe sequences. An operator uses a keyboard to trigger the appropriate ones. Our test environment is a real
table, two real chairs, one for a real human and the other
for a virtual human, and a video monitor facing the real
human. As a participant sits on his chair a virtual
human appears on the monitor sitting on the other chair.
According to the actions of the participant an operator
triggers some recorded keyframe sequences. In the
beginning, the test persons had difficulties in watching the video monitor all the time, and we had to adjust the placement of the monitor several times. Regarding the realism of the interaction we got positive responses. The test persons enjoyed this new kind of experience. We can list some
of the interesting results:
The video monitor, if it is well placed and large
enough, is not a major disturbance factor. If the head
height of the virtual human and the position of the video
monitor are close to each other, test persons experience a feeling similar to looking at a real person. We observed that orientation in a synthetic world is much easier with such a setup. In many VR applications the user looks into a virtual world from a first-person or third-person perspective. In our case he can see the whole mixed world, including himself.
Although we had a limited set of recorded gestures,
the test persons were not frustrated. The operator sometimes selected wrong gestures on purpose, or interrupted an ongoing gesture and started another one immediately to surprise the test persons. Such semi-random actions added realism to the whole experience.
Based on this result, we will add some improvisation possibilities for the autonomous virtual humans during interactions in future applications.
Figures 9 and 10 are snapshots from a live demonstration. According to a script, the real human and the
virtual human hold a discussion in a bar. Finally they
cannot argue anymore and the virtual human leaves the
synthetic stage.
5.1 Object manipulation
In our application virtual humans are able to interact
with virtual objects or with the virtual representations of
real objects. Considering the trivial fact that a virtual
human cannot move or deform a real object, interactions
between real objects and virtual humans have some
limitations. If real objects are static ones like a table or a
chair, a virtual human can perform tasks like sitting on a
real chair or putting a virtual cup on a real table. Virtual
humans use representations of real objects in the virtual world to interact with them. With well-known coordinates and a 3D model of a static object, a virtual human can go close to this object and interact with it. It is quite difficult even for a human to determine the exact coordinates of a moving object. Virtual humans have even more difficulty in handling real objects in motion. To
interact with moving real objects like a flying ball, we
definitely need a 3D computer vision system, capable of
tracking moving objects in 3D.
Real humans have limitations similar to those of their virtual counterparts. They can use haptic VR devices to touch a virtual object, or get immersive 3D visual feedback from a head-mounted display. But such techniques are not useful for the potential applications we are considering in this paper. A possible solution is to use computer vision techniques to register the hand movements of a participant, and then deform or translate
virtual objects according to these movements. The
interaction between real humans and virtual objects is
currently performed by an operator, who tracks the
movements of the real human and translates desired
virtual objects.
5.2 Real human-virtual human interaction
In principle interactions between real and virtual
humans should be similar to interactions between real
humans. In this paper we are interested in nonverbal
interactions between real and virtual humans. Nonverbal
interactions are concerned with body postures and their effects on other people's feelings. We should provide
realistic body postures, and transitions between these
European Project VISTA and the Swiss National Foundation for Scientific Research.
References
[1] Boulic R., Huang Z., Magnenat Thalmann N., and
Thalmann D., “Goal Oriented Design and Correction
of Articulated Figure Motion with the TRACK system”,
Computers and Graphics, Pergamon Press, Vol. 18, No
4, 1994, pp. 443-452
[2] Boulic R., Capin T., Huang Z., Kalra P.,
Lintermann B., Magnenat Thalmann N., Moccozet L.,
Molet T., Pandzic I., Saar K., Schmitt A., Shen J.,
Thalmann D., “The Humanoid Environment for
Interactive Animation of Multiple Deformable Human
Characters”, Proceedings of Eurographics’95, 1995
[3] Boulic R., Becheiraz P., and Emering L.,
“Heterogeneous Actions Integration for Autonomous
Virtual Human and Avatar Animation with the
AGENTlib framework”, D. Thalmann & N. Magnenat
Thalmann Eds., to appear in 1997
[4] Gruen A., "Digital close range photogrammetry: development of methodology and systems", Chapter 4 of the book "Close Range Photogrammetry and Machine Vision", Editor K.B. Atkinson, Whittles Publishing, 1996
[5] Maes P., Darrel T., Blumberg B., and Pentland A.
“The ALIVE System: Full-body Interaction with
Autonomous Agents”, Proc. of the Computer
Animation’95 Conference Geneva, IEEE Press, April
1995
[6] Magnenat Thalmann N.& Thalmann D., “Digital
Actors for Interactive Television”, Proc. of the IEEE,
Vol. 83, No. 7, July 1995
[7] Molet T., Boulic R., and Thalmann D., “ A Real
Time Anatomical Converter For Human Motion
Capture”, Eurographics workshop on Computer
Animation and Simulation’96, R. Boulic & G. Hegron
(Eds.), pp. 79-94, ISBN 3-211-82885-0, Springer-Verlag Wien
[8] Rohlf J., Helman J., "IRIS Performer: A High Performance Multiprocessing Toolkit for Real-Time 3D
Graphics”, Proc. SIGGRAPH’94, ACM Press, 1994
[9] Sederberg T.W. & Parry S.R., "Free-Form Deformation of Solid Geometric Models", Proc.
SIGGRAPH’86, ACM Press, pp. 151-160
[10] Shen J., "Human Body Modelling and
Deformations”, PhD Thesis, LIG-EPFL, 1996
[11] State A., Hirota G., Chen D.T., Garrett W.F., and Livingston M.A., "Superior Augmented Reality Registration by Integrating Landmark Tracking and Magnetic Tracking", Proc. of SIGGRAPH'96, New
Orleans, ACM Press, 1996
[12] Thalmann D., Shen J., Chauvineau E., “Fast
Realistic Human Body Deformations for Animation and
VR Applications”, Proc. Computer Graphics
International’96, Pohang, Korea, 1996
Figure 9. Interaction 1
Figure 10. Interaction 2
6 Conclusions
In this paper we described an AR system which is capable of letting virtual humans perform script-driven actions. We described a system that addresses the problems of integrating many distinct technologies, from VR to computer vision. We presented current problems and possible solutions for interaction between real and virtual humans. Finally, we proposed a software architecture to master these problems. In the near future we will concentrate our efforts on developing a 3D computer vision system to acquire data from the real world, and on performing interactions between real and autonomous virtual humans.
7 Acknowledgments
The authors would like to thank Patrick Keller and
Mireille Clavien for 3D models and keyframed
animation sequences, and Shrikanth Bandi for
proofreading. This research was supported by the