JugEm: Software to help the world learn how to juggle
David McBrierty
School of Computing Science
Sir Alwyn Williams Building
University of Glasgow
G12 8QQ
Level 4 Project — January 31, 2013
Abstract
We show how to produce a level 4 project report using latex and pdflatex using the style file l4proj.cls
Education Use Consent
I hereby give my permission for this project to be shown to other University of Glasgow students and to be
distributed in an electronic format. Please note that you are under no obligation to sign this declaration, but
doing so would help future students.
Name:
Signature:
Contents

1 Introduction
2 Background Research
  2.1 Literature Survey
    2.1.1 What is Juggling
    2.1.2 Science & Juggling: What goes around, comes around...
    2.1.3 Juggling Notation
  2.2 Technology Survey
    2.2.1 Mobile Devices
    2.2.2 Wii Controllers
    2.2.3 Kinect Application
  2.3 Existing Software Survey
    2.3.1 Learn to Juggle Clubs
    2.3.2 Wildcat Jugglers Tutorials
    2.3.3 Juggle Droid Lite
    2.3.4 Software Survey Conclusion
3 Requirements
  3.1 Functional Requirements
  3.2 Non Functional Requirements
4 Design
  4.1 Library Support
    4.1.1 Kinect Software Development Kit
    4.1.2 EmguCV Image Processing
    4.1.3 Microsoft Speech Platform 11
    4.1.4 XNA Framework
  4.2 Game Screen Management
    4.2.1 Screens
    4.2.2 Screen Management
  4.3 Events
    4.3.1 Detecting Events
    4.3.2 Event Hierarchy
    4.3.3 Event Delegation
5 Implementation
  5.1 Kinect Data Input
    5.1.1 Depth Data
    5.1.2 Color Data
  5.2 Depth Image Processing
  5.3 Ball Processing
    5.3.1 Ball Detection
    5.3.2 Color Detection
  5.4 Event Detection
    5.4.1 Frames and Frame Processing
  5.5 Processing Events and Pattern Matching
6 Evaluation
7 Conclusion
Appendices
A Running the Programs
B Generating Random Graphs
Chapter 1
Introduction
The first page, abstract and table of contents are numbered using Roman numerals. From now on pages are
numbered using Arabic numerals. Therefore, immediately after the first call to \chapter we need the call
\pagenumbering{arabic} and this should be called once only in the document.
The first Chapter should then be on page 1. You are allowed 50 pages for a 30 credit project and 35 pages
for a 20 credit report. This includes everything up to but excluding the appendices and bibliography, i.e. this is a
limit on the body of the report.
You are not allowed to alter text size (it is currently 11pt) neither are you allowed to alter the margins.
Note that in this example, and some of the others, you need to execute the following commands the first time
you process the files. Multiple calls to pdflatex are required to resolve references to labels and citations. The file
bib.bib is the bibliography file.
Chapter 2
Background Research
This chapter discusses the background research carried out prior to the system's design and implementation.
The background research took the form of three surveys, each of which is discussed in this chapter.
2.1 Literature Survey
The Literature Survey described in this section looks into the history of juggling and the scientific principles
behind it.
2.1.1 What is Juggling
Juggling is described as the act of continuously tossing into the air and catching a number of objects so as
to keep at least one in the air while handling the others. Juggling has been a pastime for many centuries; the
earliest known pictorial evidence of juggling (shown in Figure 2.1) was found in an ancient Egyptian burial site
in use during 1994-1781 B.C. [8].
Figure 2.1: Ancient Juggling Hieroglyphic
Whilst the art of juggling might have been performed for different reasons many years ago, there is little
doubt that the people depicted in the image are juggling. Juggling was also part of the first ever modern day
circus in the late 18th century [15]. There are many different varieties of juggling around today: jugglers can
use many different patterns (such as the shower or the cascade), a variety of objects (such as clubs, rings or
balls), and can work as part of a team (instead of on their own). There is even a form of juggling whereby the
ball remains in contact with the body at all times (known as Contact Juggling [22]). Over the years, juggling
has been used in proverbs [4], early 20th century psychological research [25] and even in robotics [2]. At its
roots, juggling is both a highly skillful and entertaining pastime, requiring immense hand-eye coordination to master.
2.1.2 Science & Juggling: What goes around, comes around...
Whilst entertaining, the scientific aspects of juggling have also been researched. The early mathematician
Claude E. Shannon (regarded by many as the father of modern day Information Theory [1]) was also a keen
juggler. Shannon published two papers on juggling [23]. One, titled Claude Shannon's No-Drop Juggling
Diorama, discusses a toy machine given to Shannon depicting clowns juggling rings, balls and clubs
(shown in Figure 2.2).
Figure 2.2: Claude Shannon's No-Drop Juggling Diorama
The other paper, entitled Scientific Aspects of Juggling, gives a brief history of juggling and introduces
Shannon's Theorem for the Uniform Juggle. A Uniform Juggle is a juggling pattern with the following
properties:
1. There is never more than one ball in one hand at any time (i.e. no multiplexing)
2. The flight times of all balls are equal
3. The time each hand has a ball in it is equal
4. The time each hand is empty is equal
These properties may at first seem quite restrictive, but they are very common amongst jugglers. Many
juggling patterns follow these principles (including the three, five and seven ball cascade patterns), and the
principles can be adapted to include patterns that involve juggling under the leg, from behind the back or even
over the head. Shannon's theorem is as follows:
\[
\frac{F + D}{B} = \frac{V + D}{H}
\]
Parameter | Name        | Explanation
F         | Flight Time | Time a ball spends in flight
D         | Dwell Time  | Time a hand spends holding a ball
V         | Vacant Time | Time a hand spends being empty
B         | Balls       | Number of balls
H         | Hands       | Number of hands*

*Shannon's theorem works when the number of hands is greater than 2
This theorem seems simple: in a perfect juggle, the flight time of a ball (F) and the vacant time of a hand (V)
should be equal, and the dwell time then makes up the rest of the time taken for a ball to complete a cycle.
Moreover, if a juggler holds a ball in one hand for longer (D) than expected, then they will have to start throwing
the balls quicker (by altering the flight time of the ball or the time a hand spends empty) in order to maintain a
uniform juggle.
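As a small worked example (using the 3 Ball Cascade timings described in Section 2.1.3, where each ball dwells in a hand for one beat and is in flight for two beats, and taking each hand to be empty for one beat):

\[
\frac{F + D}{B} = \frac{2 + 1}{3} = 1, \qquad \frac{V + D}{H} = \frac{1 + 1}{2} = 1,
\]

so both sides agree and the 3 Ball Cascade satisfies Shannon's theorem.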
2.1.3 Juggling Notation
Juggling also has various forms of notation used to describe patterns. There are several flavors of diagram-based
notation, and a popular numerical notation.
Ladder Diagrams
Ladder Diagrams are 2-Dimensional diagrams in which the vertical dimension represents time: the top of the
diagram (or ladder) is time 0 and time increases downwards. Ladder Diagrams have the benefit of showing
time-related information about a pattern; however, they are not particularly good for visualizing the actions
involved in juggling that pattern.
Figure 2.3 shows the Ladder Diagram for the 3 Ball Cascade pattern. Each ’rung’ on the ladder represents a
beat of time. Throws are represented in red and catches in green. With respect to the hands, the diagram shows
that each hand catches and throws alternately. With respect to the ball the diagram shows each ball remaining in
the hand it is caught by for one beat of time (dwell time) and being in the air for two beats of time (flight time).
Figure 2.3: Ladder Diagram of 3 Ball Cascade
Causal Diagrams
Causal Diagrams are another 2-Dimensional representation of juggling patterns. In contrast to Ladder Diagrams,
they show the events that lead to having to throw a ball. This gives the diagram a simpler form than a Ladder
Diagram (which can make the pattern easier to visualize).
Figure 2.4 shows the Causal Diagram for the 3 Ball Cascade pattern. The diagram shows that when a throw
is made from the left hand, the next action needed is a throw from the right hand (and so forth). The arrows in
the diagram show which throws cause other throws.
Figure 2.4: Causal Diagram of 3 Ball Cascade
Siteswap Notation
Siteswap notation is a numerical representation of a juggling pattern. Who invented Siteswap notation is still
disputed, as three very similar notations were invented independently in the early 1980s: the first by Paul
Klimek in 1981 [13], the second by Bruce Tiemann in 1985 [7] and the third by Mike Day, Colin Wright and
Adam Chalcraft, again in 1985 [11]. Over the years there have been various adaptations of Siteswap so that it
can represent further forms of juggling, including multiplex patterns (having more than one ball in a hand at
one time), patterns using more than two hands, and passing patterns.
Given the inherent mathematical complexity and variety involved in Siteswap notation, this section will not
fully describe how it works. For the remainder of this report, it is enough to know that each number in a
Siteswap pattern relates to how long the ball is in the air.
The Siteswap notation for a 3 Ball Cascade is 3333333... (often shortened to just 3). The Siteswap notation
for the 3 Ball Shower pattern is 51 (in this pattern one ball is thrown high and another is passed quickly below
it).
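One well-known consequence of this interpretation is that, for a simple (vanilla) Siteswap, the average of the throw numbers equals the number of balls. The small hypothetical C# helper below (not part of JugEm) uses that property as a sanity check on a pattern string:

using System;
using System.Linq;

static class SiteswapHelper
{
    // Hypothetical helper: returns the number of balls implied by a vanilla
    // Siteswap string, using the property that the average throw value equals
    // the ball count. Only single-digit throws are handled in this sketch.
    public static int BallCount(string pattern)
    {
        if (string.IsNullOrEmpty(pattern))
            throw new ArgumentException("Pattern must not be empty", "pattern");

        int[] throwValues = pattern.Select(c => (int)char.GetNumericValue(c)).ToArray();
        int sum = throwValues.Sum();

        if (sum % throwValues.Length != 0)
            throw new ArgumentException("Not a valid vanilla siteswap", "pattern");

        return sum / throwValues.Length;
    }
}

// Example: BallCount("3") == 3 (the cascade), BallCount("51") == 3 (the shower).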
2.2 Technology Survey
2.2.1 Mobile Devices
The software could also be written for a mobile device (mobile phone, tablet, etc.). Many modern mobile
devices are equipped with various types of sensors that can be used to track the device's position and motion.
Table 2.2.1 outlines the sensors available on the candidate platforms, including the two most common mobile
device types available today (Android and iOS).
Device              | Platform      | Sensors                                                                         | Cost   | API
Mobile Device       | Android       | Accelerometer, Camera, Light Sensor, Gyroscope, Gravity*, Proximity*, Rotation* | Medium | Open
Mobile Device       | iOS           | Accelerometer, Camera, Light Sensor, Gyroscope*                                 | High   | Restricted
Games Console       | Nintendo Wii  | Accelerometer, PixArt Optical Sensor, Infrared light sensor                     | Medium | Open Source
Desktop Application | Kinect Sensor | Camera, Infrared Depth Sensor, Microphone Array                                 | Medium | Open

* only available on certain models of the device

Mobile Device Sensors
Both Android and iOS mobile devices contain sensors that could be used to track a user's hand movements (or
perhaps even to use the device itself as a ball that is being juggled). Android devices tend to have a wider variety
of sensors on each device [10]; however, both device types share the same common sensors.
One of the main concerns separating the two platforms is the API and cost. The Android Software Development
Kit (more commonly called an SDK) is based on the Java platform, and so runs on both Unix and Windows
machines, whereas the iOS SDK is limited to running only on Mac platforms and uses the Objective-C
programming language. These restrictions on the iOS development platform (coupled with the particularly high
cost of such devices) meant that this platform was not considered for this system.
Given the cost of devices and the good availability of the Android SDK, two potential approaches were
considered for developing the tutorial software on a mobile device.
Juggling the actual device(s)
Data could be gathered from a device's accelerometer on the x, y and z axes. This data could then be used
to determine an approximate trajectory that a device has taken when thrown; however, this method has various
impracticalities. Firstly, expecting a user to juggle multiple mobile devices at the same time will undoubtedly
lead to the devices being dropped or damaged. Secondly (assuming the devices were unbreakable and in abundant
supply), the devices would need to communicate somehow (either with each other or all with another device).
Finally, if only a single device were used, the software would be unable to distinguish between juggling one
item and juggling one hundred items.
Strapping devices to a user’s hand(s)
Another way the mobile devices could be used is by strapping them to the user's hands. When juggling, a
user should throw from the elbow and keep their hands level, so a user's juggling could be tracked using their
hand positions. This could be achieved by strapping a mobile device to each of the user's hands in order to
record their positions. Ideally, the user would need two devices (one for each hand), which again presents the
problem of the devices communicating with each other in some manner; however, one device could be used to
track the motion of one hand. Regardless of the number of hands that are tracked, this method does not allow
for gathering data on the trajectories of the items being juggled, or even how many items are being juggled.
One of the primary issues with using accelerometer data to work out positions is that it is highly inaccurate;
David Sachs gives an interesting talk discussing this issue [6]. If either of the methods above were used,
retrieving the position of the device in order to track the user's hands would certainly be desired. To get
position data from a mobile device, the accelerometer readings must be integrated twice (following basic
mathematical principles). The drift associated with a single integration is already a problem, and the drift in the
result of a double integration is much worse. For this reason, working with mobile devices was ruled out, as the
results and mathematics behind such a system can prove unpredictable.
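To make the drift problem concrete, the hypothetical sketch below (not part of JugEm) naively integrates accelerometer samples twice; even a small constant bias quickly produces a large position error:

using System;

// Minimal sketch (not JugEm code): naive double integration of accelerometer
// samples along one axis. A constant sensor bias produces an error that grows
// linearly in velocity and quadratically in position, which is why
// accelerometer-only position tracking drifts so badly.
class DeadReckoning1D
{
    public double Velocity { get; private set; }
    public double Position { get; private set; }

    // accel: measured acceleration in m/s^2 (including any bias); dt: seconds
    public void AddSample(double accel, double dt)
    {
        Velocity += accel * dt;    // first integration: acceleration -> velocity
        Position += Velocity * dt; // second integration: velocity -> position
    }

    static void Main()
    {
        var tracker = new DeadReckoning1D();

        // 10 seconds of "stationary" data at 100 Hz with a 0.05 m/s^2 bias:
        for (int i = 0; i < 1000; i++)
            tracker.AddSample(0.05, 0.01);

        // The bias alone accounts for roughly 2.5 m of position error.
        Console.WriteLine("Position error after 10 s: {0:F2} m", tracker.Position);
    }
}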
2.2.2 Wii Controllers
Previous projects have also attempted to make juggling systems using Wii Controllers. Wii Remotes contain an
accelerometer, a PixArt optical sensor and an infrared light sensor. These sensors can be combined to track a
user's hands whilst they are juggling.
Nunez et al. have previously devised a system that does this very task [14]. They use the Wii Remotes to track
the user's hand positions and movements and use this data to juggle balls in a 3D virtual environment. They
also use the Wii Remotes' rumble packs to provide feedback to the user. In their paper they highlight that the
IR camera on the controller is particularly good at tracking multiple objects, and that a similar system using the
Kinect IR camera would also work. However, their project focuses on being lightweight and cost effective, and
given the higher price of the Kinect Sensor it is clear that their project benefited from using the Wii Controller.
WiiSticks [27] is another example of a project that tracks user’s hands whilst juggling a single baton (or stick).
2.2.3 Kinect Application
The Kinect Sensor is a Microsoft device, released in 2010 for the XBox 360 games console, that allows users
to interact with and control their games console using gestures and spoken commands.
Figure 2.5: Microsoft Kinect Sensor
The Kinect Sensor contains 3 sensors that allow it to track motion: an RGB camera that captures color images
(at a 1280x960 pixel resolution), an Infrared Depth Sensor that measures the distance between objects and the
sensor, and a 24-bit microphone array (containing 4 microphones) that allows digital audio capture [17].
One of the primary obstacles when using the Kinect is that the Sensor and SDK do not include libraries for
tracking objects. Out of the box, the Kinect Sensor can only track skeletal bodies (such as a human); however,
it is capable of tracking objects if a third-party library is created or reused. Various attempts have been made by
third parties on this very subject, and most involve combining the depth and color inputs and performing varying
amounts of image processing to filter out noise and find objects. Once these objects are found, various methods
can be used to track them or calculate their trajectories.
2.3 Existing Software Survey
This section discusses existing software systems that aim, in one form or another, to help people learn how to
juggle. During this research it became very apparent that there are very few interactive software systems for
teaching people to juggle; most take the form of animations or other ways of teaching some aspect of juggling.
Some of the better software systems are discussed in this section.
2.3.1 Learn to Juggle Clubs
Learn to Juggle Clubs is an Android application that provides the user with video tutorials on how to juggle with
clubs [9]. The application provides a total of four videos (from juggling a single club to juggling behind the
back and under the leg), which describe, in rather brief detail, how a user should juggle clubs. Even though
the application focuses mainly on clubs, some of the points it raises would be valid for a user learning to juggle
with balls (or other objects).
The application itself seems to be just a web page that shows the videos and allows the user to view them.
This method of teaching is not particularly interactive, but such an application could certainly be used to learn
some of the principles of club juggling.
2.3.2 Wildcat Jugglers Tutorials
The Wildcat Jugglers group provides a far more extensive set of video-based tutorials [3] for people wishing to
learn juggling patterns. Their website contains 74 tutorials, each one for a different trick or pattern. The tutorials
come in the form of web-based videos, with accompanying text to talk the user through how the pattern should
be performed.
The website also makes reference to another web-based set of tutorials from the Tunbridge Wells Juggling Club
[26], which contains even more patterns; however, the Tunbridge Wells club's website only contains text-based
descriptions of how to perform patterns. Together, the Wildcat videos and Tunbridge Wells descriptions could
prove a useful reference for beginner (and even more advanced) jugglers.
2.3.3 Juggle Droid Lite
Juggle Droid Lite is an Android application for visualizing Siteswap notation [24]. It allows users to input
different Siteswap patterns and shows an animation of a stick person juggling balls in that pattern. The
application also contains a list of over 200 existing Siteswap patterns, allowing users to see the different parts
of a pattern being animated.
Whilst beginners might not fully understand Siteswap notation, the patterns that come with the application are
all sorted and given explanatory names, allowing beginners to see the patterns in action. The application also
gives users control over the speed of the juggle by tweaking the beats per minute and the dwell time associated
with the patterns.
2.3.4 Software Survey Conclusion
The software discussed in this section is not exhaustive. Various other tutorial websites are available on the
Internet; however, they are mostly either video or text descriptions of how to juggle. Siteswap notation visualizers
are also very common among juggling-related software; whilst they might be slightly more complicated than a
first-time juggler is looking for, they could no doubt prove useful.
Whilst these sites and applications provide a good number of tips and tricks for beginning jugglers, they lack
interactivity with the user. Watching videos and reading descriptions of what to do is all well and good, but
juggling is a skill that is most definitely learned through participation. The JugEm system attempts to bridge
this gap by providing the user with a more immersive way of learning how to juggle.
Chapter 3
Requirements
The Requirements chapter outlines the required functionality and behavior of the system (functional requirements),
as well as the constraints and qualities that the system should possess by nature (non-functional requirements).
The requirements for the system were gathered through meetings and discussions with the client about the
functionality and behavior that would be useful for users wishing to learn how to juggle.
3.1 Functional Requirements
When capturing the functional requirements, the MoSCoW method was used. The MoSCoW rules are a typical
scheme for prioritizing requirements: they allow a software system's requirements to be separated into four
distinct categories, each reflecting a priority for the requirement. The four categories of the MoSCoW scheme
are given in the list below.
Must Have: The system will not be successful without this feature
Should Have: This feature is highly desirable
Could Have: This feature is desirable if everything else has been achieved
Would Like To Have: This feature does not need to be implemented yet
Using the MoSCoW method allows the requirements to be prioritized and, before any development has started,
allows consideration and estimation of what will be achievable by the system and what can be considered to be
outside the scope of the system.
Must Have:
• Detect a user throwing a ball
• Detect a user catching a ball
• Detect a user dropping a ball
• Detect a peak of a ball
• Track the user's hand positions
• Provide a report on a juggling session
• Ability to detect 1, 2 and 3 ball patterns

Should Have:
• Calculate the Shannon ratio for a juggle
• Suggestions on improving juggling technique
• Ability to detect 4 ball patterns

Could Have:
• Tasks to improve timing
• Tasks to improve throwing

Would Like To Have:
• Ability to define patterns using Siteswap notation
The requirements gathering process was an iterative one. Over the course of meetings with the client, new
requirements were added, others removed, and priorities changed as the project progressed. Details on which of
these requirements were met by the system are provided in the Evaluation chapter.
3.2 Non Functional Requirements
The non-functional requirements describe constraints on, and qualities possessed by, the system. Typically,
non-functional requirements are harder to gather than functional ones; however, the purpose of considering them
is to improve the overall system. The non-functional requirements are listed below.
• The system has been designed and tested to run on a standard Intel Desktop PC running Windows 7 or
later. The system may run on earlier versions of Windows, however it is not guaranteed.
• The system uses a standard XBox 360 Kinect Sensor as input.
• When using the system, juggling balls of different solid colors should be used (Red, Green and Blue) and
they should typically be around 65mm in diameter.
• The system is specifically designed to be run in a well-lit, spacious environment (of around 3 m²).
Chapter 4
Design
This chapter outlines the high-level design strategy and architectural patterns applied to the JugEm system.
4.1 Library Support
This section will discuss the various software libraries that are used in the JugEm system.
4.1.1 Kinect Software Development Kit
The Kinect for Windows Software Development Kit [16] provides the system with all the necessary classes and
methods to inter-operate with a Kinect Sensor. The Kinect SDK is recommended for use with a Kinect for
Windows Sensor; however, the libraries still work with an XBox Kinect Sensor (although the extra features that
are unique to the Windows Kinect Sensor are not available).
The Kinect SDK provides the drivers necessary to use the Kinect Sensor on a Windows-based machine and all
the necessary APIs and device interfaces, along with technical documentation and some source code samples.
From these APIs, the system is able to retrieve four things:
Infrared Depth Data: The distance from the sensor of the surroundings it can see (in millimeters).
RGB Images: An RGB image of what the Kinect camera can see.
Skeleton Information: A list of the potential human bodies the Sensor can see in its surroundings.
Audio Capture: Audio captured from the surroundings by the microphone array in the Kinect Sensor.
The data provided by the Kinect Sensor is the primary source of data; the system then processes this data in
order to work out when a user is juggling in front of the Kinect Sensor. The Skeleton data provided by the APIs
only gives the positions of a user's joints (the joints tracked by the Kinect Sensor are given in Figure 4.1). The
system uses this data to get the position of the user's wrists and then performs various operations on the Color
and Depth data in order to do Object Detection and detect Events (these operations are discussed in detail in the
Implementation chapter).
During development a new version of the Kinect SDK was released (v1.5 to v1.6) [18]; this required minor
changes to the system during development so that it operated with the latest version of the SDK.
Figure 4.1: Kinect Skeleton Joints
4.1.2 EmguCV Image Processing
For the system to detect the juggling balls that the user is throwing and catching, various types of image
processing are carried out on the data from the Kinect Sensor. In order to do this image processing, the system
uses the EmguCV library. EmguCV is a cross-platform wrapper library that gives various .NET languages
(VC++, C#, Visual Basic and others) access to the OpenCV image processing library. OpenCV is a free, open
source image processing library developed by Intel [12].
Using the EmguCV wrapper gives the system access to the classes and methods it needs to perform:
Depth Image Thresholding This method is used to remove all the depths from the Depth Image that are greater
than a certain distance.
Contour Detection This method allows the system to detect the contours present in the Depth Image (after
filtering).
Color Image Conversion and Filtering These methods are used to convert the Kinect Sensor's Color Image
into the Hue, Saturation and Value color space and filter it to perform color detection.
4.1.3 Microsoft Speech Platform 11
In order to recognize the voice commands that tell it to start and stop recording a juggling session, the system
uses the Microsoft Speech Platform [21]. The Microsoft Speech Platform provides an API with redistributable
Grammars and Voice Recognition methods that allow the system to recognize voice commands using the Kinect
Sensor's microphone array.
4.1.4 XNA Framework
The program is developed using the XNA Framework 4.0 from Microsoft. This framework provides the program
with all the necessary run-time libraries and classes to develop games for mobile devices, XBox Arcade and the
Windows platform. The main class provided by this framework is the Game class; the class diagram for this
class is given in Figure 4.2.
Figure 4.2: XNA Game Class
Every program that uses the XNA Framework contains a class that extends the Game class. The Game class
contains various properties and methods; the most important ones are outlined below.
TargetElapsedTime This property is the target time between frames (the inverse of the target frame rate). The
lower this value, the more often the Game class's methods will be called.
Initialize This method is called once at the start of the game; it allows the program to set up any variables it
needs (for example, it can set up the Kinect Sensor).
LoadContent This method is called before the first frame to allow the program to set up any graphics, audio or
other game content that the program may need.
Update This method is called for every frame, and is used to update any variables, state, etc. that the program
needs to update.
Draw This method is also called for every frame, and is used to render all the necessary graphics to the user's
screen.
The class in JugEm that extends this Game class is the JugEmGame.cs class. On top of this class there is
a Game State Management framework, which also contains classes with the methods mentioned above. This
framework is described below.
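To illustrate how these methods fit together, the minimal sketch below shows the general shape of a Game subclass (a simplified stand-in for JugEmGame.cs, not the actual JugEm source):

using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

// Minimal sketch of an XNA 4.0 Game subclass (not the real JugEmGame.cs):
// the framework calls Initialize once, LoadContent before the first frame,
// then Update and Draw once per frame at the TargetElapsedTime rate.
public class SketchGame : Game
{
    private readonly GraphicsDeviceManager graphics;
    private SpriteBatch spriteBatch;

    public SketchGame()
    {
        graphics = new GraphicsDeviceManager(this);
        // Ask for roughly 30 updates per second (TargetElapsedTime is a time
        // span between frames, not a frame rate).
        TargetElapsedTime = System.TimeSpan.FromSeconds(1.0 / 30.0);
    }

    protected override void Initialize()
    {
        // One-off setup, e.g. starting the Kinect Sensor, would go here.
        base.Initialize();
    }

    protected override void LoadContent()
    {
        spriteBatch = new SpriteBatch(GraphicsDevice);
        // Load textures, fonts and audio here.
    }

    protected override void Update(GameTime gameTime)
    {
        // Per-frame logic: read the latest Kinect data, update screens, etc.
        base.Update(gameTime);
    }

    protected override void Draw(GameTime gameTime)
    {
        GraphicsDevice.Clear(Color.CornflowerBlue);
        // Render the current screen here.
        base.Draw(gameTime);
    }
}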
4.2 Game Screen Management
As the program contains various different types of screens (for example menu screens, report screens and game
screens), a framework was used to provide a sound architecture for development. The framework used was
adapted from a piece of sample code developed by Microsoft for this purpose. REFERENCE. Using the approach
discussed in this chapter allowed the program to be neatly separated into various components that are displayed
to the user and updated accordingly. The basic outline of the architecture provided by this framework is given in
Figure 4.3.
Figure 4.3: Game Screen Management Architecture
4.2.1 Screens
The main abstract class provided in this framework is the GameScreen class. Every type of screen used in
the program is a subclass of GameScreen, which provides various methods and properties as outlined in Figure
4.4. The GameScreen class contains methods similar to those discussed previously for the XNA Game class;
these perform the same functions as they would in a subclass of the Game class, allowing each screen in the
game to do its own work in its own Update and Draw methods. The ScreenState enumerated type is used by the
Screen Manager to determine which screen to display and also to provide simple transition effects when a screen
becomes visible or is closed.
The MenuScreen class provides various methods and variables that allow Screens to contain various types
of Menus within them. These screens are used in the program to allow the user to specify options and choose
which type of Juggling training they would like to try. The GameplayScreen class is used for the more important
Screens used in the program, such as the mini games and main game-play screen.
Figure 4.4: GameScreen Class Diagram
4.2.2 Screen Management
In order to manage all the screens within a game, the framework provides a class named ScreenManager.
The main game class contains an instance of a ScreenManager, which each class in the game is able to access.
This allows screens such as the options screen to remove themselves and add a new screen (for example when
the user leaves the options screen to return to the main menu). The ScreenManager stores all of the screens that
are currently open (because some may be hidden) and the screens which are to be updated (as there may be
more than one requiring updating); this enables the ScreenManager class to forward updates to the relevant
screen so that it can carry out its own work. The ScreenManager class diagram is shown in Figure 4.5.
Figure 4.5: ScreenManager Class Diagram
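To make the relationship between the Game class, the ScreenManager and the individual screens concrete, the sketch below shows the general shape of such a manager (a simplified illustration, not the actual JugEm or Microsoft sample code):

using System.Collections.Generic;
using Microsoft.Xna.Framework;

// Simplified sketch of the screen-management idea (not the actual JugEm code):
// the manager owns every open screen and forwards Update/Draw calls to them.
public abstract class GameScreen
{
    public ScreenManager ScreenManager { get; internal set; }
    public abstract void Update(GameTime gameTime);
    public abstract void Draw(GameTime gameTime);
}

public class ScreenManager
{
    private readonly List<GameScreen> screens = new List<GameScreen>();

    public void AddScreen(GameScreen screen)
    {
        screen.ScreenManager = this;   // give the screen access back to the manager
        screens.Add(screen);
    }

    public void RemoveScreen(GameScreen screen)
    {
        screens.Remove(screen);
    }

    public void Update(GameTime gameTime)
    {
        // Forward the update to every open screen; a fuller implementation would
        // also track which screens are hidden or transitioning.
        foreach (GameScreen screen in screens.ToArray())
            screen.Update(gameTime);
    }

    public void Draw(GameTime gameTime)
    {
        foreach (GameScreen screen in screens)
            screen.Draw(gameTime);
    }
}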
4.3 Events
In order to identify when a user is juggling in front of the Kinect Sensor, the program processes the information
available to it and raises Events based on what the user is doing in front of it. The main types of Events that the
program detects are discussed in this section.
4.3.1 Detecting Events
In order to detect events, various calculations are performed with regard to the user's hands and the juggling
balls that have been detected. The area around each of the user's hands is separated into two distinct zones, the
Hand Zone and the Throw Zone. These zones are shown in Figure 4.6.
Figure 4.6: Zones around one of the user’s hands
The Hand Zone is used so that any juggling ball detected within this area can be ignored. If the Hand Zone
were not ignored, the system would, from time to time, detect the user's hands or fingers as juggling balls, which
is undesirable. The system is still able to track approximately how many juggling balls a user has in their hands,
as the system knows how many balls are being used and, in a perfect scenario, knows exactly how many catches
and throws have taken place for each hand.
The Throw Zone is used to detect both ThrowEvents and CatchEvents. A ThrowEvent occurs when
a juggling ball is seen inside the Throw Zone and is then seen outside the Throw Zone in a subsequent Frame.
In contrast a CatchEvent occurs when a juggling ball is seen outside of the Throw Zone, and then inside the
Throw Zone in a subsequent Frame.
Each of the user’s hands is surrounded by these Zones so that throws and catches can be seen for both hands.
4.3.2 Event Hierarchy
Due to the similarities between the different types of Events, the hierarchy shown in Figure 4.7 was used.
The base class (Event) contains the two variables that every single Event must have: a position (on the screen)
and the time the Event was seen (used for Shannon's Uniform Juggling equation). This base class is then
sub-classed into one of two possible classes, a HandEvent or a BallEvent.
A HandEvent is an Event that involves one of the user's hands (for example throwing or catching a ball), and
the HandEvent class provides a variable in which to store which hand was involved in the Event (using the Hand
enumerated type).
Figure 4.7: Events Class Diagram
A BallEvent is an Event that involves only a ball. A PeakEvent covers the situation where a ball changes its
vertical direction from up to down, and a DropEvent covers the case where a ball is seen below the lower of the
user's hands. Neither of these Events needs any information about which hands are involved; they are only
concerned with the position of the juggling ball when the Event is detected.
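The hierarchy described above can be sketched in C# as follows (the member names are illustrative and may differ from the actual JugEm classes):

using Microsoft.Xna.Framework;

// Sketch of the Event hierarchy described above (member names are assumptions).
public enum Hand { Left, Right }

public abstract class Event
{
    public Vector2 Position { get; private set; }  // where on screen the event was seen
    public long TimeMs { get; private set; }       // when it was seen, for Shannon's equation

    protected Event(Vector2 position, long timeMs)
    {
        Position = position;
        TimeMs = timeMs;
    }
}

// Events involving one of the user's hands.
public abstract class HandEvent : Event
{
    public Hand Hand { get; private set; }

    protected HandEvent(Vector2 position, long timeMs, Hand hand)
        : base(position, timeMs) { Hand = hand; }
}

public class ThrowEvent : HandEvent
{
    public ThrowEvent(Vector2 p, long t, Hand h) : base(p, t, h) { }
}

public class CatchEvent : HandEvent
{
    public CatchEvent(Vector2 p, long t, Hand h) : base(p, t, h) { }
}

// Events involving only a ball.
public abstract class BallEvent : Event
{
    protected BallEvent(Vector2 position, long timeMs) : base(position, timeMs) { }
}

public class PeakEvent : BallEvent
{
    public PeakEvent(Vector2 p, long t) : base(p, t) { }
}

public class DropEvent : BallEvent
{
    public DropEvent(Vector2 p, long t) : base(p, t) { }
}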
4.3.3 Event Delegation
Due to the way each Frame is seen and processed by the program, it is necessary for the handling of these
Events to be delegated to a class other than the one in which they are detected, because different mini games
will handle events in different ways. This allows each mini game to have complete control over what work must
be done when a juggling Event is detected. To ensure that the Events that are created are accurately timed, a
.NET Stopwatch object is used, as this type of timer has very high precision (of the order of nanoseconds) [20];
as the program uses milliseconds when timing Events, this was deemed a suitable choice.
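One common way to achieve this kind of delegation in C# is the built-in delegate/event mechanism combined with a Stopwatch for timing; the sketch below illustrates the general idea (the class and member names are hypothetical, not the JugEm API):

using System;
using System.Diagnostics;

// Illustrative sketch of event delegation (not the actual JugEm classes):
// the detector raises .NET events, and whichever mini game is active subscribes
// to them and decides what to do. A Stopwatch provides the event timestamps.
public class EventDetector
{
    private readonly Stopwatch timer = Stopwatch.StartNew();

    public event Action<long> ThrowDetected;   // argument: time in milliseconds
    public event Action<long> CatchDetected;

    // Called from frame processing when a throw is recognised.
    public void OnThrow()
    {
        if (ThrowDetected != null)
            ThrowDetected(timer.ElapsedMilliseconds);
    }

    // Called from frame processing when a catch is recognised.
    public void OnCatch()
    {
        if (CatchDetected != null)
            CatchDetected(timer.ElapsedMilliseconds);
    }
}

public class TimingMiniGame
{
    public TimingMiniGame(EventDetector detector)
    {
        // The mini game, not the detector, decides how each event is handled.
        detector.ThrowDetected += t => Console.WriteLine("Throw at {0} ms", t);
        detector.CatchDetected += t => Console.WriteLine("Catch at {0} ms", t);
    }
}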
Chapter 5
Implementation
This chapter describes the low-level implementation details of the JugEm system, focusing particularly on key
software engineering challenges and the solutions that were devised for them. Figure 5.1 shows the sequence of
data processing tasks implemented in the JugEm system; the rest of this chapter describes each task in detail.
Figure 5.1: Pipeline of Data Processing Tasks
5.1 Kinect Data Input
This section discusses the raw data that is received from the Kinect Sensor.
5.1.1 Depth Data
The raw depth data retrieved from the Kinect Sensor is of the form DepthImagePixel[]. This array is a
2-dimensional array (similar to the one shown in Figure 5.2) flattened into a 1-dimensional array.
Figure 5.2: 2D Array
Each index into this array stores a DepthImagePixel, which holds two pieces of information: the distance of
this pixel from the sensor (depth) and whether or not the Kinect Sensor considers this pixel to be part of a player.
If the Kinect Sensor considers this pixel to be part of a player, the PlayerIndex field will be set to a number
between 1 and 7 (otherwise it will be 0), so the Kinect Sensor can distinguish at most 7 players at any given
time. The structure of each DepthImagePixel is given in Figure 5.3.
Figure 5.3: Depth Pixel Structure
Prior to the Kinect SDK v1.6 upgrade, the depth data was provided in the form of a short[]; the same
information was contained within this array, but bit shifting was required in order to separate the fields. Kinect
SDK v1.6 introduced the new DepthImagePixel class, which has the structure mentioned above and, due to its
class-level members, eliminates the need to bit shift.
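For illustration, unpacking the older short[] format looked roughly like the sketch below, with the player index held in the low three bits and the depth in the remaining bits (bit layout as documented for the v1.x SDK; the helper itself is hypothetical):

// Sketch of unpacking the pre-v1.6 short[] depth format: the low three bits
// hold the player index and the remaining bits hold the depth in millimetres.
public static class LegacyDepth
{
    private const int PlayerIndexBitmaskWidth = 3;
    private const int PlayerIndexBitmask = 0x07;

    public static int GetPlayerIndex(short rawDepth)
    {
        return rawDepth & PlayerIndexBitmask;              // 0 means "no player"
    }

    public static int GetDepthMillimetres(short rawDepth)
    {
        return (rawDepth & 0xFFFF) >> PlayerIndexBitmaskWidth;
    }
}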
5.1.2 Color Data
The Kinect Sensor provides its raw color data in a similar way to the depth data: it provides the data in the form
of a byte[], which is a flattened 2D array (similar to the array discussed previously for the depth data). Each
pixel in the color image from the camera is represented as four bytes in the flattened array (as shown in Figure 5.4)
Figure 5.4: Color Pixel Structure
These values represent the Red, Green, Blue and Alpha (transparency) values for each pixel in the image. As
the Kinect Sensor supports various different image formats, the layout can differ depending on the format
chosen; the JugEm system uses the RGB format (with a resolution of 640x480). This means that the size of the
byte[] will be (width * height * 4), and pixel (x, y) will occupy indices ((y * width + x) * 4) to
((y * width + x) * 4 + 3) inclusive.
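A small sketch of reading one pixel out of this flattened array under the layout just described (assuming the four bytes are stored in Red, Green, Blue, Alpha order; the helper is illustrative only):

// Sketch: reading pixel (x, y) from the flattened color byte[] described above.
// Assumes width*height*4 bytes stored row by row, four bytes per pixel in the
// R, G, B, A order described in the text (the actual byte order depends on the
// image format requested from the sensor).
public static class ColorPixels
{
    // Index of the first (Red) byte of pixel (x, y).
    public static int PixelOffset(int width, int x, int y)
    {
        return (y * width + x) * 4;
    }

    public static byte Red(byte[] colorData, int width, int x, int y)
    {
        return colorData[PixelOffset(width, x, y)];
    }

    public static byte Green(byte[] colorData, int width, int x, int y)
    {
        return colorData[PixelOffset(width, x, y) + 1];
    }

    public static byte Blue(byte[] colorData, int width, int x, int y)
    {
        return colorData[PixelOffset(width, x, y) + 2];
    }
}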
In order to process this information in an appropriate manner, the system carries out four main tasks, which
are discussed in the remainder of this chapter.
5.2 Depth Image Processing
On each call to the current game screen's Update method, the latest depth data is copied into an array of
DepthImagePixels. Each element in this array stores the distance of that pixel from the sensor. The Kinect
Sensor hardware contains algorithms that run Skeletal Recognition on the data it generates; as a result, each
element in the DepthImagePixel[] also records whether or not the hardware considers that pixel to be part of a
player. During initial prototypes built to work with the Kinect Sensor, it was found that the Sensor regards a ball
that has been thrown from a player's hand as part of the player (even after it has left the hand). Given this, the
depth data retrieved from the sensor at each Update call is filtered to remove any pixel that the sensor does not
consider to be part of a player. The results of carrying out this kind of filter are given in Figure 5.5.
Figure 5.5: Filtering player from depth data
Figure 5.5 shows (from left to right), the color image from the Sensor, the raw depth data from the Sensor,
the raw depth data converted to a gray-scale image, and finally the result of removing all non player pixels from
this image.
The method responsible for this filtering is given in the following code listing. The Helper.GetDepths method
uses a parallel for loop to iterate over all the DepthImagePixels in the DepthImagePixel[] retrieved from the
sensor. For each pixel it calculates the index into the 1-dimensional array from the 2-dimensional coordinates of
the pixel in the image.
The DepthImagePixel at this index is then checked to see whether or not it contains a player. If it does, the
value of the depth is passed to the CalculateIntensityFromDepth method, which is used to smooth the color
applied to the player pixels in the depth data: the smaller the depth from the Sensor, the whiter the pixel will
appear (as can be seen in the final image in Figure 5.5).
The byte[] that this method returns is used to create the grayscale image shown at the end of Figure 5.5.
public static byte[] GetDepths(byte[] depths, DepthImagePixel[] depthPixels, int width, int height)
{
    // width should be 640 so best to parallelise over the width
    Parallel.For(0, width, i =>
    {
        for (int j = 0; j < height; j++)
        {
            int rawDepthDataIndex = i * height + j;

            if (depthPixels[rawDepthDataIndex].PlayerIndex > 0)
                depths[rawDepthDataIndex] = CalculateIntensityFromDepth(depthPixels[rawDepthDataIndex].Depth);
            else
                depths[rawDepthDataIndex] = 0;
        }
    });

    return depths;
}
src/GetDepths.cs
The resulting image from Figure 5.5 then has a threshold applied to it to remove any player pixels that are
further away than a certain threshold distance. This thresholding is done in the EmguCV library; the formula it
uses is as follows:
\[
dst(x, y) =
\begin{cases}
0 & \text{if } src(x, y) > thresholdValue \\
maxVal & \text{otherwise}
\end{cases}
\]
This threshold value is calculated on each Update call, and is set to the depth of the player's wrists (obtained
from the Skeleton data that can be retrieved from the Kinect Sensor). The resulting image from this thresholding
is shown in Figure 5.6.
Figure 5.6: Applying threshold to filtered depth data
This image (represented as a byte[]) is filtered in the DetectBalls method shown in Section 5.3.1. The method
is passed the current threshold value (from the player's wrist depth) and makes a call to an EmguCV method
called cvThreshold. The cvThreshold call runs a Binary Inverse Threshold on the image, meaning that any pixel
value greater than the threshold is set to 0 and any pixel value less than the threshold is set to the maxValue
parameter passed to the method (255).
5.3 Ball Processing
5.3.1 Ball Detection
After the depth data from the Kinect Sensor has been filtered to produce the image in Figure 5.6, the DetectBalls
method is able to detect the juggling balls in the image. This is done using the EmguCV library, which runs
Canny Edge Detection on the image; the formula behind Canny Edge Detection is discussed in John Canny's
paper on the subject [5]. The Canny edge detection call can be seen in the following code listing.
The resulting edges are then processed to try to identify juggling balls and ignore anything that is not wanted.
If any of the edges are found to be inside the Hand Zone (as discussed in the Design chapter), they are ignored;
this is to prevent the system from incorrectly detecting the user's hands as juggling balls. This check is
performed by the Helper.PointNearHand calls in the listing.
The DetectBalls method is coded to find juggling balls of a specific radius. It does this using a specific radius
value to look for, and a threshold that the radius may vary within (named expectedRadius and radThres
respectively). For example, a radius of 3 and a threshold of 2 would find circles of radius 1 through 5. Any
circles found to be within these values are deemed to be juggling balls, and only contours with these radii are
processed further.
In order to correctly detect juggling balls in flight, it is not enough to just find circular edges. When a juggling
ball is in flight it can appear to the program as more of a capped ellipse shape, so the edges are processed to find
both circles and capped ellipses, and circles detected very close to the user's hands are ignored.
Any edges that pass this processing are added to the List<JugglingBall> which is returned by the method.
All the properties that can be calculated for the juggling ball that has been found are set, and the JugglingBall is
added to the list.
public List<JugglingBall> DetectBalls(byte[] depthMM, int thresholdDepth, int width,
    int height, Vector2 LHPos, Vector2 RHPos, Stopwatch gameTimer)
{
    .......
    CvInvoke.cvThreshold(dsmall.Ptr, depthMask.Ptr, depthByte, 255,
        Emgu.CV.CvEnum.THRESH.CV_THRESH_BINARY_INV);
    .......
    Image<Gray, Byte> edges = depthMaskBlock.Canny(180, 120);
    .......
    // for each contour found
    for (Contour<System.Drawing.Point> contours = edges.FindContours(); contours != null;
        contours = contours.HNext)
    {
        // calculate the bounding box x and y center
        int bbxcent = contours.BoundingRectangle.X + contours.BoundingRectangle.Width / 2;
        int bbycent = contours.BoundingRectangle.Y + contours.BoundingRectangle.Height / 2;
        byte bbcentVal = depthMaskBlock.Data[bbycent, bbxcent, 0];

        // make sure that the ball is not too close to the hand positions given
        if (Helper.PointNearHand(RHPos, bbxcent * 2, bbycent * 2, thresholdDepth))
        {
            continue;
        }
        else if (Helper.PointNearHand(LHPos, bbxcent * 2, bbycent * 2, thresholdDepth))
        {
            continue;
        }
        .......
        // if the radius found is between expected radius and thresholds
        if ((actualRadius < expectedRadius + radThres) && (actualRadius > expectedRadius - radThres))
        {
            // if area is greater than or equal to a capped ellipse
            if (contours.Area >= box.size.Width * box.size.Height * Math.PI / 4 * 0.9)
            {
                // set the left and right hand distance to save recalculating it later
                // and so it is set for every ball relative to its current frame
                float ballLHDistance = Vector2.Distance(new Vector2(bbxcent * 2, bbycent * 2), LHPos);
                float ballRHDistance = Vector2.Distance(new Vector2(bbxcent * 2, bbycent * 2), RHPos);

                JugglingBall ball = new JugglingBall(new Vector2(bbxcent * 2, bbycent * 2),
                    actualRadius, ballLHDistance, ballRHDistance, gameTimer.ElapsedMilliseconds);

                balls.Add(ball);
            }
        }
    }
    storage.Dispose();
    return balls;
}
src/DetectBalls.cs
5.3.2 Color Detection
Once the balls have been located in the depth data and returned in the form of a list, the list is processed so
that each JugglingBall's color can be determined. In order to do this, the program makes use of the color data
produced by the Kinect Sensor. As the juggling balls are detected in the depth data, their locations must be
translated to the color data (as the two may have different resolutions). A method for doing this is provided in
the Kinect SDK [19]; however, it sometimes returns a sentinel value if the position cannot be translated.
Provided the position can be translated, the color data from the Kinect Sensor can be processed to find out the
color of the ball.
To detect the color of a ball, the program takes a small area of pixels around the JugglingBall's center point
(which was set by the DetectBalls method) and works out the average color of these pixels. In initial
implementations of the program, color detection was done in the Red, Green and Blue color space; this involved
directly using the information received from the Kinect Sensor's color data. However, this color space proved to
be inaccurate and an alternative was sought. In order to test methods for doing this color detection, a small
rapid prototype was built to try out various image processing approaches that could result in more accurate
color detection.
This rapid prototype found that color was detected far more accurately using the Hue, Saturation and Value
color space. This method has the caveat of a slight increase in the amount of processing done by the program;
however, the gains in accuracy far outweighed the minor processing cost (especially considering the EmguCV
library was available to the program to do the processing). Using the HSV color space means that the program
requires a minimum and maximum value for each of the color channels (Red, Green and Blue), as it filters the
image for all three colors and takes the average pixel value of each filtered image (the higher the average, the
more of that color is present in the ball).
A screenshot of the images retrieved from the prototype is given in Figure 5.7, showing the original image (on
the left) filtered three times (once for each of Red, Green and Blue). Unfortunately this color detection method
is very much light dependent: the values used for the filter are only really applicable in the surroundings in
which the original image was taken. It can also be noted from the images that, in some surroundings, it is harder
to detect certain colors (in the images, Red detection is not great); this is sometimes counterbalanced by the fact
that the other colors show up better, which can eliminate the issue (though not always).
Figure 5.7: HSV Filter Prototype
The color detection algorithm uses EmguCV methods and works in the Hue, Saturation, Value color space.
The method creates a color image from the color data taken from the Kinect Sensor (the pixels byte[]). This
image is then restricted to the area around the ball that we are interested in (which helps to reduce the amount
of work the system has to do later in the method), and a Hue, Saturation, Value image is created from the result.
The method creates three blank masks so that the results of each filter can be processed separately.
The HSV image is then filtered three times: once showing only Red pixels, once showing only Blue pixels and
once showing only Green pixels. The average pixel value of each of these three masks is taken, and the channel
with the highest average is chosen as the color of the ball.
public String DetectBallColorHSV(byte[] pixels, int width, int height, int ballX, int ballY)
{
    // create a color image representation of the pixel data
    bpImg.Bytes = pixels;
    // set our Rectangle of Interest as we dont care about most of the image
    bpImg.ROI = new System.Drawing.Rectangle(ballX - 8, ballY - 8, 15, 15);
    // convert the image to the Hue, Saturation, Value color space
    hsvImg = bpImg.Convert<Hsv, Byte>();
    // create the red, green and blue masks for each filter
    redMask = hsvImg.Convert<Gray, Byte>().CopyBlank();
    greenMask = hsvImg.Convert<Gray, Byte>().CopyBlank();
    blueMask = hsvImg.Convert<Gray, Byte>().CopyBlank();
    // create the range values for each filter
    MCvScalar redMin = new MCvScalar(118, 239, 0);
    MCvScalar redMax = new MCvScalar(229, 271, 198);
    MCvScalar greenMin = new MCvScalar(0, 0, 25);
    MCvScalar greenMax = new MCvScalar(137, 121, 108);
    MCvScalar blueMin = new MCvScalar(84, 120, 0);
    MCvScalar blueMax = new MCvScalar(156, 271, 189);
    // apply each of the filters
    CvInvoke.cvInRangeS(hsvImg, redMin, redMax, redMask);
    CvInvoke.cvInRangeS(hsvImg, greenMin, greenMax, greenMask);
    CvInvoke.cvInRangeS(hsvImg, blueMin, blueMax, blueMask);
    // calculate the average pixel colour of each channel
    // the higher the average the more likely that we are that colour
    int blueAvg = 0;
    foreach (byte b in blueMask.Bytes)
    {
        blueAvg += (int)b;
    }
    blueAvg = blueAvg / blueMask.Bytes.Length;

    int redAvg = 0;
    foreach (byte b in redMask.Bytes)
    {
        redAvg += (int)b;
    }
    redAvg = redAvg / redMask.Bytes.Length;

    int greenAvg = 0;
    foreach (byte b in greenMask.Bytes)
    {
        greenAvg += (int)b;
    }
    greenAvg = greenAvg / greenMask.Bytes.Length;

    return Helper.ColorChooser(redAvg, greenAvg, blueAvg);
}
src/DetectBallColor.cs
The color of the ball is set for each JugglingBall that is found. After this step, all the balls in the list generated
by the DetectBalls method have a position and a color associated with them (as well as their distances to both
of the user's hands).
5.4 Event Detection
5.4.1 Frames and Frame Processing
Once the Depth Image has been processed and the juggling balls have been detected and assigned a color, they
are added into a Frame to be processed for Event detection. The Frame class contains all the information
necessary to identify a user juggling.
Each Frame is placed into a FrameList class and, as it is added to the FrameList, is compared with the previous
Frame to detect the Events associated with the program.
class Frame
{
    // list of juggling balls that are in this frame
    public List<JugglingBall> balls;

    // position of the left hand
    public Vector2 LHPosition;

    // position of the right hand
    public Vector2 RHPosition;

    // time this frame was created
    public double time;

    ....
}
src/Frame.c
Comparing each newly created Frame with the previous Frame allows the program to calculate various extra
properties about the flights of the juggling balls that have been detected.
Given that each ball is assigned a color, and that the user is juggling with balls that are all different colors, it is
possible to match a ball in one frame with the corresponding ball in the previous frame. Given all the information
contained in two successive Frames, it is also possible to determine the direction a ball is traveling in (by
comparing its previous position with its latest position).
The first and second Frames that are added to a FrameList are special cases. In the first Frame, we cannot
calculate the ball directions as there is no previous Frame present, so the program just inserts the Frame to the
FrameList without any extra work. However, once the second Frame is added we can calculate the ball directions
given this new Frame. The program will calculate all the ball directions for the second Frame and also set the
same ball directions in the first Frame.
Once the directions have been assigned to each ball, we have access to a collection of juggling balls with
both color and the direction, and also the position of both of the users hands.
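To make the comparison concrete, the direction calculation could look roughly like the following. This is a minimal sketch, assuming a BallDirection value and Color, Position and Direction members on JugglingBall; these names are illustrative rather than the project's exact ones, and FirstOrDefault requires System.Linq.

// Sketch: derive a vertical direction for each ball by matching balls across
// two successive Frames using their assigned color. BallDirection and the
// Color, Position and Direction members are assumed names for illustration.
enum BallDirection { None, Up, Down }

static void AssignDirections(Frame previous, Frame current)
{
    foreach (JugglingBall ball in current.balls)
    {
        // match the ball to its counterpart in the previous Frame by color
        JugglingBall match = previous.balls.FirstOrDefault(b => b.Color == ball.Color);
        if (match == null)
        {
            ball.Direction = BallDirection.None; // ball not seen in the previous Frame
            continue;
        }

        // in image coordinates a smaller Y value means the ball has moved upwards
        ball.Direction = ball.Position.Y < match.Position.Y
            ? BallDirection.Up
            : BallDirection.Down;
    }
}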
Peak Event Detection
A Peak Event occurs when a ball's vertical direction changes (found by comparing the y position in two successive Frames). When this change is detected, a new PeakEvent is fired (and handled in the main program). A PeakEvent contains the location of the ball as it changed direction and the time that the event was fired.
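A minimal sketch of this check, reusing the illustrative BallDirection value from the previous sketch; onPeak stands in for however the PeakEvent is actually raised in the program:

// Sketch: raise a PeakEvent when a ball that was rising in the previous Frame
// is falling in the current one. onPeak is an assumed callback.
static void DetectPeaks(Frame previous, Frame current,
                        Action<JugglingBall, double> onPeak)
{
    foreach (JugglingBall ball in current.balls)
    {
        JugglingBall match = previous.balls.FirstOrDefault(b => b.Color == ball.Color);
        if (match == null) continue;

        if (match.Direction == BallDirection.Up && ball.Direction == BallDirection.Down)
        {
            // report the ball at its peak and the time the event was raised
            onPeak(ball, current.time);
        }
    }
}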
Throw Event Detection
A ThrowEvent occurs when a ball is traveling vertically up and away from the hand involved in the throw. In order to detect this event, each hand has an area around it (called the Throw Zone). When checking consecutive Frames, if a ball is found inside the Throw Zone of a hand in the previous Frame and outside the same Throw Zone in the next Frame, then a ThrowEvent is fired.
A ThrowEvent contains the position of the ball that was thrown (just as it left the Throw Zone), the hand involved in the throw and the time the ThrowEvent was fired.
This method of detection has its flaws: the ball has to be seen twice, once inside the Throw Zone and once outside, and the Event is not fired until the ball has left the Throw Zone, introducing a slight delay in timing. However, it was designed this way to improve the program's ability to handle dropped Frames.
Catch Event Detection
A CatchEvent is detected in a similar way to a ThrowEvent. However, for a catch, the ball must be detected
outside the Throw Zone of a hand and then inside the Throw Zone in successive Frames. Similar to a ThrowEvent, a CatchEvent contains the position of the ball that was caught (just as it entered the Throw Zone), the
hand involved in the catch and the time the CatchEvent was fired.
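The following sketch illustrates the two transitions under stated assumptions: the Throw Zone radius, the InThrowZone helper and the two callbacks are invented for illustration and are not the project's exact API; XNA's Vector2.Distance performs the distance test.

// Sketch: raise Throw and Catch Events from Throw Zone transitions between two
// successive Frames. ThrowZoneRadius, InThrowZone and the callbacks are assumptions.
const float ThrowZoneRadius = 0.15f; // example value only

static bool InThrowZone(Vector2 ballPos, Vector2 handPos)
{
    return Vector2.Distance(ballPos, handPos) < ThrowZoneRadius;
}

static void DetectThrowsAndCatches(Frame previous, Frame current,
                                   Action<JugglingBall, string, double> onThrow,
                                   Action<JugglingBall, string, double> onCatch)
{
    foreach (JugglingBall ball in current.balls)
    {
        JugglingBall match = previous.balls.FirstOrDefault(b => b.Color == ball.Color);
        if (match == null) continue;

        CheckHand(match, ball, previous.LHPosition, current.LHPosition, "Left", current.time, onThrow, onCatch);
        CheckHand(match, ball, previous.RHPosition, current.RHPosition, "Right", current.time, onThrow, onCatch);
    }
}

static void CheckHand(JugglingBall before, JugglingBall after,
                      Vector2 handBefore, Vector2 handAfter, string hand, double time,
                      Action<JugglingBall, string, double> onThrow,
                      Action<JugglingBall, string, double> onCatch)
{
    bool wasInside = InThrowZone(before.Position, handBefore);
    bool isInside = InThrowZone(after.Position, handAfter);

    if (wasInside && !isInside) onThrow(after, hand, time); // ball left the zone: throw
    if (!wasInside && isInside) onCatch(after, hand, time); // ball entered the zone: catch
}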
Processing the FrameList in this manner leads to a stream of these Events. Given the amount of work going on in the background before Events can be detected, it is possible for Frames to be missed. When a Frame is missed, the required information is not retrieved from the Kinect Sensor (as the program is still processing the previous Frame). It is entirely possible that missing certain Frames could lead to missed Events.
In order to minimize these missed Events, the Throw and Catch Event detection only needs to see the ball anywhere inside the Throw Zone and anywhere outside it. This means that even if some Frames are missed in between, the program will still detect the Events, as long as it has one Frame where the ball is inside the Throw Zone and one where it is outside.
Particularly on slower machines, the program can still miss Events. For example, if a ball is seen outside the Throw Zone of a hand in one Frame, and by the time the next Frame is seen the ball has already been caught, then the program will have missed the CatchEvent it was supposed to detect. Little can be done about Events missed in this manner; however, once all of the Events for a juggle have been created, the missing ones can be guessed (discussed later).
5.5 Processing Events and Pattern Matching
As the program continually updates, each Frame is created and Events are fired whenever the program detects a Throw, Catch or Peak occurring. When a user has finished juggling, the program must use these Events (and the times at which they were created) to calculate how well the user has juggled (for example, using Shannon's uniform juggling equation, restated below).
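For reference, Shannon's relation for uniform juggling ties the measured times together; the statement below is the standard form of the theorem, with symbols chosen here rather than taken from the program:
\[
(F + D)\,H = (V + D)\,N
\]
where $F$ is the time a ball spends in flight, $D$ is the dwell time a ball spends in a hand, $V$ is the time a hand spends empty, $N$ is the number of balls and $H$ is the number of hands. The timestamps carried by the Throw, Catch and Peak Events can be used to estimate these quantities.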
As discussed previously, it is possible that the program has missed key Events during its execution. In order to deal with this, the EventAnalyser will "guess" which Events have been missed.
It is able to do this because it knows the number of balls and the number of hands that were used while the user was juggling. When a user is juggling, there is always a pattern to the Events fired which is unique to the number of balls and the method of juggling (i.e. cascade or shower). An example of the two-ball juggling Event pattern is given in Figure 5.8, with throws highlighted in red, catches in green and peak events in blue.
Figure 5.8: 2 Ball Event Pattern
It should be noted that in the above pattern the left hand throws first; if the right hand were to throw first, the pattern would have the opposite hands throughout.
Given that there may be Events the program has not seen, it must create new, guessed Events and insert them into the list of Events so that the list follows the pattern for the juggle being performed. For example, if the program detects a right-hand throw followed immediately by a left-hand throw, it can assume that it has missed a peak and a left-hand catch Event. A sketch of this repair step is given below.
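A rough sketch of how missing Events might be filled in against an expected repeating pattern follows. EventType, DetectedEvent and the pattern argument are illustrative assumptions rather than the project's actual EventAnalyser types, and a real implementation would interpolate the guessed times rather than copy them.

// Sketch: walk the detected Events against the expected repeating pattern for
// the juggle, inserting a guessed Event wherever the pattern is broken.
// EventType, DetectedEvent and the pattern argument are illustrative assumptions.
enum EventType { ThrowLeft, ThrowRight, CatchLeft, CatchRight, Peak }

class DetectedEvent
{
    public EventType Type;
    public double Time;
    public bool Guessed;
}

static List<DetectedEvent> RepairEvents(List<DetectedEvent> detected, EventType[] pattern)
{
    var repaired = new List<DetectedEvent>();
    int expected = 0; // index into the repeating pattern

    foreach (DetectedEvent ev in detected)
    {
        // insert guessed Events until the pattern catches up with what was actually seen
        int guard = 0;
        while (pattern[expected] != ev.Type && guard++ < pattern.Length)
        {
            repaired.Add(new DetectedEvent { Type = pattern[expected], Time = ev.Time, Guessed = true });
            expected = (expected + 1) % pattern.Length;
        }

        repaired.Add(ev);
        expected = (expected + 1) % pattern.Length;
    }
    return repaired;
}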
Chapter 6
Evaluation
Chapter 7
Conclusion
Appendices
Appendix A
Running the Programs
An example of running from the command line is as follows:
> java MaxClique BBMC1 brock200_1.clq 14400
This will apply BBMC with style = 1 to the first brock200 DIMACS instance, allowing 14400 seconds of CPU time.
Appendix B
Generating Random Graphs
We generate Erdős–Rényi random graphs G(n, p), where n is the number of vertices and each edge is included in the graph with probability p, independently of every other edge. The generator produces a random graph in DIMACS format with vertices numbered 1 to n inclusive. It can be run from the command line as follows to produce a .clq file:
> java RandomGraph 100 0.9 > 100-90-00.clq